Rush has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/403072 )
Change subject: toolforge: ferm hook to restart components post updates ...................................................................... toolforge: ferm hook to restart components post updates http://ferm.foo-projects.org/download/2.1/ferm.html#hooks * Ferm is not playing nice with other iptables tenants * Tested an /etc/ferm/conf.d/00_hooks to see it run external scripts in what seems like a totally post updates state. This hopefully allows kube-proxy, flannel, and docker to deal with Ferm stomping all around. This is a midterm fix where other options are being explored in the context of the task. Right now any update to Ferm, even a definition MAC, results in an outage for k8s in Toolforge. Note - the task includes restarting kubelet in the service handling post ferm update to mitigate. This seems not to be necessary so is excluded to reduce to necessary scope. Bug: T182722 Change-Id: I5c700a2c8bce6050e8cb761450d3716a6b3f33c9 --- M modules/role/manifests/toollabs/k8s/master.pp M modules/role/manifests/toollabs/proxy.pp A modules/toollabs/files/ferm_restart_handler.sh A modules/toollabs/manifests/ferm_restart_handler.pp M modules/toollabs/manifests/proxy.pp 5 files changed, 45 insertions(+), 2 deletions(-) Approvals: Arturo Borrero Gonzalez: Looks good to me, but someone else must approve Rush: Verified; Looks good to me, approved diff --git a/modules/role/manifests/toollabs/k8s/master.pp b/modules/role/manifests/toollabs/k8s/master.pp index 81647b4..1c0d78c 100644 --- a/modules/role/manifests/toollabs/k8s/master.pp +++ b/modules/role/manifests/toollabs/k8s/master.pp @@ -2,8 +2,9 @@ class role::toollabs::k8s::master( $use_puppet_certs = false, ) { - include ::base::firewall include ::toollabs::infrastructure + include ::base::firewall + include ::toollabs::ferm_restart_handler $master_host = hiera('k8s::master_host', $::fqdn) $etcd_url = prefix(suffix(hiera('k8s::etcd_hosts'), ':2379'), 'https://') diff --git a/modules/role/manifests/toollabs/proxy.pp b/modules/role/manifests/toollabs/proxy.pp index be70d49..c82cfef 100644 --- a/modules/role/manifests/toollabs/proxy.pp +++ b/modules/role/manifests/toollabs/proxy.pp @@ -2,6 +2,8 @@ class role::toollabs::proxy { include ::toollabs::proxy include ::role::toollabs::k8s::webproxy + include ::base::firewall + include ::toollabs::ferm_restart_handler ferm::service { 'proxymanager': proto => 'tcp', diff --git a/modules/toollabs/files/ferm_restart_handler.sh b/modules/toollabs/files/ferm_restart_handler.sh new file mode 100644 index 0000000..5581387 --- /dev/null +++ b/modules/toollabs/files/ferm_restart_handler.sh @@ -0,0 +1,21 @@ +#/bin/bash + +if [[ $EUID -ne 0 ]]; then + echo "This script must be run as root" 1>&2 + exit 1 +fi + +/usr/bin/logger -i -t ${0} "restart firewall components post ferm management" + +# Ferm expects to handle all firewall state +# and that does not mesh well with dynamic chain management. +# We tell the k8s stack here to restart +# +# This should be no more invasive than a rescheduling +# of a POD to another worker. +# +# If we are living in an nftables world when you read +# this, then this should be totally rethought. +service docker restart +service flannel restart +service kube-proxy restart diff --git a/modules/toollabs/manifests/ferm_restart_handler.pp b/modules/toollabs/manifests/ferm_restart_handler.pp new file mode 100644 index 0000000..58a4437 --- /dev/null +++ b/modules/toollabs/manifests/ferm_restart_handler.pp @@ -0,0 +1,20 @@ +# tldr; hook post ferm updates to let other interested +# parties resync their iptables state. +# See: T182722 +class toollabs::ferm_restart_handler{ + + file {'/usr/local/sbin/ferm_restart_handler': + source => 'puppet:///modules/toollabs/ferm_restart_handler.sh', + owner => 'root', + group => 'root', + mode => '0555', + } + + # http://ferm.foo-projects.org/download/2.1/ferm.html#hooks + # https://phabricator.wikimedia.org/T182722 + ferm::conf{'ferm_restart_handler': + prio => 00, + content => '@hook post "/usr/local/sbin/ferm_restart_handler";', + subscribe => File['/usr/local/sbin/ferm_restart_handler'], + } +} diff --git a/modules/toollabs/manifests/proxy.pp b/modules/toollabs/manifests/proxy.pp index 63953dd..9befba2 100644 --- a/modules/toollabs/manifests/proxy.pp +++ b/modules/toollabs/manifests/proxy.pp @@ -9,7 +9,6 @@ include ::toollabs::infrastructure include ::redis::client::python - include ::base::firewall if $ssl_install_certificate { sslcert::certificate { $ssl_certificate_name: -- To view, visit https://gerrit.wikimedia.org/r/403072 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5c700a2c8bce6050e8cb761450d3716a6b3f33c9 Gerrit-PatchSet: 9 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Rush <r...@wikimedia.org> Gerrit-Reviewer: Andrew Bogott <abog...@wikimedia.org> Gerrit-Reviewer: Arturo Borrero Gonzalez <aborr...@wikimedia.org> Gerrit-Reviewer: BryanDavis <bda...@wikimedia.org> Gerrit-Reviewer: Giuseppe Lavagetto <glavage...@wikimedia.org> Gerrit-Reviewer: Merlijn van Deen <valhall...@arctus.nl> Gerrit-Reviewer: Rush <r...@wikimedia.org> Gerrit-Reviewer: jenkins-bot <> _______________________________________________ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits