BBlack has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/362438 )
Change subject: numa_networking: new mode "isolate"
......................................................................
numa_networking: new mode "isolate"
$numa_networking now has three modes: off, on, and isolate.
"off" - The default: no NUMA awareness.
"on" - The behavior we had before this patch: tries to set up NUMA
affinity for network-related processes and interrupts, but doesn't
touch the rest of the system.
"isolate" - Additionally tries, to the degree possible, to exclude
everything else from the NUMA networking node. Kernel command-line
params isolate all tasks away from the node (except those
explicitly placed there via tools like cset, taskset, or numactl),
disk writeback traffic is moved away from the node, VM settings
avoid cross-node memory allocations, etc. This is a much stricter
split of resources.
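For illustration only, here is a rough Python sketch of how the kernel command-line value for "isolate" is derived in this patch: the per-device hyperthread sets are flattened, sorted, and joined with commas (the Puppet expression is join(sort(flatten(...)), ',')). The shape of the fact data (a list of CPU-id lists) and the numeric sort order are assumptions made for this sketch, and the function names are hypothetical.

```python
def flatten(items):
    """Recursively flatten nested lists/tuples of CPU ids into one list."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(int(item))
    return flat


def isolcpus_value(htsets):
    """Build a comma-separated CPU list from nested hyperthread sets.

    Assumption: htsets looks like [[0, 24], [2, 26], ...], one inner list
    per physical core on the adapter's NUMA node. CPU ids are sorted
    numerically here; the real Puppet sort() may order them differently.
    """
    return ",".join(str(cpu) for cpu in sorted(flatten(htsets)))


if __name__ == "__main__":
    # Hypothetical fact data for the primary interface's NUMA node.
    example_htsets = [[0, 24], [2, 26], [4, 28]]
    print(isolcpus_value(example_htsets))  # prints 0,2,4,24,26,28
```

The resulting string is what would be handed to the isolcpus boot parameter, so the general scheduler keeps ordinary tasks off those CPUs unless they are placed there explicitly.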
Also, the $numa_networking global now overrides hiera
configuration back to "off" if facter detects no true NUMA
hardware (<2 nodes). This simplifies cases like virtuals while
allowing the hieradata setting to be set broadly for machine
profiles/roles.
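The override described above can be sketched roughly as follows. This is a Python illustration of the logic, not the Puppet code from the patch; the function name is hypothetical, and the validation of unknown values is an extra added for the sketch (the patch itself does not validate the hiera value).

```python
VALID_MODES = ("off", "on", "isolate")


def resolve_numa_networking(numa_nodes, hiera_value=None):
    """Return the effective numa_networking mode for a host.

    numa_nodes:  the NUMA nodes reported by facter (any sized collection).
    hiera_value: the hiera-configured mode, or None if hiera has no setting.
    """
    # Fewer than two nodes means no true NUMA hardware: force "off",
    # regardless of what hiera says (e.g. on virtual machines).
    if len(numa_nodes) < 2:
        return "off"
    # Otherwise use the hiera setting, defaulting to "off".
    mode = "off" if hiera_value is None else hiera_value
    # Assumption for this sketch only: reject values outside the three modes.
    if mode not in VALID_MODES:
        raise ValueError("unknown numa_networking mode: %r" % (mode,))
    return mode


if __name__ == "__main__":
    print(resolve_numa_networking([0], "isolate"))     # prints off
    print(resolve_numa_networking([0, 1], "isolate"))  # prints isolate
    print(resolve_numa_networking([0, 1]))             # prints off
```

Consumers then only need two checks: mode != "off" for the NUMA-affinity behavior, and mode == "isolate" for the stricter isolation settings.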
Change-Id: I11027be1b9bcb66bf82dba0cf69c9c034a1d114e
---
M hieradata/hosts/cp4021.yaml
M manifests/realm.pp
M modules/interface/manifests/rps/modparams.pp
M modules/interface/templates/interface-rps-config.erb
M modules/profile/manifests/base.pp
M modules/tlsproxy/manifests/instance.pp
6 files changed, 35 insertions(+), 5 deletions(-)
Approvals:
BBlack: Looks good to me, approved
jenkins-bot: Verified
diff --git a/hieradata/hosts/cp4021.yaml b/hieradata/hosts/cp4021.yaml
index 84968de..4c422e3 100644
--- a/hieradata/hosts/cp4021.yaml
+++ b/hieradata/hosts/cp4021.yaml
@@ -1,2 +1,2 @@
bbr_congestion_control: true
-numa_networking: true
+numa_networking: isolate
diff --git a/manifests/realm.pp b/manifests/realm.pp
index d15e255..21889d1 100644
--- a/manifests/realm.pp
+++ b/manifests/realm.pp
@@ -68,7 +68,20 @@
}
# Hiera->Global to configure various classes for NUMA-aware networking
-$numa_networking = hiera('numa_networking', false)
+# 3 possible values:
+# --
+# off: default, no NUMA awareness
+# on: try confine network stuff to the NUMA node of the adapter
+# isolate: also exclude all other tasks from the NUMA node of the adapter
+# --
+# If facter detects no true NUMA (single-node), the hiera-configured setting
+# will be forced to "off" here in the global
+if size($facts['numa']['nodes']) > 1 {
+ $numa_networking = hiera('numa_networking', 'off')
+}
+else {
+ $numa_networking = 'off'
+}
# TODO: create hash of all LVS service IPs
diff --git a/modules/interface/manifests/rps/modparams.pp b/modules/interface/manifests/rps/modparams.pp
index a5fc7d0..b020730 100644
--- a/modules/interface/manifests/rps/modparams.pp
+++ b/modules/interface/manifests/rps/modparams.pp
@@ -1,7 +1,7 @@
class interface::rps::modparams {
include initramfs
- if $::numa_networking {
+ if $::numa_networking != 'off' {
# note this assumes if bnx2x queue counts matter at all, that the
# primary interface is bnx2x. This is true for current cases, but may
# need to evolve later for hosts with multiple interfaces with distinct
diff --git a/modules/interface/templates/interface-rps-config.erb b/modules/interface/templates/interface-rps-config.erb
index a1c9e80..3c6892e 100644
--- a/modules/interface/templates/interface-rps-config.erb
+++ b/modules/interface/templates/interface-rps-config.erb
@@ -1,4 +1,4 @@
[Options]
<% if @rss_pattern != '' %>rss_pattern = <%= @rss_pattern %><% end %>
<% if @qdisc != '' %>qdisc = <%= @qdisc %><% end %>
-<%- if @numa_networking %>numa_filter = yes<% end -%>
+<%- if @numa_networking != 'off' %>numa_filter = yes<% end -%>
diff --git a/modules/profile/manifests/base.pp b/modules/profile/manifests/base.pp
index 47c6489..b973b5b 100644
--- a/modules/profile/manifests/base.pp
+++ b/modules/profile/manifests/base.pp
@@ -108,4 +108,21 @@
source => 'puppet:///modules/base/logrotate/upstart',
}
}
+
+ if $::numa_networking == 'isolate' {
+ grub::bootparam { 'isolcpus':
+ value => join(sort(flatten($facts['numa']['device_to_htset'][$facts['interface_primary']])), ',')
+ }
+
+ sysctl::parameters { 'numa_isolation':
+ values => { 'vm.zone_reclaim_mode' => 7 },
+ }
+
+ sysfs::parameters { 'cache_numa_isolate':
+ values => {
+ 'bus/workqueue/devices/writeback/numa' => 0,
+ 'bus/workqueue/devices/writeback/cpumask' => $facts['numa']['device_to_cpumask_invert'][$facts['interface_primary']],
+ }
+ }
+ }
}
diff --git a/modules/tlsproxy/manifests/instance.pp b/modules/tlsproxy/manifests/instance.pp
index 959255d..57a7e11 100644
--- a/modules/tlsproxy/manifests/instance.pp
+++ b/modules/tlsproxy/manifests/instance.pp
@@ -27,7 +27,7 @@
# otherwise use 'lo' for this purpose. Assumes NUMA data has "lo" interface
# mapped to all cpu cores in the non-NUMA case. The numa_iface variable is
# in turn consumed by the systemd unit and config templates.
- if $::numa_networking {
+ if $::numa_networking != 'off' {
$numa_iface = $facts['interface_primary']
} else {
$numa_iface = 'lo'
--
To view, visit https://gerrit.wikimedia.org/r/362438
Gerrit-MessageType: merged
Gerrit-Change-Id: I11027be1b9bcb66bf82dba0cf69c9c034a1d114e
Gerrit-PatchSet: 4
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: BBlack <[email protected]>
Gerrit-Reviewer: BBlack <[email protected]>
Gerrit-Reviewer: Ema <[email protected]>
Gerrit-Reviewer: Giuseppe Lavagetto <[email protected]>
Gerrit-Reviewer: jenkins-bot <>