** Description changed:

  [Impact]
- CONFIG_NUMA_BALANCING and CONFIG_NUMA_BALANCING_DEFAULT_ENABLED were both set 
to =y in hwe-x/hwe-y. This changed to =n in hwe-z, unintentionally as far as I 
can tell. This can lead to performance degradation on NUMA-based arm64 systems 
when processes migrate, and their memory accesses now suffer additional latency.
+ CONFIG_NUMA_BALANCING and CONFIG_NUMA_BALANCING_DEFAULT_ENABLED were both set 
to =y in hwe-x/hwe-y. This changed to =n in hwe-z, unintentionally as far as I 
can tell. This can degrade performance on NUMA-based arm64 systems: when 
threads migrate, their memory accesses suffer additional latency.
  
  [Test Case]
  At a functional level:
  
- test -f /proc/sys/kernel/numa_balancing
+ $ test -f /proc/sys/kernel/numa_balancing
  
- Performance?
+ Performance:
+ 
+ $ perf bench numa -a
+ I didn't see any significant changes in the RAM-bw tests (expected).
+ For the convergence tests, I observed the following results, which all 
appear to be within reasonable variance.
+ 
+ Test     | Balancing=n | Balancing=y
+ -------------------------------------
+ 1x3      | No-Converge | No-Converge
+ 1x4      | No-Converge | 0.576s
+ 1x6      | No-Converge | No-Converge
+ 2x3      | No-Converge | No-Converge
+ 3x3      | No-Converge | No-Converge
+ 4x4      | No-Converge | No-Converge
+ 4x4-NOTHP| No-Converge | No-Converge
+ 4x6      | No-Converge | No-Converge
+ 4x8      | No-Converge | No-Converge
+ 8x4      | No-Converge | No-Converge
+ 8x4-NOTHP| No-Converge | No-Converge
+ 3x1      | 0.848s      | 1.212s
+ 4x1      | 0.832s      | 0.712s
+ 8x1      | 0.792s      | 0.649s
+ 16x1     | 1.511s      | 1.485s
+ 32x1     | 0.750s      | 0.899s
+ 
+ Finally, for the bw tests, I see significant improvements across the board:
+ Test      | BW Improvement
+ -------------------------
+ ======= Process =========
+ 2x1       |   2.2%
+ 3x1       |  61.4%
+ 4x1       |  25.0%
+ 8x1       | 104.6%
+ 8x1-NOTHP | 107.6%
+ 16x1      | 200.9%
+ ======= Thread ==========
+ 4x1       |  10.9%
+ 8x1       | 107.4%
+ 16x1      | 230.7%
+ 32x1      | 239.7%
+ 2x3       |  13.5%
+ 4x4       |  69.2%
+ 4x6       |  84.4%
+ 4x8       |  79.7%
+ 4x8-NOTHP | 152.5%
+ 3x3       |  96.1%
+ 5x5       | 150.2%
+ 2x16      | 122.6%
+ 1x32      |  40.5%
  
  [Regression Risk]
+ This changes a config only on arm64, so the regression risk is limited to 
that platform. The code we will be enabling on arm64 is already enabled on 
other architectures (!s390x), so it has already been tested within Ubuntu 
zesty. It was also previously enabled on arm64 in hwe-x/hwe-y, so we can gain 
some confidence from that.
+ 
+ There is certainly a possibility that this negatively impacts
+ performance for certain workloads on NUMA/arm64 systems. If that occurs,
+ the kernel.numa_balancing sysctl can be set to 0 to disable this feature.
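
For anyone who wants to check or back out the feature at runtime, here is a
minimal sketch. It assumes the standard /proc/sys/kernel/numa_balancing
interface; numa_balancing_state is a hypothetical helper name used only for
illustration, and the optional path argument exists purely so the check can be
exercised against a scratch file.

```shell
#!/bin/sh
# Sketch: report the state of automatic NUMA balancing.
# numa_balancing_state is a hypothetical helper, not part of any Ubuntu tool.
numa_balancing_state() {
    # Optional argument points the check at another file (e.g. for testing).
    path="${1:-/proc/sys/kernel/numa_balancing}"
    if [ ! -f "$path" ]; then
        # File is absent when the kernel lacks CONFIG_NUMA_BALANCING.
        echo "unavailable"
        return 0
    fi
    case "$(cat "$path")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

numa_balancing_state

# To turn the feature off at runtime (requires root):
#   sysctl -w kernel.numa_balancing=0
# To keep it off across reboots, a line such as
#   kernel.numa_balancing = 0
# can go in a file under /etc/sysctl.d/ (the file name is up to the admin).
```

On a kernel built with the config restored this should report "enabled" by
default; "unavailable" corresponds to the regressed hwe-z configuration.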

https://bugs.launchpad.net/bugs/1690914

Title:
  [Regression] NUMA_BALANCING disabled on arm64