https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274994
Bug ID: 274994
Summary: Regression of iperf3 network throughput tests with erms "rep movsb" copyto loops
Product: Base System
Version: CURRENT
Hardware: amd64
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: [email protected]
Reporter: [email protected]
Initially, a missing patch (6210ac95a194) was thought to be responsible, but
the results below are reproducible even with that patch applied.
The erms codepath was excluded by patching copyout.c as shown below, so that
only copyin_smap_std or copyin_nosmap_std would be used. However, smap is
disabled elsewhere too, so this collapses to a comparison between
copyin_nosmap_std and copyin_nosmap_erms:
> switch (cpu_stdext_feature & (CPUID_STDEXT_SMAP | CPUID_STDEXT_ERMS))
> {
> case CPUID_STDEXT_SMAP:
> return (copyin_smap_std);
> + #if 0
> case CPUID_STDEXT_ERMS:
> return (copyin_nosmap_erms);
> case CPUID_STDEXT_SMAP | CPUID_STDEXT_ERMS:
> return (copyin_smap_erms);
> + #else
> + case CPUID_STDEXT_SMAP | CPUID_STDEXT_ERMS:
> + return (copyin_smap_std);
> + #endif
> default:
> return (copyin_nosmap_std);
> }
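The effect of the patch on the selection can be sketched in user space. This is a minimal illustration, not the kernel code: the SKETCH_* constants and both function names are hypothetical stand-ins (the bit values are assumed to follow CPUID leaf 7 EBX, bit 9 for ERMS and bit 20 for SMAP), and the functions return only the name of the variant that would be chosen.

```c
/*
 * Hypothetical stand-ins for the kernel feature bits; the values are an
 * assumption based on CPUID leaf 7 EBX (ERMS = bit 9, SMAP = bit 20).
 */
#define SKETCH_STDEXT_ERMS 0x00000200u
#define SKETCH_STDEXT_SMAP 0x00100000u

/* Selection as in the unpatched switch: erms variants are reachable. */
static const char *
original_choice(unsigned int features)
{
	switch (features & (SKETCH_STDEXT_SMAP | SKETCH_STDEXT_ERMS)) {
	case SKETCH_STDEXT_SMAP:
		return ("copyin_smap_std");
	case SKETCH_STDEXT_ERMS:
		return ("copyin_nosmap_erms");
	case SKETCH_STDEXT_SMAP | SKETCH_STDEXT_ERMS:
		return ("copyin_smap_erms");
	default:
		return ("copyin_nosmap_std");
	}
}

/* Selection with the erms cases compiled out, as in the quoted patch. */
static const char *
patched_choice(unsigned int features)
{
	switch (features & (SKETCH_STDEXT_SMAP | SKETCH_STDEXT_ERMS)) {
	case SKETCH_STDEXT_SMAP:
	case SKETCH_STDEXT_SMAP | SKETCH_STDEXT_ERMS:
		return ("copyin_smap_std");
	default:
		/* ERMS-only now falls through to the plain variant. */
		return ("copyin_nosmap_std");
	}
}
```

With only the ERMS bit set (the smap-disabled case described above), original_choice() yields copyin_nosmap_erms and patched_choice() yields copyin_nosmap_std, which is the pair being compared in the measurements.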
On Sapphire Rapids
Xeon Gold 5416S (processor ID 0x806f7) (4th Gen)
with erms "rep movsb":
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 83.6 GBytes 35.9 Gbits/sec 0 sender
> [ 4] 0.00-20.04 sec 83.6 GBytes 35.9 Gbits/sec receiver
replacing the erms with no-smap,no-erms copyto:
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 86.7 GBytes 37.2 Gbits/sec 147 sender
> [ 4] 0.00-20.04 sec 86.7 GBytes 37.2 Gbits/sec receiver
Xeon Gold 6438N (processor ID 0x806f7) (4th Gen)
with erms "rep movsb":
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 1.55 GBytes 664 Mbits/sec 0 sender
> [ 4] 0.00-20.05 sec 1.54 GBytes 661 Mbits/sec receiver
replacing the erms with no-smap,no-erms copyto:
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 91.9 GBytes 39.5 Gbits/sec 0 sender
> [ 4] 0.00-20.05 sec 91.9 GBytes 39.4 Gbits/sec receiver
On Ice Lake
Xeon Platinum 8352Y (processor ID 0x606a6) (3rd Gen)
with erms "rep movsb":
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 33.6 GBytes 14.5 Gbits/sec 0 sender
> [ 4] 0.00-20.04 sec 33.6 GBytes 14.4 Gbits/sec receiver
replacing the erms with no-smap,no-erms copyto:
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 4] 0.00-20.00 sec 76.1 GBytes 32.7 Gbits/sec 0 sender
> [ 4] 0.00-20.03 sec 76.1 GBytes 32.6 Gbits/sec receiver
Similar performance degradation was observed on many older platforms when using
the erms codepath with the iperf3 testing utility, with a typical performance
impact of 40% to 60%.
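For reference, the shortfall of each erms run relative to the corresponding non-erms run can be computed from the sender bitrates quoted above. This is only a sketch of the arithmetic: the struct, array, and function names are hypothetical, and 664 Mbits/sec is entered as 0.664 Gbits/sec.

```c
/* Sender bitrates copied from the iperf3 output above, in Gbits/sec. */
struct sample {
	const char *cpu;
	double erms_gbits;	/* with erms "rep movsb" */
	double std_gbits;	/* with the no-smap, no-erms routine */
};

static const struct sample samples[] = {
	{ "Xeon Gold 5416S",     35.9,  37.2 },
	{ "Xeon Gold 6438N",     0.664, 39.5 },
	{ "Xeon Platinum 8352Y", 14.5,  32.7 },
};

/* Percentage by which the erms run falls short of the non-erms run. */
static double
erms_shortfall_percent(const struct sample *s)
{
	return ((s->std_gbits - s->erms_gbits) / s->std_gbits * 100.0);
}
```

For the three systems listed this gives roughly 3.5%, 98.3%, and 55.7% respectively.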