** Description changed: + SRU Justification: + + [Impact] + + * There is a wrong bucket calculation for payload of exactly 4096 bytes + in SMC stats counters. + + * SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB have these issues. + + * The impact is that a system silently updates an incorrect stats counter + in case the underlying kernel is not UBSAN enabled. + (Difficult to detect.) + + * But if a kernel is UBSAN enabled, one will see a UBSAN + 'array-index-out-of-bounds' warning every time this occurs, like: + [ 26.335381] UBSAN: array-index-out-of-bounds in /build/linux-O6Qi7m/linux-5.15.0/net/smc/af_smc.c:2402:3 + [ 26.335385] index -1 is out of range for type 'u64 [9]' + [ 26.335388] CPU: 0 PID: 274 Comm: iperf3 Tainted: G E 5.15.0-79-generic #86-Ubuntu + [ 26.335391] Hardware name: IBM 8561 T01 772 (KVM/Linux) + [ 26.335393] Call Trace: + [ 26.335397] [<00000000cd92e63a>] dump_stack_lvl+0x62/0x80 + [ 26.335404] [<00000000cd92e36c>] ubsan_epilogue+0x1c/0x48 + [ 26.335406] [<00000000cd52d3c4>] __ubsan_handle_out_of_bounds+0x94/0xa0 + [ 26.335411] [<000003ff8033f9da>] smc_sendmsg+0x2aa/0x2d0 [smc] + [ 26.335425] [<00000000cd6a79a4>] sock_sendmsg+0x64/0x80 + [ 26.335431] [<00000000cd6a7a32>] sock_write_iter+0x72/0xa0 + [ 26.335433] [<00000000cd1d4000>] new_sync_write+0x100/0x190 + [ 26.335438] [<00000000cd1d4bb8>] vfs_write+0x1e8/0x280 + [ 26.335440] [<00000000cd1d7014>] ksys_write+0xb4/0x100 + [ 26.335442] [<00000000cd932c7c>] __do_syscall+0x1bc/0x1f0 + [ 26.335446] [<00000000cd940148>] system_call+0x78/0xa0 + + [Fix] + + * a950a5921db4 a950a5921db450c74212327f69950ff03419483a "net/smc: Fix + pos miscalculation in statistics" + + [Test Plan] + + * Since this got identified while the livepatch for jammy patches got added, + one could run a simiar (or the same) test like mentioned at LP#1639924 + (but only jammy comes with official livepatch support). + + * Alternatively a dedicated SMC stats counters test with a payload of + exactly 4096 bytes could be done (which is probably easier): + + * Install uperf (https://uperf.org/ - https://github.com/uperf/uperf). + (Wondering if it makes sense to pick this up as Debian/Ubuntu package ?!) + + * Reset SMC-D stats on client and server side. + + * Start uperf at the server side using: # uperf -vs + + * Update profile with remote IP (server IP) + and start uperf at client: # uperf -vai 5 -m rr1c-4kx4k---1.xml + with uperf profile: + # cat rr1c-4kx4k---1.xml + <?xml version="1.0"?> + <profile name="TCP_RR"> + <group nprocs="1"> + <!--group nthreads="1"--> + <!-- if we want to run processes --> + <!--group nprocs="1"--> + <transaction iterations="1"> + <flowop type="connect" options="remotehost=<remote IP> protocol=tcp tcp_nodelay" /> + </transaction> + <transaction iterations="1"> + <flowop type="write" options="size=4096"/> + <flowop type="read" options="size=4096"/> + </transaction> + <transaction iterations="1"> + <flowop type="disconnect" /> + </transaction> + </group> + </profile> + + * The smcd stats output is: + # smcd -d stats reset + SMC-D Connections Summary + Total connections handled 2 + SMC connections 2 (client 2, server 0) + v1 0 + v2 2 + Handshake errors 0 (client 0, server 0) + Avg requests per SMC conn 14.0 + TCP fallback 0 (client 0, server 0) + RX Stats + Data transmitted (Bytes) 5796 (5.796K) + Total requests 9 + Buffer full 0 (0.00%) + Buffer downgrades 0 + Buffer reuses 0 + 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB + Bufs 0 0 0 2 0 0 0 1 + Reqs 8 0 0 0 0 0 0 0 + TX Stats + Data transmitted (Bytes) 9960 (9.960K) + Total requests 19 + Buffer full 0 (0.00%) + Buffer full (remote) 0 (0.00%) + Buffer too small 0 (0.00%) + Buffer too small (remote) 0 (0.00%) + Buffer downgrades 0 + Buffer reuses 0 + 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB + Bufs 0 2 0 0 0 0 0 0 + Reqs 18 0 0 0 0 0 0 1 + Extras + Special socket calls 0 + cork 0 + nodelay 0 + sendpage 0 + splice 0 + urgent data 0 + + * Instead of including the payload in the wrong >512 KB buckets, + the output should be to have 19 reqs in the 8 KB buckets for TX stats + and 9 reqs in the 8 KB bucket for RX stats. + + [Where problems could occur] + + * The changes are in common code /net/smc/smc_stats.h + hence any potential negativ impact is not limited to a specific platform. + but limited to statistics for shared memory communication (SMC) + hardware statistics. + + * But limited to the functions SMC_STAT_PAYLOAD_SUB and + SMC_STAT_RMB_SIZE_SUB. + + * The bucket (aka pos) calculation is not that trivial, + issues here could cause other calculation error, + that may lead to slightly different, but still wrong numbers. + + * Further issues can happen in case variables in use may overflow. + + * Issues can also happen if the ternary conditional operators + are not correctly used, which again may lead to wrong calculations. + + [Other Info] + + * This patch fixes upstream e0e4b8fa5338 + ("net/smc: Add SMC statistics support"). + + * Since the fix got upstream accepted with v6.6-rc6, + not only 22.04, but also 23.04 and 23.10 are affected. + (22.10 already reached it EOS). + + * Will not affected the current development release (N-release), + since it will at least be based on a 6.6 kernel. + __________ + Bug description by Nils H.: ------------ - Overview: + Overview: ------------ The following line in the SMC stats code (net/smc/smc_stats.h) caught my attention when using a payload of exactly 4096 bytes: #define SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) \ do { \ - typeof(_smc_stats) stats = (_smc_stats); \ - typeof(_tech) t = (_tech); \ - typeof(_len) l = (_len); \ - int _pos = fls64((l) >> 13); \ - typeof(_rc) r = (_rc); \ - int m = SMC_BUF_MAX - 1; \ - this_cpu_inc((*stats).smc[t].key ## _cnt); \ - if (r <= 0) \ - break; \ - _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \ <--- - this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \ - this_cpu_add((*stats).smc[t].key ## _bytes, r); \ + typeof(_smc_stats) stats = (_smc_stats); \ + typeof(_tech) t = (_tech); \ + typeof(_len) l = (_len); \ + int _pos = fls64((l) >> 13); \ + typeof(_rc) r = (_rc); \ + int m = SMC_BUF_MAX - 1; \ + this_cpu_inc((*stats).smc[t].key ## _cnt); \ + if (r <= 0) \ + break; \ + _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \ <--- + this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \ + this_cpu_add((*stats).smc[t].key ## _bytes, r); \ } \ while (0) - With l = 4096, _pos evaluates to -1. - Checking with the following uperf profile: - # cat rr1c-4kx4k---1.xml + # cat rr1c-4kx4k---1.xml <?xml version="1.0"?> <profile name="TCP_RR"> - <group nprocs="1"> - <!--group nthreads="1"--> - <!-- if we want to run processes --> - <!--group nprocs="1"--> - <transaction iterations="1"> - <flowop type="connect" options="remotehost=<remote IP> protocol=tcp tcp_nodelay" /> - </transaction> - <transaction iterations="1"> - <flowop type="write" options="size=4096"/> - <flowop type="read" options="size=4096"/> - </transaction> - <transaction iterations="1"> - <flowop type="disconnect" /> - </transaction> - </group> + <group nprocs="1"> + <!--group nthreads="1"--> + <!-- if we want to run processes --> + <!--group nprocs="1"--> + <transaction iterations="1"> + <flowop type="connect" options="remotehost=<remote IP> protocol=tcp tcp_nodelay" /> + </transaction> + <transaction iterations="1"> + <flowop type="write" options="size=4096"/> + <flowop type="read" options="size=4096"/> + </transaction> + <transaction iterations="1"> + <flowop type="disconnect" /> + </transaction> + </group> </profile> - smcd stats output: # smcd -d stats reset SMC-D Connections Summary - Total connections handled 2 - SMC connections 2 (client 2, server 0) - v1 0 - v2 2 - Handshake errors 0 (client 0, server 0) - Avg requests per SMC conn 14.0 - TCP fallback 0 (client 0, server 0) + Total connections handled 2 + SMC connections 2 (client 2, server 0) + v1 0 + v2 2 + Handshake errors 0 (client 0, server 0) + Avg requests per SMC conn 14.0 + TCP fallback 0 (client 0, server 0) RX Stats - Data transmitted (Bytes) 5796 (5.796K) - Total requests 9 - Buffer full 0 (0.00%) - Buffer downgrades 0 - Buffer reuses 0 - 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB - Bufs 0 0 0 2 0 0 0 1 - Reqs 8 0 0 0 0 0 0 0 + Data transmitted (Bytes) 5796 (5.796K) + Total requests 9 + Buffer full 0 (0.00%) + Buffer downgrades 0 + Buffer reuses 0 + 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB + Bufs 0 0 0 2 0 0 0 1 + Reqs 8 0 0 0 0 0 0 0 TX Stats - Data transmitted (Bytes) 9960 (9.960K) - Total requests 19 - Buffer full 0 (0.00%) - Buffer full (remote) 0 (0.00%) - Buffer too small 0 (0.00%) - Buffer too small (remote) 0 (0.00%) - Buffer downgrades 0 - Buffer reuses 0 - 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB - Bufs 0 2 0 0 0 0 0 0 - Reqs 18 0 0 0 0 0 0 1 + Data transmitted (Bytes) 9960 (9.960K) + Total requests 19 + Buffer full 0 (0.00%) + Buffer full (remote) 0 (0.00%) + Buffer too small 0 (0.00%) + Buffer too small (remote) 0 (0.00%) + Buffer downgrades 0 + Buffer reuses 0 + 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB + Bufs 0 2 0 0 0 0 0 0 + Reqs 18 0 0 0 0 0 0 1 Extras - Special socket calls 0 - cork 0 - nodelay 0 - sendpage 0 - splice 0 - urgent data 0 - - - Instead of including the payload in the wrong >512KB buckets, output should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs in the 8KB bucket for RX stats. - + Special socket calls 0 + cork 0 + nodelay 0 + sendpage 0 + splice 0 + urgent data 0 + + Instead of including the payload in the wrong >512KB buckets, output + should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs in + the 8KB bucket for RX stats. -------- - Repro: + Repro: -------- 0. Install uperf. 1. Reset SMC-D stats on client and server. 2. Start uperf at server side: "uperf -vs". 3. Update profile with remote IP (server IP) and start uperf at client: "uperf -vai 5 -m rr1c-4kx4k---1.xml" (uperf profile, see above)
-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2039575 Title: SMC stats: Wrong bucket calculation for payload of exactly 4096 bytes Status in Ubuntu on IBM z Systems: Fix Committed Status in linux package in Ubuntu: New Status in linux source package in Jammy: Fix Committed Status in linux source package in Lunar: Fix Committed Status in linux source package in Mantic: Fix Committed Bug description: SRU Justification: [Impact] * There is a wrong bucket calculation for payload of exactly 4096 bytes in SMC stats counters. * SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB have these issues. * The impact is that a system silently updates an incorrect stats counter in case the underlying kernel is not UBSAN enabled. (Difficult to detect.) * But if a kernel is UBSAN enabled, one will see a UBSAN 'array-index-out-of-bounds' warning every time this occurs, like: [ 26.335381] UBSAN: array-index-out-of-bounds in /build/linux-O6Qi7m/linux-5.15.0/net/smc/af_smc.c:2402:3 [ 26.335385] index -1 is out of range for type 'u64 [9]' [ 26.335388] CPU: 0 PID: 274 Comm: iperf3 Tainted: G E 5.15.0-79-generic #86-Ubuntu [ 26.335391] Hardware name: IBM 8561 T01 772 (KVM/Linux) [ 26.335393] Call Trace: [ 26.335397] [<00000000cd92e63a>] dump_stack_lvl+0x62/0x80 [ 26.335404] [<00000000cd92e36c>] ubsan_epilogue+0x1c/0x48 [ 26.335406] [<00000000cd52d3c4>] __ubsan_handle_out_of_bounds+0x94/0xa0 [ 26.335411] [<000003ff8033f9da>] smc_sendmsg+0x2aa/0x2d0 [smc] [ 26.335425] [<00000000cd6a79a4>] sock_sendmsg+0x64/0x80 [ 26.335431] [<00000000cd6a7a32>] sock_write_iter+0x72/0xa0 [ 26.335433] [<00000000cd1d4000>] new_sync_write+0x100/0x190 [ 26.335438] [<00000000cd1d4bb8>] vfs_write+0x1e8/0x280 [ 26.335440] [<00000000cd1d7014>] ksys_write+0xb4/0x100 [ 26.335442] [<00000000cd932c7c>] __do_syscall+0x1bc/0x1f0 [ 26.335446] [<00000000cd940148>] system_call+0x78/0xa0 [Fix] * a950a5921db4 a950a5921db450c74212327f69950ff03419483a "net/smc: Fix pos miscalculation in statistics" [Test Plan] * Since this got identified while the livepatch for jammy patches got added, one could run a simiar (or the same) test like mentioned at LP#1639924 (but only jammy comes with official livepatch support). * Alternatively a dedicated SMC stats counters test with a payload of exactly 4096 bytes could be done (which is probably easier): * Install uperf (https://uperf.org/ - https://github.com/uperf/uperf). (Wondering if it makes sense to pick this up as Debian/Ubuntu package ?!) * Reset SMC-D stats on client and server side. * Start uperf at the server side using: # uperf -vs * Update profile with remote IP (server IP) and start uperf at client: # uperf -vai 5 -m rr1c-4kx4k---1.xml with uperf profile: # cat rr1c-4kx4k---1.xml <?xml version="1.0"?> <profile name="TCP_RR"> <group nprocs="1"> <!--group nthreads="1"--> <!-- if we want to run processes --> <!--group nprocs="1"--> <transaction iterations="1"> <flowop type="connect" options="remotehost=<remote IP> protocol=tcp tcp_nodelay" /> </transaction> <transaction iterations="1"> <flowop type="write" options="size=4096"/> <flowop type="read" options="size=4096"/> </transaction> <transaction iterations="1"> <flowop type="disconnect" /> </transaction> </group> </profile> * The smcd stats output is: # smcd -d stats reset SMC-D Connections Summary Total connections handled 2 SMC connections 2 (client 2, server 0) v1 0 v2 2 Handshake errors 0 (client 0, server 0) Avg requests per SMC conn 14.0 TCP fallback 0 (client 0, server 0) RX Stats Data transmitted (Bytes) 5796 (5.796K) Total requests 9 Buffer full 0 (0.00%) Buffer downgrades 0 Buffer reuses 0 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB Bufs 0 0 0 2 0 0 0 1 Reqs 8 0 0 0 0 0 0 0 TX Stats Data transmitted (Bytes) 9960 (9.960K) Total requests 19 Buffer full 0 (0.00%) Buffer full (remote) 0 (0.00%) Buffer too small 0 (0.00%) Buffer too small (remote) 0 (0.00%) Buffer downgrades 0 Buffer reuses 0 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB Bufs 0 2 0 0 0 0 0 0 Reqs 18 0 0 0 0 0 0 1 Extras Special socket calls 0 cork 0 nodelay 0 sendpage 0 splice 0 urgent data 0 * Instead of including the payload in the wrong >512 KB buckets, the output should be to have 19 reqs in the 8 KB buckets for TX stats and 9 reqs in the 8 KB bucket for RX stats. [Where problems could occur] * The changes are in common code /net/smc/smc_stats.h hence any potential negativ impact is not limited to a specific platform. but limited to statistics for shared memory communication (SMC) hardware statistics. * But limited to the functions SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB. * The bucket (aka pos) calculation is not that trivial, issues here could cause other calculation error, that may lead to slightly different, but still wrong numbers. * Further issues can happen in case variables in use may overflow. * Issues can also happen if the ternary conditional operators are not correctly used, which again may lead to wrong calculations. [Other Info] * This patch fixes upstream e0e4b8fa5338 ("net/smc: Add SMC statistics support"). * Since the fix got upstream accepted with v6.6-rc6, not only 22.04, but also 23.04 and 23.10 are affected. (22.10 already reached it EOS). * Will not affected the current development release (N-release), since it will at least be based on a 6.6 kernel. __________ Bug description by Nils H.: ------------ Overview: ------------ The following line in the SMC stats code (net/smc/smc_stats.h) caught my attention when using a payload of exactly 4096 bytes: #define SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) \ do { \ typeof(_smc_stats) stats = (_smc_stats); \ typeof(_tech) t = (_tech); \ typeof(_len) l = (_len); \ int _pos = fls64((l) >> 13); \ typeof(_rc) r = (_rc); \ int m = SMC_BUF_MAX - 1; \ this_cpu_inc((*stats).smc[t].key ## _cnt); \ if (r <= 0) \ break; \ _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \ <--- this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \ this_cpu_add((*stats).smc[t].key ## _bytes, r); \ } \ while (0) With l = 4096, _pos evaluates to -1. Checking with the following uperf profile: # cat rr1c-4kx4k---1.xml <?xml version="1.0"?> <profile name="TCP_RR"> <group nprocs="1"> <!--group nthreads="1"--> <!-- if we want to run processes --> <!--group nprocs="1"--> <transaction iterations="1"> <flowop type="connect" options="remotehost=<remote IP> protocol=tcp tcp_nodelay" /> </transaction> <transaction iterations="1"> <flowop type="write" options="size=4096"/> <flowop type="read" options="size=4096"/> </transaction> <transaction iterations="1"> <flowop type="disconnect" /> </transaction> </group> </profile> smcd stats output: # smcd -d stats reset SMC-D Connections Summary Total connections handled 2 SMC connections 2 (client 2, server 0) v1 0 v2 2 Handshake errors 0 (client 0, server 0) Avg requests per SMC conn 14.0 TCP fallback 0 (client 0, server 0) RX Stats Data transmitted (Bytes) 5796 (5.796K) Total requests 9 Buffer full 0 (0.00%) Buffer downgrades 0 Buffer reuses 0 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB Bufs 0 0 0 2 0 0 0 1 Reqs 8 0 0 0 0 0 0 0 TX Stats Data transmitted (Bytes) 9960 (9.960K) Total requests 19 Buffer full 0 (0.00%) Buffer full (remote) 0 (0.00%) Buffer too small 0 (0.00%) Buffer too small (remote) 0 (0.00%) Buffer downgrades 0 Buffer reuses 0 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB Bufs 0 2 0 0 0 0 0 0 Reqs 18 0 0 0 0 0 0 1 Extras Special socket calls 0 cork 0 nodelay 0 sendpage 0 splice 0 urgent data 0 Instead of including the payload in the wrong >512KB buckets, output should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs in the 8KB bucket for RX stats. -------- Repro: -------- 0. Install uperf. 1. Reset SMC-D stats on client and server. 2. Start uperf at server side: "uperf -vs". 3. Update profile with remote IP (server IP) and start uperf at client: "uperf -vai 5 -m rr1c-4kx4k---1.xml" (uperf profile, see above) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-z-systems/+bug/2039575/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp