** Description changed:

+ SRU Justification:
+ 
+ [Impact]
+ 
+  * There is a wrong bucket calculation for payload of exactly 4096 bytes
+    in SMC stats counters.
+ 
+  * SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB have these issues.
+ 
+  * The impact is that a system silently updates an incorrect stats counter
+    in case the underlying kernel is not UBSAN enabled.
+    (Difficult to detect.)
+ 
+  * But if a kernel is UBSAN enabled, one will see a UBSAN
+    'array-index-out-of-bounds' warning every time this occurs, like:
+    [ 26.335381] UBSAN: array-index-out-of-bounds in 
/build/linux-O6Qi7m/linux-5.15.0/net/smc/af_smc.c:2402:3
+    [ 26.335385] index -1 is out of range for type 'u64 [9]'
+    [ 26.335388] CPU: 0 PID: 274 Comm: iperf3 Tainted: G E 5.15.0-79-generic 
#86-Ubuntu
+    [ 26.335391] Hardware name: IBM 8561 T01 772 (KVM/Linux)
+    [ 26.335393] Call Trace:
+    [ 26.335397] [<00000000cd92e63a>] dump_stack_lvl+0x62/0x80
+    [ 26.335404] [<00000000cd92e36c>] ubsan_epilogue+0x1c/0x48
+    [ 26.335406] [<00000000cd52d3c4>] __ubsan_handle_out_of_bounds+0x94/0xa0
+    [ 26.335411] [<000003ff8033f9da>] smc_sendmsg+0x2aa/0x2d0 [smc]
+    [ 26.335425] [<00000000cd6a79a4>] sock_sendmsg+0x64/0x80
+    [ 26.335431] [<00000000cd6a7a32>] sock_write_iter+0x72/0xa0
+    [ 26.335433] [<00000000cd1d4000>] new_sync_write+0x100/0x190
+    [ 26.335438] [<00000000cd1d4bb8>] vfs_write+0x1e8/0x280
+    [ 26.335440] [<00000000cd1d7014>] ksys_write+0xb4/0x100
+    [ 26.335442] [<00000000cd932c7c>] __do_syscall+0x1bc/0x1f0
+    [ 26.335446] [<00000000cd940148>] system_call+0x78/0xa0
+ 
+ [Fix]
+ 
+  * a950a5921db4 a950a5921db450c74212327f69950ff03419483a "net/smc: Fix
+ pos miscalculation in statistics"
+ 
+ [Test Plan]
+ 
+  * Since this got identified while the livepatch for jammy patches got added,
+    one could run a simiar (or the same) test like mentioned at LP#1639924
+    (but only jammy comes with official livepatch support).
+ 
+  * Alternatively a dedicated SMC stats counters test with a payload of
+    exactly 4096 bytes could be done (which is probably easier):
+ 
+  * Install uperf (https://uperf.org/ - https://github.com/uperf/uperf).
+    (Wondering if it makes sense to pick this up as Debian/Ubuntu package ?!)
+ 
+  * Reset SMC-D stats on client and server side.
+ 
+  * Start uperf at the server side using: # uperf -vs
+ 
+  * Update profile with remote IP (server IP)
+    and start uperf at client: # uperf -vai 5 -m rr1c-4kx4k---1.xml
+    with uperf profile:
+    # cat rr1c-4kx4k---1.xml
+    <?xml version="1.0"?>
+    <profile name="TCP_RR">
+     <group nprocs="1">
+     <!--group nthreads="1"-->
+     <!-- if we want to run processes -->
+     <!--group nprocs="1"-->
+      <transaction iterations="1">
+       <flowop type="connect" options="remotehost=<remote IP> protocol=tcp 
tcp_nodelay" />
+      </transaction>
+      <transaction iterations="1">
+       <flowop type="write" options="size=4096"/>
+       <flowop type="read" options="size=4096"/>
+      </transaction>
+      <transaction iterations="1">
+       <flowop type="disconnect" />
+      </transaction>
+     </group>
+    </profile>
+  
+  * The smcd stats output is:
+    # smcd -d stats reset
+    SMC-D Connections Summary
+      Total connections handled 2
+      SMC connections 2 (client 2, server 0)
+        v1 0
+        v2 2
+      Handshake errors 0 (client 0, server 0)
+      Avg requests per SMC conn 14.0
+      TCP fallback 0 (client 0, server 0)
+    RX Stats
+      Data transmitted (Bytes) 5796 (5.796K)
+      Total requests 9
+      Buffer full 0 (0.00%)
+      Buffer downgrades 0
+      Buffer reuses 0
+                8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
+      Bufs 0 0 0 2 0 0 0 1
+      Reqs 8 0 0 0 0 0 0 0
+    TX Stats
+      Data transmitted (Bytes) 9960 (9.960K)
+      Total requests 19
+      Buffer full 0 (0.00%)
+      Buffer full (remote) 0 (0.00%)
+      Buffer too small 0 (0.00%)
+      Buffer too small (remote) 0 (0.00%)
+      Buffer downgrades 0
+      Buffer reuses 0
+                8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
+      Bufs 0 2 0 0 0 0 0 0
+      Reqs 18 0 0 0 0 0 0 1
+    Extras
+      Special socket calls 0
+        cork 0
+        nodelay 0
+        sendpage 0
+        splice 0
+        urgent data 0
+ 
+  * Instead of including the payload in the wrong >512 KB buckets,
+    the output should be to have 19 reqs in the 8 KB buckets for TX stats
+    and 9 reqs in the 8 KB bucket for RX stats.
+ 
+ [Where problems could occur]
+ 
+  * The changes are in common code /net/smc/smc_stats.h
+    hence any potential negativ impact is not limited to a specific platform.
+    but limited to statistics for shared memory communication (SMC)
+    hardware statistics.
+ 
+  * But limited to the functions SMC_STAT_PAYLOAD_SUB and
+    SMC_STAT_RMB_SIZE_SUB.
+ 
+  * The bucket (aka pos) calculation is not that trivial,
+    issues here could cause other calculation error,
+    that may lead to slightly different, but still wrong numbers.
+ 
+  * Further issues can happen in case variables in use may overflow.
+ 
+  * Issues can also happen if the ternary conditional operators
+    are not correctly used, which again may lead to wrong calculations.
+ 
+ [Other Info]
+ 
+  * This patch fixes upstream e0e4b8fa5338
+    ("net/smc: Add SMC statistics support").
+ 
+  * Since the fix got upstream accepted with v6.6-rc6,
+    not only 22.04, but also 23.04 and 23.10 are affected.
+    (22.10 already reached it EOS).
+ 
+  * Will not affected the current development release (N-release),
+    since it will at least be based on a 6.6 kernel.
+ __________
+ 
  Bug description by Nils H.:
  
  ------------
-  Overview: 
+  Overview:
  ------------
  The following line in the SMC stats code (net/smc/smc_stats.h) caught my 
attention when using a payload of exactly 4096 bytes:
  
  #define SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) \
  do { \
-       typeof(_smc_stats) stats = (_smc_stats); \
-       typeof(_tech) t = (_tech); \
-       typeof(_len) l = (_len); \
-       int _pos = fls64((l) >> 13); \
-       typeof(_rc) r = (_rc); \
-       int m = SMC_BUF_MAX - 1; \
-       this_cpu_inc((*stats).smc[t].key ## _cnt); \
-       if (r <= 0) \
-               break; \
-       _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \  
<---
-       this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \
-       this_cpu_add((*stats).smc[t].key ## _bytes, r); \
+  typeof(_smc_stats) stats = (_smc_stats); \
+  typeof(_tech) t = (_tech); \
+  typeof(_len) l = (_len); \
+  int _pos = fls64((l) >> 13); \
+  typeof(_rc) r = (_rc); \
+  int m = SMC_BUF_MAX - 1; \
+  this_cpu_inc((*stats).smc[t].key ## _cnt); \
+  if (r <= 0) \
+   break; \
+  _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \  <---
+  this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \
+  this_cpu_add((*stats).smc[t].key ## _bytes, r); \
  } \
  while (0)
  
- 
  With l = 4096, _pos evaluates to -1.
  
- 
  Checking with the following uperf profile:
- # cat rr1c-4kx4k---1.xml 
+ # cat rr1c-4kx4k---1.xml
  <?xml version="1.0"?>
  <profile name="TCP_RR">
-       <group nprocs="1">
-       <!--group nthreads="1"-->
-       <!-- if we want to run processes -->
-       <!--group nprocs="1"--> 
-               <transaction iterations="1">
-                       <flowop type="connect" options="remotehost=<remote IP> 
protocol=tcp  tcp_nodelay" />
-               </transaction>
-               <transaction iterations="1">
-                       <flowop type="write" options="size=4096"/>
-                       <flowop type="read" options="size=4096"/>
-               </transaction>
-               <transaction iterations="1">
-                       <flowop type="disconnect" />
-               </transaction>
-       </group>
+  <group nprocs="1">
+  <!--group nthreads="1"-->
+  <!-- if we want to run processes -->
+  <!--group nprocs="1"-->
+   <transaction iterations="1">
+    <flowop type="connect" options="remotehost=<remote IP> protocol=tcp  
tcp_nodelay" />
+   </transaction>
+   <transaction iterations="1">
+    <flowop type="write" options="size=4096"/>
+    <flowop type="read" options="size=4096"/>
+   </transaction>
+   <transaction iterations="1">
+    <flowop type="disconnect" />
+   </transaction>
+  </group>
  </profile>
- 
  
  smcd stats output:
  # smcd -d stats reset
  SMC-D Connections Summary
-   Total connections handled             2
-   SMC connections                       2 (client 2, server 0)
-     v1                                  0
-     v2                                  2
-   Handshake errors                      0 (client 0, server 0)
-   Avg requests per SMC conn            14.0
-   TCP fallback                          0 (client 0, server 0)
+   Total connections handled             2
+   SMC connections                       2 (client 2, server 0)
+     v1                                  0
+     v2                                  2
+   Handshake errors                      0 (client 0, server 0)
+   Avg requests per SMC conn            14.0
+   TCP fallback                          0 (client 0, server 0)
  
  RX Stats
-   Data transmitted (Bytes)           5796 (5.796K)
-   Total requests                        9
-   Buffer full                           0 (0.00%)
-   Buffer downgrades                     0
-   Buffer reuses                         0
-             8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
-   Bufs        0       0       0       2       0       0       0       1
-   Reqs        8       0       0       0       0       0       0       0
+   Data transmitted (Bytes)           5796 (5.796K)
+   Total requests                        9
+   Buffer full                           0 (0.00%)
+   Buffer downgrades                     0
+   Buffer reuses                         0
+             8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
+   Bufs        0       0       0       2       0       0       0       1
+   Reqs        8       0       0       0       0       0       0       0
  
  TX Stats
-   Data transmitted (Bytes)           9960 (9.960K)
-   Total requests                       19
-   Buffer full                           0 (0.00%)
-   Buffer full (remote)                  0 (0.00%)
-   Buffer too small                      0 (0.00%)
-   Buffer too small (remote)             0 (0.00%)
-   Buffer downgrades                     0
-   Buffer reuses                         0
-             8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
-   Bufs        0       2       0       0       0       0       0       0
-   Reqs       18       0       0       0       0       0       0       1
+   Data transmitted (Bytes)           9960 (9.960K)
+   Total requests                       19
+   Buffer full                           0 (0.00%)
+   Buffer full (remote)                  0 (0.00%)
+   Buffer too small                      0 (0.00%)
+   Buffer too small (remote)             0 (0.00%)
+   Buffer downgrades                     0
+   Buffer reuses                         0
+             8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
+   Bufs        0       2       0       0       0       0       0       0
+   Reqs       18       0       0       0       0       0       0       1
  
  Extras
-   Special socket calls                  0
-     cork                                0
-     nodelay                             0
-     sendpage                            0
-     splice                              0
-     urgent data                         0
- 
- 
- Instead of including the payload in the wrong >512KB buckets, output should 
be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs in the 8KB bucket 
for RX stats.
- 
+   Special socket calls                  0
+     cork                                0
+     nodelay                             0
+     sendpage                            0
+     splice                              0
+     urgent data                         0
+ 
+ Instead of including the payload in the wrong >512KB buckets, output
+ should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs in
+ the 8KB bucket for RX stats.
  
  --------
-  Repro:
+  Repro:
  --------
  0. Install uperf.
  1. Reset SMC-D stats on client and server.
  2. Start uperf at server side: "uperf -vs".
  3. Update profile with remote IP (server IP) and start uperf at client: 
"uperf -vai 5 -m rr1c-4kx4k---1.xml" (uperf profile, see above)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2039575

Title:
  SMC stats: Wrong bucket calculation for payload of exactly 4096 bytes

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  New
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed
Status in linux source package in Mantic:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]

   * There is a wrong bucket calculation for payload of exactly 4096 bytes
     in SMC stats counters.

   * SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB have these issues.

   * The impact is that a system silently updates an incorrect stats counter
     in case the underlying kernel is not UBSAN enabled.
     (Difficult to detect.)

   * But if a kernel is UBSAN enabled, one will see a UBSAN
     'array-index-out-of-bounds' warning every time this occurs, like:
     [ 26.335381] UBSAN: array-index-out-of-bounds in 
/build/linux-O6Qi7m/linux-5.15.0/net/smc/af_smc.c:2402:3
     [ 26.335385] index -1 is out of range for type 'u64 [9]'
     [ 26.335388] CPU: 0 PID: 274 Comm: iperf3 Tainted: G E 5.15.0-79-generic 
#86-Ubuntu
     [ 26.335391] Hardware name: IBM 8561 T01 772 (KVM/Linux)
     [ 26.335393] Call Trace:
     [ 26.335397] [<00000000cd92e63a>] dump_stack_lvl+0x62/0x80
     [ 26.335404] [<00000000cd92e36c>] ubsan_epilogue+0x1c/0x48
     [ 26.335406] [<00000000cd52d3c4>] __ubsan_handle_out_of_bounds+0x94/0xa0
     [ 26.335411] [<000003ff8033f9da>] smc_sendmsg+0x2aa/0x2d0 [smc]
     [ 26.335425] [<00000000cd6a79a4>] sock_sendmsg+0x64/0x80
     [ 26.335431] [<00000000cd6a7a32>] sock_write_iter+0x72/0xa0
     [ 26.335433] [<00000000cd1d4000>] new_sync_write+0x100/0x190
     [ 26.335438] [<00000000cd1d4bb8>] vfs_write+0x1e8/0x280
     [ 26.335440] [<00000000cd1d7014>] ksys_write+0xb4/0x100
     [ 26.335442] [<00000000cd932c7c>] __do_syscall+0x1bc/0x1f0
     [ 26.335446] [<00000000cd940148>] system_call+0x78/0xa0

  [Fix]

   * a950a5921db4 a950a5921db450c74212327f69950ff03419483a "net/smc: Fix
  pos miscalculation in statistics"

  [Test Plan]

   * Since this got identified while the livepatch for jammy patches got added,
     one could run a simiar (or the same) test like mentioned at LP#1639924
     (but only jammy comes with official livepatch support).

   * Alternatively a dedicated SMC stats counters test with a payload of
     exactly 4096 bytes could be done (which is probably easier):

   * Install uperf (https://uperf.org/ - https://github.com/uperf/uperf).
     (Wondering if it makes sense to pick this up as Debian/Ubuntu package ?!)

   * Reset SMC-D stats on client and server side.

   * Start uperf at the server side using: # uperf -vs

   * Update profile with remote IP (server IP)
     and start uperf at client: # uperf -vai 5 -m rr1c-4kx4k---1.xml
     with uperf profile:
     # cat rr1c-4kx4k---1.xml
     <?xml version="1.0"?>
     <profile name="TCP_RR">
      <group nprocs="1">
      <!--group nthreads="1"-->
      <!-- if we want to run processes -->
      <!--group nprocs="1"-->
       <transaction iterations="1">
        <flowop type="connect" options="remotehost=<remote IP> protocol=tcp 
tcp_nodelay" />
       </transaction>
       <transaction iterations="1">
        <flowop type="write" options="size=4096"/>
        <flowop type="read" options="size=4096"/>
       </transaction>
       <transaction iterations="1">
        <flowop type="disconnect" />
       </transaction>
      </group>
     </profile>
   
   * The smcd stats output is:
     # smcd -d stats reset
     SMC-D Connections Summary
       Total connections handled 2
       SMC connections 2 (client 2, server 0)
         v1 0
         v2 2
       Handshake errors 0 (client 0, server 0)
       Avg requests per SMC conn 14.0
       TCP fallback 0 (client 0, server 0)
     RX Stats
       Data transmitted (Bytes) 5796 (5.796K)
       Total requests 9
       Buffer full 0 (0.00%)
       Buffer downgrades 0
       Buffer reuses 0
                 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
       Bufs 0 0 0 2 0 0 0 1
       Reqs 8 0 0 0 0 0 0 0
     TX Stats
       Data transmitted (Bytes) 9960 (9.960K)
       Total requests 19
       Buffer full 0 (0.00%)
       Buffer full (remote) 0 (0.00%)
       Buffer too small 0 (0.00%)
       Buffer too small (remote) 0 (0.00%)
       Buffer downgrades 0
       Buffer reuses 0
                 8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
       Bufs 0 2 0 0 0 0 0 0
       Reqs 18 0 0 0 0 0 0 1
     Extras
       Special socket calls 0
         cork 0
         nodelay 0
         sendpage 0
         splice 0
         urgent data 0

   * Instead of including the payload in the wrong >512 KB buckets,
     the output should be to have 19 reqs in the 8 KB buckets for TX stats
     and 9 reqs in the 8 KB bucket for RX stats.

  [Where problems could occur]

   * The changes are in common code /net/smc/smc_stats.h
     hence any potential negativ impact is not limited to a specific platform.
     but limited to statistics for shared memory communication (SMC)
     hardware statistics.

   * But limited to the functions SMC_STAT_PAYLOAD_SUB and
     SMC_STAT_RMB_SIZE_SUB.

   * The bucket (aka pos) calculation is not that trivial,
     issues here could cause other calculation error,
     that may lead to slightly different, but still wrong numbers.

   * Further issues can happen in case variables in use may overflow.

   * Issues can also happen if the ternary conditional operators
     are not correctly used, which again may lead to wrong calculations.

  [Other Info]

   * This patch fixes upstream e0e4b8fa5338
     ("net/smc: Add SMC statistics support").

   * Since the fix got upstream accepted with v6.6-rc6,
     not only 22.04, but also 23.04 and 23.10 are affected.
     (22.10 already reached it EOS).

   * Will not affected the current development release (N-release),
     since it will at least be based on a 6.6 kernel.
  __________

  Bug description by Nils H.:

  ------------
   Overview:
  ------------
  The following line in the SMC stats code (net/smc/smc_stats.h) caught my 
attention when using a payload of exactly 4096 bytes:

  #define SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) \
  do { \
   typeof(_smc_stats) stats = (_smc_stats); \
   typeof(_tech) t = (_tech); \
   typeof(_len) l = (_len); \
   int _pos = fls64((l) >> 13); \
   typeof(_rc) r = (_rc); \
   int m = SMC_BUF_MAX - 1; \
   this_cpu_inc((*stats).smc[t].key ## _cnt); \
   if (r <= 0) \
    break; \
   _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \  <---
   this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \
   this_cpu_add((*stats).smc[t].key ## _bytes, r); \
  } \
  while (0)

  With l = 4096, _pos evaluates to -1.

  Checking with the following uperf profile:
  # cat rr1c-4kx4k---1.xml
  <?xml version="1.0"?>
  <profile name="TCP_RR">
   <group nprocs="1">
   <!--group nthreads="1"-->
   <!-- if we want to run processes -->
   <!--group nprocs="1"-->
    <transaction iterations="1">
     <flowop type="connect" options="remotehost=<remote IP> protocol=tcp  
tcp_nodelay" />
    </transaction>
    <transaction iterations="1">
     <flowop type="write" options="size=4096"/>
     <flowop type="read" options="size=4096"/>
    </transaction>
    <transaction iterations="1">
     <flowop type="disconnect" />
    </transaction>
   </group>
  </profile>

  smcd stats output:
  # smcd -d stats reset
  SMC-D Connections Summary
    Total connections handled             2
    SMC connections                       2 (client 2, server 0)
      v1                                  0
      v2                                  2
    Handshake errors                      0 (client 0, server 0)
    Avg requests per SMC conn            14.0
    TCP fallback                          0 (client 0, server 0)

  RX Stats
    Data transmitted (Bytes)           5796 (5.796K)
    Total requests                        9
    Buffer full                           0 (0.00%)
    Buffer downgrades                     0
    Buffer reuses                         0
              8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
    Bufs        0       0       0       2       0       0       0       1
    Reqs        8       0       0       0       0       0       0       0

  TX Stats
    Data transmitted (Bytes)           9960 (9.960K)
    Total requests                       19
    Buffer full                           0 (0.00%)
    Buffer full (remote)                  0 (0.00%)
    Buffer too small                      0 (0.00%)
    Buffer too small (remote)             0 (0.00%)
    Buffer downgrades                     0
    Buffer reuses                         0
              8KB    16KB    32KB    64KB   128KB   256KB   512KB  >512KB
    Bufs        0       2       0       0       0       0       0       0
    Reqs       18       0       0       0       0       0       0       1

  Extras
    Special socket calls                  0
      cork                                0
      nodelay                             0
      sendpage                            0
      splice                              0
      urgent data                         0

  Instead of including the payload in the wrong >512KB buckets, output
  should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs
  in the 8KB bucket for RX stats.

  --------
   Repro:
  --------
  0. Install uperf.
  1. Reset SMC-D stats on client and server.
  2. Start uperf at server side: "uperf -vs".
  3. Update profile with remote IP (server IP) and start uperf at client: 
"uperf -vai 5 -m rr1c-4kx4k---1.xml" (uperf profile, see above)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2039575/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to