On 21.4.2021. 23:28, Hrvoje Popovski wrote:
> On 21.4.2021. 21:36, Alexander Bluhm wrote:
>> Hi,
>>
>> For a while now we have been running the network stack without the
>> kernel lock, but with a network lock.  The latter is an exclusive
>> sleeping rwlock.
>>
>> It is possible to run the forwarding path in parallel on multiple
>> cores.  I use ix(4) interfaces which provide one input queue for
>> each CPU.  For that we have to start multiple softnet tasks and
>> replace the exclusive lock with a shared lock.  This works for IP
>> and IPv6 input and forwarding, but not for higher protocols.
>>
>> So I implemented a queue between IP and higher layers.  We had that
>> before when we were using netlock for IP and kernel lock for TCP.
>> Now we have a shared lock for IP and an exclusive lock for TCP.  By
>> using a queue, we can upgrade the lock once for multiple packets.
>>
>> As you can see here, forwarding performance doubles from 4.5x10^9
>> to 9x10^9.  Left column is current, right column is with my diff.
>> The other dots at 2x10^9 are with socket splicing, which is not
>> affected.
>> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png
>>
>> Here are all numbers with various network tests.
>> http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html
>> TCP performance gets less deterministic due to the additional queue.
>>
>> The kernel stack flame graph looks like this.  The machine uses 4 CPUs.
>> http://bluhm.genua.de/files/kstack-multiqueue-forward.svg
>>
>> Note the kernel lock around nd6_resolve().  I had to put it there
>> as I have seen an MP-related crash there.  This can be fixed
>> independently of this diff.
>>
>> We need more MP pressure to find such bugs and races.  I think now
>> is a good time to give this diff broader testing and commit it.
>> You need interfaces with multiple queues to see a difference.
>>
>> ok?
> Hi,
> 
> With this diff I'm getting a panic when I push traffic over that box.
> This is plain forwarding. Should I compile with witness?
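
For anyone following the thread, the idea in the quoted mail of putting a queue between IP and the higher layers, so that the exclusive lock is taken once for several packets, can be pictured roughly as below. This is only a minimal userland sketch, not code from the diff: `mock_lock`, `pkt_queue`, `pkt_enqueue`, `pkt_drain` and `count_delivery` are hypothetical names, packets are plain integers instead of mbufs, and the "lock" merely counts exclusive acquisitions instead of being a real rwlock.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8

/* Hypothetical stand-in for an rwlock: it only counts exclusive acquisitions. */
struct mock_lock {
	int excl_held;
	unsigned excl_acquisitions;
};

/* Hypothetical packet queue; real code would hold mbuf pointers. */
struct pkt_queue {
	int pkts[QLEN];
	size_t len;
};

static void
lock_excl(struct mock_lock *l)
{
	assert(!l->excl_held);
	l->excl_held = 1;
	l->excl_acquisitions++;
}

static void
unlock_excl(struct mock_lock *l)
{
	assert(l->excl_held);
	l->excl_held = 0;
}

/* IP side: defer a packet for the higher layer. Returns 0, or -1 if full. */
static int
pkt_enqueue(struct pkt_queue *q, int pkt)
{
	if (q->len == QLEN)
		return -1;
	q->pkts[q->len++] = pkt;
	return 0;
}

/*
 * Higher-layer side: take the exclusive lock once and deliver every
 * queued packet under it. Returns the number of packets delivered.
 */
static size_t
pkt_drain(struct pkt_queue *q, struct mock_lock *l, void (*deliver)(int))
{
	size_t i, n;

	lock_excl(l);
	n = q->len;
	for (i = 0; i < n; i++)
		deliver(q->pkts[i]);
	q->len = 0;
	unlock_excl(l);
	return n;
}

/* Example delivery callback that just counts packets. */
static int delivered;

static void
count_delivery(int pkt)
{
	(void)pkt;
	delivered++;
}
```

Enqueuing five packets and then calling `pkt_drain()` once delivers all five while the acquisition counter goes up by only one, which is the amortization the queue is meant to buy.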


With witness:

x3550m4# panic: pool_cache_item_magic_check: mbufpl cpu free list modified: item addr 0xfffffd8066b5e500+16 0xfffffd8066b5e570!=0x1474deeb99bfdf06
Stopped at      db_enter+0x10:  popq    %rbp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*211939  58019      0     0x14000      0x200    1  softnet
 173790  68166      0     0x14000      0x200    3  softnet
  45539  46127      0     0x14000      0x200    2  softnet
 358228  28782      0     0x14000      0x200    4  softnet
db_enter() at db_enter+0x10
panic(ffffffff81df726e) at panic+0x12a
pool_cache_get(ffffffff82203378) at pool_cache_get+0x25b
pool_get(ffffffff82203378,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xdd
ixgbe_get_buf(ffff80000015c9f8,a) at ixgbe_get_buf+0xa3
ixgbe_rxfill(ffff80000015c9f8) at ixgbe_rxfill+0x93
ixgbe_queue_intr(ffff80000011aec0) at ixgbe_queue_intr+0x4f
intr_handler(ffff800026df9740,ffff8000000cc500) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
ip_forward(fffffd8066b58400,ffff80000015a048,fffffd878909fa80,0) at ip_forward+0x1de
ip_input_if(ffff800026df9a38,ffff800026df9a44,4,0,ffff80000015a048) at ip_input_if+0x608
ipv4_input(ffff80000015a048,fffffd8066b58400) at ipv4_input+0x39
if_input_process(ffff80000015a048,ffff800026df9ab8) at if_input_process+0x6f
end trace frame: 0xffff800026df9b00, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.


ddb{1}> show panic
pool_cache_item_magic_check: mbufpl cpu free list modified: item addr 0xfffffd8066b5e500+16 0xfffffd8066b5e570!=0x1474deeb99bfdf06
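
The panic message reports that a word at byte offset 16 of an item on the mbufpl free list no longer holds the expected check value. The value found, 0xfffffd8066b5e570, is the item address plus 0x70, which is consistent with (though not proof of) something writing to the item after it was freed. The general technique behind such a check can be sketched as follows. This is a hypothetical illustration of free-list poisoning, not the pool(9) implementation: the item layout, `poison_for`, `item_poison`, `item_check` and the secret value are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_WORDS 8

/* Hypothetical free item: word 0 is a free-list link, the rest is poison. */
struct item {
	uintptr_t words[ITEM_WORDS];
};

/* Expected poison: a per-pool secret mixed with the item's own address. */
static uintptr_t
poison_for(const struct item *it, uintptr_t secret)
{
	return secret ^ (uintptr_t)it;
}

/* Called when the item is put on the free list. */
static void
item_poison(struct item *it, uintptr_t secret)
{
	size_t i;

	it->words[0] = 0;	/* link field, not checked */
	for (i = 1; i < ITEM_WORDS; i++)
		it->words[i] = poison_for(it, secret);
}

/*
 * Called when the item is taken off the free list. Returns -1 if the
 * poison is intact, otherwise the byte offset of the first modified word.
 */
static long
item_check(const struct item *it, uintptr_t secret)
{
	size_t i;

	for (i = 1; i < ITEM_WORDS; i++) {
		if (it->words[i] != poison_for(it, secret))
			return (long)(i * sizeof(uintptr_t));
	}
	return -1;
}
```

A write to the third word of a freed item makes `item_check()` return `2 * sizeof(uintptr_t)`, which is 16 on a 64-bit machine and matches the shape of the "+16" in the report above.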


ddb{1}> trace
db_enter() at db_enter+0x10
panic(ffffffff81df726e) at panic+0x12a
pool_cache_get(ffffffff82203378) at pool_cache_get+0x25b
pool_get(ffffffff82203378,2) at pool_get+0x5e
m_clget(0,2,802) at m_clget+0xdd
ixgbe_get_buf(ffff80000015c9f8,a) at ixgbe_get_buf+0xa3
ixgbe_rxfill(ffff80000015c9f8) at ixgbe_rxfill+0x93
ixgbe_queue_intr(ffff80000011aec0) at ixgbe_queue_intr+0x4f
intr_handler(ffff800026df9740,ffff8000000cc500) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
ip_forward(fffffd8066b58400,ffff80000015a048,fffffd878909fa80,0) at ip_forward+0x1de
ip_input_if(ffff800026df9a38,ffff800026df9a44,4,0,ffff80000015a048) at ip_input_if+0x608
ipv4_input(ffff80000015a048,fffffd8066b58400) at ipv4_input+0x39
if_input_process(ffff80000015a048,ffff800026df9ab8) at if_input_process+0x6f
ifiq_process(ffff80000015ef00) at ifiq_process+0x69
taskq_thread(ffff800000030300) at taskq_thread+0x9f
end trace frame: 0x0, count: -16


ddb{1}> show locks
shared rwlock netlock r = 0 (0xffffffff82119770)
#0  witness_lock+0x339
#1  if_input_process+0x43
#2  ifiq_process+0x69
#3  taskq_thread+0x9f
#4  proc_trampoline+0x1c
shared rwlock softnet r = 0 (0xffff800000030370)
#0  witness_lock+0x339
#1  taskq_thread+0x92
#2  proc_trampoline+0x1c


ddb{1}> show all locks
CPU 3:
exclusive mutex softnet r = 0 (0xffff800000030228)
#0  witness_lock+0x339
#1  mtx_enter_try+0x95
#2  mtx_enter+0x48
#3  msleep+0xe5
#4  taskq_next_work+0x61
#5  taskq_thread+0xce
#6  proc_trampoline+0x1c
Process 58019 (softnet) thread 0xffff800026dc87e0 (211939)
shared rwlock netlock r = 0 (0xffffffff82119770)
#0  witness_lock+0x339
#1  if_input_process+0x43
#2  ifiq_process+0x69
#3  taskq_thread+0x9f
#4  proc_trampoline+0x1c
shared rwlock softnet r = 0 (0xffff800000030370)
#0  witness_lock+0x339
#1  taskq_thread+0x92
#2  proc_trampoline+0x1c
Process 28782 (softnet) thread 0xffff800026dc8000 (358228)
shared rwlock softnet r = 0 (0xffff800000030070)
#0  witness_lock+0x339
#1  taskq_thread+0x92
#2  proc_trampoline+0x1c


ddb{1}> show all pools
Name         Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
arp            64       14    0        0     1     0     1     1     0     8    0
plcache       128      120    0        0     4     0     4     4     0     8    0
rtpcb         120       21    0       20     1     0     1     1     0     8    0
rtentry       112       52    0        0     2     0     2     2     0     8    0
unpcb         120       66    0       18     2     0     2     2     0     8    0
tcpcb         736        7    0        2     1     0     1     1     0     8    0
inpcb         304      205    0      197     1     0     1     1     0     8    0
art_heap8    4096        1    0        0     1     0     1     1     0     8    0
art_heap4     256      120    0        0     8     0     8     8     0     8    0
art_table      32      121    0        0     1     0     1     1     0     8    0
art_node       16       52    0        0     1     0     1     1     0     8    0
dirhash      1024       87    0       40     6     0     6     6     0     8    0
newdirblk      32       16    0       16     1     1     0     1     0     8    0
dirrem         64     1646    0     1646    27    27     0    27     0     8    0
mkdir          56       20    0       20     1     1     0     1     0     8    0
diradd         56     1658    0     1657    23    22     1    23     0     8    0
freefile       48     1627    0     1627    20    20     0    20     0     8    0
freeblks      192     1647    0     1647    81    81     0    81     0     8    0
freefrag       64       15    0       15     2     2     0     1     0     8    0
allocindir    104    10941    0    10941   236   225    11   211     0     8   11
indirdep       56       16    0       16     1     0     1     1     0     8    1
allocdir      128     2755    0     2755    76    75     1    76     0     8    1
bmsafemap      64       46    0       46     2     2     0     1     0     8    0
newblk         64    13696    0    13696     3     3     0     1     0     8    0
inodedep      160     1709    0     1707    70    68     2    70     0     8    1
pagedep       128       32    0       31     1     0     1     1     0     8    0
dino1pl       128     5313    0     1633   121     2   119   119     0     8    0
ffsino        272     5313    0     1633   250     4   246   246     0     8    0
nchpl         144     5602    0     2398   123     4   119   119     0     8    0
uvmvnodes      72     5340    0        0    98     0    98    98     0     8    0
vnodes        224     5340    0        0   315     0   315   315     0     8    0
namei        1024    17921    0    17921     3     2     1     1     0     8    1
percpumem      96       31    0        0     1     0     1     1     0     8    0
ehcixfer      296      175    0      170     1     0     1     1     0     8    0
scxspl        216    35538    0    35538    22    21     1     8     0     8    1
plimitpl      152       25    0       12     1     0     1     1     0     8    0
sigapl        424      426    0      376     9     2     7     8     0     8    0
futexpl        56     5484    0     5484     2     2     0     1     0     8    0
knotepl       112       99    0       38     2     0     2     2     0     8    0
kqueuepl      168        9    0        0     1     0     1     1     0     8    0
pipepl        336      105    0      105     4     4     0     1     0     8    0
fdescpl       496      395    0      376     5     1     4     5     0     8    0
filepl        152     7300    0     7201     5     0     5     5     0     8    0
lockfpl       104        4    0        4     1     1     0     1     0     8    0
lockfspl       48        2    0        2     1     1     0     1     0     8    0
sessionpl     144       11    0        1     1     0     1     1     0     8    0
pgrppl         48       17    0        7     1     0     1     1     0     8    0
ucredpl        96       62    0       43     1     0     1     1     0     8    0
zombiepl      144      376    0      376     3     3     0     1     0     8    0
processpl    1080      426    0      376     5     0     5     5     0     8    0
procpl        672      487    0      437     7     1     6     6     0     8    0
sockpl        432      292    0      235     7     0     7     7     0     8    0
mcl12k      12288       25    0        0     3     0     3     3     0     8    0
mcl4k        4096        1    0        0     1     0     1     1     0     8    0
mcl2k2       2112      644    0        0    43     0    43    43     0     8    0
mcl2k        2048       21    0        0     3     0     3     3     0     8    0
mtagpl         96       17    0        0     1     0     1     1     0     8    0
mbufpl        256      683    0        0    43     0    43    43     0     8    0
bufpl         280    49145    0    10072  2793     1  2792  2792     0     8    0
anonpl         24   141669    0   136823   234    28   206   231     0  3034  161
amapchunkpl   152    10572    0    10249   106    32    74    99     0   158   56
amappl16      200      826    0      822    22    21     1    15     0     8    0
amappl15      192      120    0      109     1     0     1     1     0     8    0
amappl14      184        9    0        9     2     2     0     1     0     8    0
amappl13      176       33    0       32     1     0     1     1     0     8    0
amappl12      168       88    0       82     4     3     1     4     0     8    0
amappl11      160      103    0       75     2     0     2     2     0     8    0
amappl10      152       11    0       11     2     2     0     1     0     8    0
amappl9       144       26    0       26     1     1     0     1     0     8    0
amappl8       136      899    0      886    13    12     1    13     0     8    0
amappl7       128      108    0      107     1     0     1     1     0     8    0
amappl6       120      353    0      324     5     4     1     4     0     8    0
amappl5       112      190    0      173     3     2     1     3     0     8    0
amappl4       104     1741    0     1695    23    20     3    22     0     8    0
amappl3        96      561    0      546    10     9     1     7     0     8    0
amappl2        88     2974    0     2814    36    30     6    29     0     8    0
amappl1        80    10296    0     9610    20     2    18    19     0     8    0
amappl         88     3970    0     3831    21    15     6    20     0    92    0
dma16384    16384        3    0        3     1     1     0     1     0     8    0
dma4096      4096        7    0        1     1     0     1     1     0     8    0
dma2048      2048       26    0       26    11    10     1     1     0     8    1
dma1024      1024       22    0       22    11    10     1     1     0     8    1
dma512        512      269    0      269    11    10     1     1     0     8    1
dma256        256        7    0        7     1     1     0     1     0     8    0
dma128        128       64    0       64     1     1     0     1     0     8    0
dma64          64       12    0       12    11    10     1     1     0     8    1
dma32          32       12    0       12     1     1     0     1     0     8    0
dma16          16        1    0        1     1     1     0     1     0     8    0
aobjpl         64        2    0        0     1     0     1     1     0     8    0
uaddrrnd       24      395    0      376     1     0     1     1     0     8    0
uaddrbest      32        2    0        0     1     0     1     1     0     8    0
uaddr          24      395    0      376     1     0     1     1     0     8    0
vmmpekpl      168    11751    0    11728     2     0     2     2     0     8    0
vmmpepl       168    47352    0    45611   332    35   297   330     0   357  204
vmsppl        368      394    0      376     3     0     3     3     0     8    0
rwobjpl        56    16755    0    15640    70    49    21    70     0     8    0
pdppl        4096      798    0      752    86    40    46    72     0     8    0
pvpl           32   622565    0   611384  1291   821   470  1260     0   265  352
pmappl        232      394    0      376     2     0     2     2     0     8    0
extentpl       40      271    0      182     1     0     1     1     0     8    0
phpool        112      445    0       82    11     0    11    11     0     8    0



ddb{1}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
  5155  486169      1      0  3    0x100083  ttyin         ksh
 21708  285930      1      0  3    0x100098  poll          cron
 62392   40931  25406    720  3        0x90  kqread        lldpd
 25406   20242      1      0  3        0x80  netio         lldpd
  5100  287713  79347     95  3    0x100092  kqread        smtpd
 46028  334535  79347    103  3    0x100092  kqread        smtpd
 54804  375444  79347     95  3    0x100092  kqread        smtpd
  3303  249576  79347     95  3    0x100092  kqread        smtpd
 97819  271021  79347     95  3    0x100092  kqread        smtpd
  3054  297952  79347     95  3    0x100092  kqread        smtpd
 79347  446793      1      0  3    0x100080  kqread        smtpd
 12533  505387      1      0  3        0x80  select        sshd
 30114   19756      1      0  3    0x100080  poll          ntpd
  7481   97074  94918     83  3    0x100092  poll          ntpd
 94918  433415      1     83  3    0x100092  poll          ntpd
 32160  146866  10207     73  3    0x100090  kqread        syslogd
 10207  206610      1      0  3    0x100082  netio         syslogd
 57814   94760      0      0  3     0x14200  bored         smr
 17640  263297      0      0  3     0x14200  pgzero        zerothread
 65686  290981      0      0  3     0x14200  aiodoned      aiodoned
 63357  193771      0      0  3     0x14200  syncer        update
  2489  279176      0      0  3     0x14200  cleaner       cleaner
 43206  463626      0      0  3     0x14200  reaper        reaper
 12693   95421      0      0  3     0x14200  pgdaemon      pagedaemon
 85510  318479      0      0  3     0x14200  bored         crynlk
 49385   33947      0      0  3     0x14200  bored         crypto
  3669  424498      0      0  3     0x14200  usbtsk        usbtask
 16215  282320      0      0  3     0x14200  usbatsk       usbatsk
 72714  194617      0      0  3  0x40014200  acpi0         acpi0
 97166  158891      0      0  7  0x40014200                idle11
 78318  444428      0      0  7  0x40014200                idle10
 25961   75337      0      0  7  0x40014200                idle9
 11485  363757      0      0  7  0x40014200                idle8
 14568  345381      0      0  7  0x40014200                idle7
 24392   17964      0      0  7  0x40014200                idle6
 62703  151012      0      0  7  0x40014200                idle5
 14769   92091      0      0  3  0x40014200                idle4
 22913  159035      0      0  3  0x40014200                idle3
 38081   22603      0      0  3  0x40014200                idle2
 42810  217564      0      0  3  0x40014200                idle1
 29482   62812      0      0  3     0x14200  bored         sensors
*58019  211939      0      0  7     0x14200                softnet
 68166  173790      0      0  7     0x14200                softnet
 46127   45539      0      0  7     0x14200                softnet
 28782  358228      0      0  7     0x14200                softnet
 57025  242861      0      0  3     0x14200  bored         systqmp
 84257  166355      0      0  3     0x14200  bored         systq
  3048  213029      0      0  3  0x40014200  bored         softclock
 37973   15869      0      0  7  0x40014200                idle0
     1  164030      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper


ddb{1}> trace /p 0x58019

and the box froze.
