On 2/14/20 4:21 AM, Andrey V. Elsukov wrote:
On 13.02.2020 06:21, Rudy wrote:

I'm having issues with a box that is acting as a BGP router for my
network.  3 Chelsio cards, two T5 and one T6.  It was working great
until I turned up our first port on the T6.  It seems like traffic
passing in from a T5 card and out the T6 causes a really high load (and
high interrupts).

Traffic (not that much, right?)

      Dev  RX bps    TX bps    RX PPS    TX PPS    Error
      cc0       0         0         0         0        0
      cc1    2212 M       7 M     250 k       6 k      0   (100Gbps uplink, filtering inbound routes to keep TX low)
     cxl0     287 k    2015 M     353       244 k      0   (our network)
     cxl1     940 M    3115 M     176 k     360 k      0   (our network)
     cxl2     634 M    1014 M     103 k     128 k      0   (our network)
     cxl3       1 k      16 M       1         4 k      0
     cxl4       0         0         0         0        0
     cxl5       0         0         0         0        0
     cxl6    2343 M     791 M     275 k     137 k      0   (IX, part of lagg0)
     cxl7    1675 M     762 M     215 k     133 k      0   (IX, part of lagg0)
     ixl0     913 k      18 M       0         0        0
     ixl1       1 M      30 M       0         0        0
    lagg0    4019 M    1554 M     491 k     271 k      0
    lagg1       1 M      48 M       0         0        0
FreeBSD 12.1-STABLE  orange             976 Bytes/Packet avg
  1:42PM  up 13:25, 5 users, load averages: 9.38, 10.43, 9.82
Hi,

did you try to use pmcstat to determine what is the heaviest task for
your system?

# kldload hwpmc
# pmcstat -S inst_retired.any -Tw1


PMC: [inst_retired.any] Samples: 168557 (100.0%) , 2575 unresolved
Key: q => exiting...
%SAMP IMAGE      FUNCTION             CALLERS
 16.6 kernel     sched_idletd         fork_exit
 14.7 kernel     cpu_search_highest   cpu_search_highest:12.4 sched_switch:1.4 sched_idletd:0.9
 10.5 kernel     cpu_search_lowest    cpu_search_lowest:9.6 sched_pickcpu:0.9
  4.2 kernel     eth_tx               drain_ring
  3.4 kernel     rn_match             fib4_lookup_nh_basic
  2.4 kernel     lock_delay           __mtx_lock_sleep
  1.9 kernel     mac_ifnet_check_tran ether_output


Then capture several first lines from the output and quit using 'q'.
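
If the live top-mode display is awkward to copy, something along these lines should also work: log samples to a file for a short while and turn them into a callgraph offline (the paths and the 30-second window below are just examples):

# pmcstat -S inst_retired.any -O /tmp/samples.pmc sleep 30   # system-wide sampling while sleep runs
# pmcstat -R /tmp/samples.pmc -G /tmp/callgraph.txt          # post-process samples into a callgraph
# head -n 40 /tmp/callgraph.txt                              # the first lines are the hottest paths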

Do you use a firewall? Also, can you show a snapshot of the `top
-HPSIzts1` output?
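
If the interactive output is hard to capture, a batch snapshot should do as well; the delay, display count and output path here are only examples:

# top -b -HPSIzt -s 1 -d 2 > /tmp/top-snapshot.txt   # two displays, 1 s apart, written to a file

The second display is usually the one worth looking at, since it reflects activity over the sampling interval rather than the initial estimate.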


last pid: 28863;  load averages:  9.30, 10.33, 10.56                                up 0+14:16:08  14:53:23
817 threads:   25 running, 586 sleeping, 206 waiting
CPU 0:   0.8% user,  0.0% nice,  6.2% system,  0.0% interrupt, 93.0% idle
CPU 1:   2.4% user,  0.0% nice,  0.0% system,  7.9% interrupt, 89.8% idle
CPU 2:   0.0% user,  0.0% nice,  0.8% system,  7.1% interrupt, 92.1% idle
CPU 3:   1.6% user,  0.0% nice,  0.0% system, 10.2% interrupt, 88.2% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  9.4% interrupt, 90.6% idle
CPU 5:   0.8% user,  0.0% nice,  0.8% system, 20.5% interrupt, 78.0% idle
CPU 6:   1.6% user,  0.0% nice,  0.0% system,  5.5% interrupt, 92.9% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  3.1% interrupt, 96.9% idle
CPU 8:   0.8% user,  0.0% nice,  0.8% system,  7.1% interrupt, 91.3% idle
CPU 9:   0.0% user,  0.0% nice,  0.8% system,  9.4% interrupt, 89.8% idle
CPU 10:  0.0% user,  0.0% nice,  0.0% system, 35.4% interrupt, 64.6% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system, 36.2% interrupt, 63.8% idle
CPU 12:  0.0% user,  0.0% nice,  0.0% system, 38.6% interrupt, 61.4% idle
CPU 13:  0.0% user,  0.0% nice,  0.0% system, 49.6% interrupt, 50.4% idle
CPU 14:  0.0% user,  0.0% nice,  0.0% system, 46.5% interrupt, 53.5% idle
CPU 15:  0.0% user,  0.0% nice,  0.0% system, 32.3% interrupt, 67.7% idle
CPU 16:  0.0% user,  0.0% nice,  0.0% system, 46.5% interrupt, 53.5% idle
CPU 17:  0.0% user,  0.0% nice,  0.0% system, 56.7% interrupt, 43.3% idle
CPU 18:  0.0% user,  0.0% nice,  0.0% system, 31.5% interrupt, 68.5% idle
CPU 19:  0.0% user,  0.0% nice,  0.8% system, 34.6% interrupt, 64.6% idle
Mem: 636M Active, 1159M Inact, 5578M Wired, 24G Free
ARC: 1430M Total, 327M MFU, 589M MRU, 32K Anon, 13M Header, 502M Other
     268M Compressed, 672M Uncompressed, 2.51:1 Ratio
Swap: 4096M Total, 4096M Free

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
   12 root        -92    -     0B  3376K WAIT    13  41:13  12.86% intr{irq358: t5nex0:2a1}
   12 root        -92    -     0B  3376K WAIT    12  48:08  12.77% intr{irq347: t5nex0:1a6}
   12 root        -92    -     0B  3376K CPU13   13  47:40  11.96% intr{irq348: t5nex0:1a7}
   12 root        -92    -     0B  3376K WAIT    17  43:46  11.38% intr{irq342: t5nex0:1a1}
   12 root        -92    -     0B  3376K WAIT    14  29:17  10.70% intr{irq369: t5nex0:2ac}
   12 root        -92    -     0B  3376K WAIT    11  47:55   9.85% intr{irq428: t5nex1:2a5}
   12 root        -92    -     0B  3376K WAIT    16  46:11   9.22% intr{irq351: t5nex0:1aa}
   12 root        -92    -     0B  3376K WAIT    19  42:28   9.04% intr{irq344: t5nex0:1a3}
   12 root        -92    -     0B  3376K WAIT    16  46:45   8.82% intr{irq341: t5nex0:1a0}
   12 root        -92    -     0B  3376K RUN     11  48:04   8.33% intr{irq356: t5nex0:1af}
   12 root        -92    -     0B  3376K WAIT    10  46:24   8.32% intr{irq355: t5nex0:1ae}
   12 root        -92    -     0B  3376K WAIT    10  42:03   8.32% intr{irq345: t5nex0:1a4}
   12 root        -92    -     0B  3376K WAIT    14  36:34   8.29% intr{irq441: t5nex1:3a2}
   12 root        -92    -     0B  3376K WAIT    19  46:14   8.21% intr{irq354: t5nex0:1ad}
   12 root        -92    -     0B  3376K WAIT    14  47:29   8.13% intr{irq349: t5nex0:1a8}
   12 root        -92    -     0B  3376K WAIT    11  40:25   7.91% intr{irq346: t5nex0:1a5}
   12 root        -92    -     0B  3376K WAIT    15  49:33   7.62% intr{irq350: t5nex0:1a9}
   12 root        -92    -     0B  3376K WAIT     5  45:37   7.57% intr{irq322: t6nex0:1af}
   12 root        -92    -     0B  3376K WAIT    18  45:41   7.43% intr{irq353: t5nex0:1ac}
   12 root        -92    -     0B  3376K WAIT    17  36:43   7.34% intr{irq434: t5nex1:2ab}
   12 root        -92    -     0B  3376K WAIT    17  33:30   7.11% intr{irq424: t5nex1:2a1}
   12 root        -92    -     0B  3376K WAIT     4  31:43   7.02% intr{irq312: t6nex0:1a5}
   12 root        -92    -     0B  3376K WAIT    16  35:01   6.95% intr{irq433: t5nex1:2aa}
   12 root        -92    -     0B  3376K WAIT    17  47:03   6.84% intr{irq352: t5nex0:1ab}
   12 root        -92    -     0B  3376K WAIT    18  41:33   6.73% intr{irq343: t5nex0:1a2}
   12 root        -92    -     0B  3376K WAIT     9  37:02   6.42% intr{irq317: t6nex0:1aa}
   12 root        -92    -     0B  3376K WAIT    10  32:22   6.40% intr{irq427: t5nex1:2a4}




Thanks.  I did change the chelsio_affinity settings today so the cards bind their IRQs to CPU cores in the same NUMA domain.  Still, load seems a bit high when traffic uses the T6 card compared to just the T5 cards.
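
For reference, that kind of IRQ pinning can be done by hand with cpuset(1); the IRQ number and CPU list below are only placeholders, not the exact values used here:

# vmstat -i | grep t5nex     # list the NIC's interrupt vectors and their rates
# cpuset -l 0-9 -x 358       # bind irq358 to CPUs 0-9 (the domain-0 cores in this example)
# cpuset -g -x 358           # verify the new CPU mask for that IRQ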

