Re: em(4) multiqueue
On Fri, Apr 14, 2023 at 10:26:14AM +0800, Kevin Lo wrote:
> On Thu, Apr 13, 2023 at 01:30:36PM -0500, Brian Conway wrote:
> > Reviving this thread, apologies for discontinuity in mail readers:
> > https://marc.info/?t=16564219358
> >
> > After rebasing on 7.3, my results have mirrored Hrvoje's testing at
> > the end of that thread. No issues with throughput, unusual latency,
> > or reliability. `vmstat -i` shows some level of balancing between
> > the queues. I've been testing on as many em(4) systems as I have
> > access to, some manually, some in packet forwarder/firewall
> > scenarios:
>
> Last time I tested (about a year ago) on I211, rx locked up if I tried
> something like iperf3 or tcpbench. Don't know if you have a similar
> problem.

I rebased the rest to current and tested it with tcpbench between the
following interfaces:

em0 at pci7 dev 0 function 0 "Intel 82580" rev 0x01, msix, 4 queues, address 90:e2:ba:df:d5:2c
em0 at pci5 dev 0 function 0 "Intel I350" rev 0x01, msix, 8 queues, address 00:25:90:eb:b3:c2

After a second the connection stalled. As far as I can see, the sending
side has the problem.

ot45# tcpbench 192.168.99.3
  elapsed_ms          bytes         mbps        bwidth
        1012       14574120      115.210       100.00%
Conn:   1 Mbps:      115.210 Peak Mbps:      115.210 Avg Mbps:      115.210
        2022              0        0.000         -nan%
...

ot46# tcpbench -s
  elapsed_ms          bytes         mbps        bwidth
        1017       14313480      112.594       100.00%
Conn:   1 Mbps:      112.594 Peak Mbps:      112.594 Avg Mbps:      112.594
        2027              0        0.000         -nan%
...

ot45# netstat -nf inet -p tcp
Active Internet connections
Proto   Recv-Q Send-Q  Local Address          Foreign Address        TCP-State
tcp          0 260640  192.168.99.1.18530     192.168.99.3.12345     CLOSING

When I retried, it sometimes worked and most times did not.
kstat tells me that transmit queues 1 to 3 are oactive and only queue 0
works:

em0:0:txq:0
    packets: 4042648 packets
      bytes: 5310138322 bytes
     qdrops: 9 packets
     errors: 0 packets
       qlen: 0 packets
    maxqlen: 511 packets
    oactive: false
em0:0:txq:1
    packets: 9812 packets
      bytes: 14846716 bytes
     qdrops: 0 packets
     errors: 0 packets
       qlen: 184 packets
    maxqlen: 511 packets
    oactive: true
em0:0:txq:2
    packets: 690362 packets
      bytes: 60011484 bytes
     qdrops: 0 packets
     errors: 0 packets
       qlen: 185 packets
    maxqlen: 511 packets
    oactive: true
em0:0:txq:3
    packets: 443181 packets
      bytes: 43829886 bytes
     qdrops: 0 packets
     errors: 0 packets
       qlen: 198 packets
    maxqlen: 511 packets
    oactive: true

This is the rebased diff on current I tested:

Index: dev/pci/files.pci
===================================================================
RCS file: /cvs/src/sys/dev/pci/files.pci,v
retrieving revision 1.361
diff -u -p -r1.361 files.pci
--- dev/pci/files.pci	23 Apr 2023 00:20:26 -0000	1.361
+++ dev/pci/files.pci	25 Apr 2023 11:25:47 -0000
@@ -334,7 +334,7 @@ attach fxp at pci with fxp_pci
 file	dev/pci/if_fxp_pci.c	fxp_pci
 
 # Intel Pro/1000
-device	em: ether, ifnet, ifmedia
+device	em: ether, ifnet, ifmedia, intrmap, stoeplitz
 attach	em at pci
 file	dev/pci/if_em.c		em
 file	dev/pci/if_em_hw.c	em
Index: dev/pci/if_em.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.365
diff -u -p -r1.365 if_em.c
--- dev/pci/if_em.c	9 Feb 2023 21:21:27 -0000	1.365
+++ dev/pci/if_em.c	25 Apr 2023 11:25:47 -0000
@@ -247,6 +247,7 @@ int  em_intr(void *);
 int  em_allocate_legacy(struct em_softc *);
 void em_start(struct ifqueue *);
 int  em_ioctl(struct ifnet *, u_long, caddr_t);
+int  em_rxrinfo(struct em_softc *, struct if_rxrinfo *);
 void em_watchdog(struct ifnet *);
 void em_init(void *);
 void em_stop(void *, int);
@@ -309,8 +310,10 @@ int  em_setup_queues_msix(struct em_soft
 int  em_queue_intr_msix(void *);
 int  em_link_intr_msix(void *);
 void em_enable_queue_intr_msix(struct em_queue *);
+void em_setup_rss(struct em_softc *);
 #else
 #define em_allocate_msix(_sc)	(-1)
+#define em_setup_rss(_sc)	0
 #endif
 
 #if NKSTAT > 0
@@ -333,7 +336,6 @@ struct cfdriver em_cd = {
 };
 
 static int em_smart_pwr_down = FALSE;
-int em_enable_msix = 0;
 
 /*
  * Device identification routine
@@ -629,12 +631,12 @@ err_pci:
 void
 em_start(struct ifqueue *ifq)
 {
+	struct em_queue *que = ifq->ifq_softc;
 	struct ifnet *ifp = ifq->ifq_if;
 	struct em_softc *sc = ifp->if_softc;
 	u_int head, free, used;
 	struct mbuf *m;
 	int post = 0;
-	struct em_que
Re: em(4) multiqueue
On Thu, Apr 13, 2023 at 01:30:36PM -0500, Brian Conway wrote:
>
> Reviving this thread, apologies for discontinuity in mail readers:
> https://marc.info/?t=16564219358
>
> After rebasing on 7.3, my results have mirrored Hrvoje's testing at the end
> of that thread. No issues with throughput, unusual latency, or reliability.
> `vmstat -i` shows some level of balancing between the queues. I've been
> testing on as many em(4) systems as I have access to, some manually, some in
> packet forwarder/firewall scenarios:
>
> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em3 at pci4 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em4 at pci5 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em5 at pci6 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>
> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
>
> em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
> em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
> em2 at pci3 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...

Last time I tested (about a year ago) on I211, rx locked up if I tried
something like iperf3 or tcpbench. Don't know if you have a similar
problem.

> em0 at pci1 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 68:05:ca:...
>
> The only questions I have are around queue identification. All the specs I've
> been able to find indicate the I210 should have 4 queues; did Intel make a
> cheaper version with 2 toward the end of production? Or could it be an I211
> masquerading as an I210 (and would that be bad for the driver)?
>
> Also,
> https://www.mouser.com/pdfdocs/Intel_82574L_82574IT_GbE_Controller_brief.pdf
> indicates that the 82574L should have 2 queues?
>
> Anyway, great work, please let me know if there's more I can do to help this
> move forward.
>
> Brian Conway
> Lead Software Engineer, Owner
> RCE Software, LLC
Re: em(4) multiqueue
On 2023/04/13 16:45, Sonic wrote:
> Is this multiqueue support in 7.3 or does it require patching?
> According to Intel the i211 should have 2 queues but I see no msi-x
> support in dmesg:
> em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03: msi, address

It is not committed; there's a diff.
Re: em(4) multiqueue
Is this multiqueue support in 7.3 or does it require patching?
According to Intel the i211 should have 2 queues but I see no msi-x
support in dmesg:

em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03: msi, address

Thanks.

Chris
Re: em(4) multiqueue
On Thu, Apr 13, 2023, at 2:45 PM, Stuart Henderson wrote:
> On 2023/04/13 13:30, Brian Conway wrote:
>> Reviving this thread, apologies for discontinuity in mail readers:
>> https://marc.info/?t=16564219358
>>
>> After rebasing on 7.3, my results have mirrored Hrvoje's testing at the end
>> of that thread. No issues with throughput, unusual latency, or reliability.
>> `vmstat -i` shows some level of balancing between the queues. I've been
>> testing on as many em(4) systems as I have access to, some manually, some in
>> packet forwarder/firewall scenarios:
>>
>> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>> em3 at pci4 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>> em4 at pci5 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>> em5 at pci6 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>>
>> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
>> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
>> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
>>
>> em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
>> em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
>> em2 at pci3 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
>>
>> em0 at pci1 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 68:05:ca:...
>>
>> The only questions I have are around queue identification. All the specs
>> I've been able to find indicate the I210 should have 4 queues; did Intel
>> make a cheaper version with 2 toward the end of production? Or could it be
>> an I211 masquerading as an I210 (and would that be bad for the driver)?
>
> Is it a 2-cpu machine?

Ah, you're right. The level of detail I provided was insufficient.

>> Also,
>> https://www.mouser.com/pdfdocs/Intel_82574L_82574IT_GbE_Controller_brief.pdf
>> indicates that the 82574L should have 2 queues?
>
> No msix in your dmesg excerpt for that one

I'll lug that one back out and take a look. Probably safe to assume a
misunderstanding on my part. Thanks.

-b
Re: em(4) multiqueue
On 2023/04/13 13:30, Brian Conway wrote:
> Reviving this thread, apologies for discontinuity in mail readers:
> https://marc.info/?t=16564219358
>
> After rebasing on 7.3, my results have mirrored Hrvoje's testing at the end
> of that thread. No issues with throughput, unusual latency, or reliability.
> `vmstat -i` shows some level of balancing between the queues. I've been
> testing on as many em(4) systems as I have access to, some manually, some in
> packet forwarder/firewall scenarios:
>
> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em3 at pci4 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em4 at pci5 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
> em5 at pci6 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
>
> em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
> em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
> em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
>
> em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
> em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
> em2 at pci3 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
>
> em0 at pci1 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 68:05:ca:...
>
> The only questions I have are around queue identification. All the specs I've
> been able to find indicate the I210 should have 4 queues; did Intel make a
> cheaper version with 2 toward the end of production? Or could it be an I211
> masquerading as an I210 (and would that be bad for the driver)?

Is it a 2-cpu machine?

> Also,
> https://www.mouser.com/pdfdocs/Intel_82574L_82574IT_GbE_Controller_brief.pdf
> indicates that the 82574L should have 2 queues?

No msix in your dmesg excerpt for that one
Re: em(4) multiqueue
Reviving this thread, apologies for discontinuity in mail readers:
https://marc.info/?t=16564219358

After rebasing on 7.3, my results have mirrored Hrvoje's testing at the end
of that thread. No issues with throughput, unusual latency, or reliability.
`vmstat -i` shows some level of balancing between the queues. I've been
testing on as many em(4) systems as I have access to, some manually, some in
packet forwarder/firewall scenarios:

em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
em3 at pci4 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
em4 at pci5 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...
em5 at pci6 dev 0 function 0 "Intel I210" rev 0x03, msix, 2 queues, address 00:f1:f3:...

em0 at pci1 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
em1 at pci2 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...
em2 at pci3 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues, address 00:0d:b9:...

em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...
em2 at pci3 dev 0 function 0 "Intel I211" rev 0x03, msix, 2 queues, address 00:0d:b9:...

em0 at pci1 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 68:05:ca:...

The only questions I have are around queue identification. All the specs I've
been able to find indicate the I210 should have 4 queues; did Intel make a
cheaper version with 2 toward the end of production? Or could it be an I211
masquerading as an I210 (and would that be bad for the driver)?

Also,
https://www.mouser.com/pdfdocs/Intel_82574L_82574IT_GbE_Controller_brief.pdf
indicates that the 82574L should have 2 queues?

Anyway, great work, please let me know if there's more I can do to help this
move forward.

Brian Conway
Lead Software Engineer, Owner
RCE Software, LLC
Re: em(4) multiqueue
On 15.8.2022. 20:51, Hrvoje Popovski wrote:
> On 12.8.2022. 22:15, Hrvoje Popovski wrote:
>> Hi,
>>
>> I'm testing forwarding over
>>
>> em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
>> em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
>> em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
>> em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
>> em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
>> em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,
>
> I've managed to get linux pktgen to send traffic on all 6 em interfaces
> at the same time, and the box seems to work just fine. Some systat, vmstat
> and kstat details in attachment while traffic is flowing over that box.

Hi,

after 95 days in production with this diff and I350, everything works as
expected. I'm sending this because it's time to upgrade :)
Is it maybe time to put this diff in?

ix0 at pci5 dev 0 function 0 "Intel X540T" rev 0x01, msix, 8 queues, address a0:36:9f:29:f3:28
ix1 at pci5 dev 0 function 1 "Intel X540T" rev 0x01, msix, 8 queues, address a0:36:9f:29:f3:2a
em0 at pci6 dev 0 function 0 "Intel I350" rev 0x01, msix, 8 queues, address ac:1f:6b:14:bd:b2
em1 at pci6 dev 0 function 1 "Intel I350" rev 0x01, msix, 8 queues, address ac:1f:6b:14:bd:b3

fw2# uptime
 6:34PM  up 95 days, 19:26, 1 user, load averages: 0.00, 0.00, 0.00

fw2# vmstat -i
interrupt                       total     rate
irq0/clock                 6622294171      799
irq0/ipi                   8263089839      998
irq96/acpi0                         1        0
irq114/ix0:0                514761687       62
irq115/ix0:1                510189468       61
irq116/ix0:2                522691117       63
irq117/ix0:3                531638415       64
irq118/ix0:4                534116996       64
irq119/ix0:5                511162669       61
irq120/ix0:6                535267806       64
irq121/ix0:7                519707637       62
irq122/ix0                          2        0
irq99/xhci0                        68        0
irq100/ehci0                       19        0
irq132/em0:0                498689640       60
irq133/em0:1                516744073       62
irq134/em0:2                520784714       62
irq135/em0:3                512596405       61
irq136/em0:4                521988376       63
irq137/em0:5                513939246       62
irq138/em0:6                517184525       62
irq139/em0:7                509781661       61
irq140/em0                          2        0
irq141/em1:0                216273893       26
irq143/em1:2                283094667       34
irq148/em1:5                        2        0
irq151/em1                         18        0
irq100/ehci1                       19        0
irq103/ahci0                  5049068        0
Total                     23681046204     2860
Re: em(4) multiqueue
On 12.8.2022. 22:15, Hrvoje Popovski wrote:
> Hi,
>
> I'm testing forwarding over
>
> em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
> em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
> em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
> em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
> em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
> em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,

I've managed to get linux pktgen to send traffic on all 6 em interfaces
at the same time, and the box seems to work just fine. Some systat, vmstat
and kstat details in attachment while traffic is flowing over that box.

irq124/em0:0                701061017     7450
irq125/em0:1                700477475     7444
irq126/em0:2                700518530     7445
irq127/em0:3                700477219     7444
irq128/em0                         12        0
irq129/em1:0                702693602     7468
irq130/em1:1                702621154     7467
irq131/em1:2                702638755     7467
irq132/em1:3                702619278     7467
irq133/em1                          8        0
irq134/em2:0                700792107     7448
irq135/em2:1                685857158     7289
irq136/em2:2                685987301     7290
irq137/em2:3                685853293     7289
irq138/em2                         12        0
irq139/em3:0                702784432     7469
irq140/em3:1                702673600     7468
irq141/em3:2                702692900     7468
irq142/em3:3                702670362     7468
irq143/em3                          8        0
irq146/em4:0                691767956     7352
irq147/em4:1                687629590     7308
irq148/em4:2                687675100     7308
irq149/em4:3                687627987     7308
irq150/em4                         12        0
irq151/em5:0                702655585     7467
irq152/em5:1                702482994     7466
irq153/em5:2                702502382     7466
irq154/em5:3                702481315     7466
irq155/em5                          8        0

NAME     LEN   IDLE  NGC  CPU  REQ  REL  LREQ  LREL
knotepl  80 140 8680 87074 6 1 4181 42310 6 2 3659 36880 3 3 3409 34430 3
mbufpl   628* 64230 31819767309 36433364464 19452409 46897323 1 32515621331 29647742759 38355243 15286334 2 32477451749 32774718213 25059389 35077344 3 32535508383 30493295832 36129475 21741989
mcl12k   8000000 0 1000 0 2000 0 3000 0
mcl16k   8000000 0 1000 0 2000 0 3000 0
mcl2k8   5200 21370105468738 74153346 3915988 1562 1650734300439537696 26402919 3343 2617420319419639380 24727419 4801 3665185877 1105501145 26117590 81156997
mcl2k2   601* 61400 31683334392 36327982542 19966506 48295824 1 31678859095 29022314473 38321812 16928852 2 31682996160 32178017835 24908639 37514119 3 31678642505 29196245479 37458168 17922524
mcl4k    8000000 0 1000 0 2000 0 3000 0
mcl64k   8000000 0 1000 0 2
Re: em(4) multiqueue
On 28.6.2022. 15:11, Jonathan Matthew wrote:
> This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
> Note that desktop/laptop models (I218, I219 etc.) do not support multiple queues,
> so this only really applies to servers and network appliances (including APU2).
>
> It also removes the 'em_enable_msix' variable, in favour of using MSI-X on devices
> that support multiple queues and MSI or INTX everywhere else.
>
> I've tested this with an I350 on amd64 and arm64, where it works as expected, and
> with the I218-LM in my laptop where it does nothing (as expected).
> More testing is welcome, especially in forwarding environments.

Hi,

I'm testing forwarding over

em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,

and it seems that plain forwarding works as expected. I'm sending traffic
from em0 to em1, from em2 to em3 and from em4 to em5; em6 is for ssh ...

irq124/em0:0                  1233974     1316
irq125/em0:1                  1233943     1316
irq126/em0:2                  1233942     1316
irq127/em0:3                  1233944     1316
irq128/em0                          2        0
irq129/em1:0                  1021586     1090
irq132/em1:3                        2        0
irq133/em1                          4        0
irq98/xhci0                        94        0
irq99/ehci0                        19        0
irq134/em2:0                   466894      498
irq135/em2:1                   466846      498
irq136/em2:2                   466846      498
irq137/em2:3                   466846      498
irq138/em2                          2        0
irq139/em3:0                   467019      498
irq143/em3                          2        0
irq146/em4:0                  1192252     1272
irq147/em4:1                  1192213     1272
irq148/em4:2                  1192211     1272
irq149/em4:3                  1192212     1272
irq150/em4                          2        0
irq151/em5:0                  1192354     1272
irq155/em5                          2        0
irq156/em6:0                     2936        3
irq157/em6:1                       84        0
irq158/em6:2                       32        0
irq159/em6:3                       30        0
irq160/em6                          2        0

OpenBSD 7.2-beta (GENERIC.MP) #0: Fri Aug 12 12:50:45 CEST 2022
    r...@smc4.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 17052663808 (16262MB)
avail mem = 16518463488 (15753MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xed9b0 (48 entries)
bios0: vendor American Megatrends Inc. version "2.3" date 05/07/2021
bios0: Supermicro Super Server
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG UEFI DBG2 HPET WDDT SSDT SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices IP2P(S4) EHC1(S4) EHC2(S4) RP07(S4) RP08(S4) BR1A(S4) BR1B(S4) BR2A(S4) BR2B(S4) BR2C(S4) BR2D(S4) BR3A(S4) BR3B(S4) BR3C(S4) BR3D(S4) RP01(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.34 MHz, 06-56-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.01 MHz, 06-56-03
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,M
Re: em(4) multiqueue
> On 2 Jul 2022, at 08:44, Hrvoje Popovski wrote:
>
> On 28.6.2022. 15:11, Jonathan Matthew wrote:
>> This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
>> Note that desktop/laptop models (I218, I219 etc.) do not support multiple queues,
>> so this only really applies to servers and network appliances (including APU2).
>>
>> It also removes the 'em_enable_msix' variable, in favour of using MSI-X on devices
>> that support multiple queues and MSI or INTX everywhere else.
>>
>> I've tested this with an I350 on amd64 and arm64, where it works as expected, and
>> with the I218-LM in my laptop where it does nothing (as expected).
>> More testing is welcome, especially in forwarding environments.
>
> Hi,
>
> I'm testing this diff in forwarding setup where source is 10.113.0/24
> connected to em2 and destination is 10.114.0/24 connected to em3. I'm
> doing random source and destination per ip.
>
> dmesg:
> em2 at pci6 dev 0 function 2 "Intel I350" rev 0x01, msix, 8 queues
> em3 at pci6 dev 0 function 3 "Intel I350" rev 0x01, msix, 8 queues
>
> netstat:
> 10.113.0/24   192.168.113.11   UGS   0           0   -   8   em2
> 10.114.0/24   192.168.114.11   UGS   0   404056853   -   8   em3
>
> ifconfig:
> em2: flags=8843 mtu 1500
>         lladdr 40:f2:e9:ec:b4:14
>         index 5 priority 0 llprio 3
>         media: Ethernet autoselect (1000baseT full-duplex,master)
>         status: active
>         inet 192.168.113.1 netmask 0xff00 broadcast 192.168.113.255
> em3: flags=8843 mtu 1500
>         lladdr 40:f2:e9:ec:b4:15
>         index 6 priority 0 llprio 3
>         media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
>         status: active
>         inet 192.168.114.1 netmask 0xff00 broadcast 192.168.114.255
>
> with vmstat -i
> irq160/em2:0   4740972   3538
> irq161/em2:1   4740979   3538
> irq162/em2:2   4740977   3538
> irq163/em2:3   4740978   3538
> irq164/em2:4   4740965   3538
> irq165/em2:5   4740972   3538
> irq166/em2:6   4740971   3538
> irq167/em2:7   4740965   3538
> irq168/em2           2      0
> irq169/em3:0   4741258   3538
> irq177/em3           2      0
>
> should I see 8 queues on em3 as on em2 ?

em(4) isn't populating the mbuf flowid field with the rss hash value the
chip calculates when it receives packets, so there's no flow identifier
for the network stack to use to assign packets to output queues on the
way out. this means everything lands on the default (0th) queue.

cheers,
dlg

> x3550m4# tcpdump -ni em3
> tcpdump: listening on em3, link-type EN10MB
> 00:39:26.663617 10.113.0.230.9 > 10.114.0.154.9: udp 18
> 00:39:26.663618 10.113.0.176.9 > 10.114.0.3.9: udp 18
> 00:39:26.663619 10.113.0.37.9 > 10.114.0.7.9: udp 18
> 00:39:26.663620 10.113.0.200.9 > 10.114.0.197.9: udp 18
> 00:39:26.663620 10.113.0.37.9 > 10.114.0.230.9: udp 18
> 00:39:26.663621 10.113.0.95.9 > 10.114.0.216.9: udp 18
> 00:39:26.663622 10.113.0.8.9 > 10.114.0.187.9: udp 18
> 00:39:26.663623 10.113.0.56.9 > 10.114.0.107.9: udp 18
> 00:39:26.663624 10.113.0.4.9 > 10.114.0.39.9: udp 18
> 00:39:26.663624 10.113.0.244.9 > 10.114.0.188.9: udp 18
> 00:39:26.663625 10.113.0.166.9 > 10.114.0.15.9: udp 18
> 00:39:26.663626 10.113.0.7.9 > 10.114.0.78.9: udp 18
> 00:39:26.663627 10.113.0.147.9 > 10.114.0.202.9: udp 18
> 00:39:26.663628 10.113.0.144.9 > 10.114.0.184.9: udp 18
> 00:39:26.663628 10.113.0.221.9 > 10.114.0.100.9: udp 18
> 00:39:26.663630 10.113.0.69.9 > 10.114.0.231.9: udp 18
> 00:39:26.663648 10.113.0.71.9 > 10.114.0.64.9: udp 18
>
> vmstat -iz
> irq160/em2:0   4740972   3501
> irq161/em2:1   4740979   3501
> irq162/em2:2   4740977   3501
> irq163/em2:3   4740978   3501
> irq164/em2:4   4740965   3501
> irq165/em2:5   4740972   3501
> irq166/em2:6   4740971   3501
> irq167/em2:7   4740965   3501
> irq168/em2           2      0
> irq169/em3:0   4741258   3501
> irq170/em3:1         0      0
> irq171/em3:2         0      0
> irq172/em3:3         0      0
> irq173/em3:4         0      0
> irq174/em3:5         0      0
> irq175/em3:6         0      0
> irq176/em3:7         0      0
> irq177/em3           2      0
Re: em(4) multiqueue
On 28.6.2022. 15:11, Jonathan Matthew wrote:
> This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
> Note that desktop/laptop models (I218, I219 etc.) do not support multiple queues,
> so this only really applies to servers and network appliances (including APU2).
>
> It also removes the 'em_enable_msix' variable, in favour of using MSI-X on devices
> that support multiple queues and MSI or INTX everywhere else.
>
> I've tested this with an I350 on amd64 and arm64, where it works as expected, and
> with the I218-LM in my laptop where it does nothing (as expected).
> More testing is welcome, especially in forwarding environments.

Hi,

I'm testing this diff in forwarding setup where source is 10.113.0/24
connected to em2 and destination is 10.114.0/24 connected to em3. I'm
doing random source and destination per ip.

dmesg:
em2 at pci6 dev 0 function 2 "Intel I350" rev 0x01, msix, 8 queues
em3 at pci6 dev 0 function 3 "Intel I350" rev 0x01, msix, 8 queues

netstat:
10.113.0/24   192.168.113.11   UGS   0           0   -   8   em2
10.114.0/24   192.168.114.11   UGS   0   404056853   -   8   em3

ifconfig:
em2: flags=8843 mtu 1500
        lladdr 40:f2:e9:ec:b4:14
        index 5 priority 0 llprio 3
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
        inet 192.168.113.1 netmask 0xff00 broadcast 192.168.113.255
em3: flags=8843 mtu 1500
        lladdr 40:f2:e9:ec:b4:15
        index 6 priority 0 llprio 3
        media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
        status: active
        inet 192.168.114.1 netmask 0xff00 broadcast 192.168.114.255

with vmstat -i
irq160/em2:0   4740972   3538
irq161/em2:1   4740979   3538
irq162/em2:2   4740977   3538
irq163/em2:3   4740978   3538
irq164/em2:4   4740965   3538
irq165/em2:5   4740972   3538
irq166/em2:6   4740971   3538
irq167/em2:7   4740965   3538
irq168/em2           2      0
irq169/em3:0   4741258   3538
irq177/em3           2      0

should I see 8 queues on em3 as on em2 ?

x3550m4# tcpdump -ni em3
tcpdump: listening on em3, link-type EN10MB
00:39:26.663617 10.113.0.230.9 > 10.114.0.154.9: udp 18
00:39:26.663618 10.113.0.176.9 > 10.114.0.3.9: udp 18
00:39:26.663619 10.113.0.37.9 > 10.114.0.7.9: udp 18
00:39:26.663620 10.113.0.200.9 > 10.114.0.197.9: udp 18
00:39:26.663620 10.113.0.37.9 > 10.114.0.230.9: udp 18
00:39:26.663621 10.113.0.95.9 > 10.114.0.216.9: udp 18
00:39:26.663622 10.113.0.8.9 > 10.114.0.187.9: udp 18
00:39:26.663623 10.113.0.56.9 > 10.114.0.107.9: udp 18
00:39:26.663624 10.113.0.4.9 > 10.114.0.39.9: udp 18
00:39:26.663624 10.113.0.244.9 > 10.114.0.188.9: udp 18
00:39:26.663625 10.113.0.166.9 > 10.114.0.15.9: udp 18
00:39:26.663626 10.113.0.7.9 > 10.114.0.78.9: udp 18
00:39:26.663627 10.113.0.147.9 > 10.114.0.202.9: udp 18
00:39:26.663628 10.113.0.144.9 > 10.114.0.184.9: udp 18
00:39:26.663628 10.113.0.221.9 > 10.114.0.100.9: udp 18
00:39:26.663630 10.113.0.69.9 > 10.114.0.231.9: udp 18
00:39:26.663648 10.113.0.71.9 > 10.114.0.64.9: udp 18

vmstat -iz
irq160/em2:0   4740972   3501
irq161/em2:1   4740979   3501
irq162/em2:2   4740977   3501
irq163/em2:3   4740978   3501
irq164/em2:4   4740965   3501
irq165/em2:5   4740972   3501
irq166/em2:6   4740971   3501
irq167/em2:7   4740965   3501
irq168/em2           2      0
irq169/em3:0   4741258   3501
irq170/em3:1         0      0
irq171/em3:2         0      0
irq172/em3:3         0      0
irq173/em3:4         0      0
irq174/em3:5         0      0
irq175/em3:6         0      0
irq176/em3:7         0      0
irq177/em3           2      0
Re: em(4) multiqueue
On 2022/06/29 13:19, Stuart Henderson wrote:
> On 2022/06/28 23:11, Jonathan Matthew wrote:
> > This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
> > Note that desktop/laptop models (I218, I219 etc.) do not support multiple queues,
> > so this only really applies to servers and network appliances (including APU2).
> >
> > It also removes the 'em_enable_msix' variable, in favour of using MSI-X on devices
> > that support multiple queues and MSI or INTX everywhere else.
> >
> > I've tested this with an I350 on amd64 and arm64, where it works as expected, and
> > with the I218-LM in my laptop where it does nothing (as expected).
> > More testing is welcome, especially in forwarding environments.
>
> Doesn't break things but doesn't do anything on i386 (I guess there's no
> MSI-X?) On amd64 on a similar machine, it works

I guess it may be related to this in ppb.c

#ifdef __i386__
	if (pci_intr_map(pa, &ih) == 0)
		sc->sc_intrhand = pci_intr_establish(pc, ih, IPL_BIO,
		    ppb_intr, sc, self->dv_xname);
#else
	if (pci_intr_map_msi(pa, &ih) == 0 || pci_intr_map(pa, &ih) == 0)
		sc->sc_intrhand = pci_intr_establish(pc, ih, IPL_BIO,
		    ppb_intr, sc, self->dv_xname);
#endif