Hi All,
I'm really struggling to get on top of a networking issue in oi_151a4
involving multicast. I'm building a Video-on-Demand server receiving
tons of multicast traffic (50k pkts/s) in about 100 multicast streams on
a dual 16-core Opteron (Dell R715) system. Looking at mpstat, I can see
something like:
CPU xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
21 0 1102 33 119 4 23 95 0 853 1 9 0 90
22 0 1063 23 78 1 20 68 0 663 1 5 0 95
23 0 4039 87 5961 2 81 63 0 1999 3 74 0 24 (!!)
24 0 4142 127 6110 4 84 105 0 2902 4 30 0 67
25 0 1152 54 150 2 32 79 0 1435 2 7 0 90
A DTrace profile of the busy CPU looks as follows:
# dtrace -n 'profile-997hz /cpu == 23/ { @[stack()] = count(); }'
ip`ip_fanout_udp_multi_v4+0xb1
ip`ip_fanout_v4+0xbbb
ip`ip_input_multicast_v4+0xb1
ip`ire_recv_multicast_v4+0x2ee
ip`ill_input_short_v4+0x6ce
ip`ip_input+0x23b
dls`i_dls_link_rx+0x2e7
mac`mac_rx_deliver+0x5d
mac`mac_rx_soft_ring_drain+0xdf
mac`mac_soft_ring_worker+0x111
unix`thread_start+0x8
1587
ip`ip_fanout_udp_multi_v4+0x290
ip`ip_fanout_v4+0xbbb
ip`ip_input_multicast_v4+0xb1
ip`ire_recv_multicast_v4+0x2ee
ip`ill_input_short_v4+0x6ce
ip`ip_input+0x23b
dls`i_dls_link_rx+0x2e7
mac`mac_rx_deliver+0x5d
mac`mac_rx_soft_ring_drain+0xdf
mac`mac_soft_ring_worker+0x111
unix`thread_start+0x8
1820
ip`ilm_lookup+0x26
ip`ill_hasmembers_v6+0x45
ip`ill_hasmembers_v4+0x32
ip`ire_recv_multicast_v4+0x1ee
ip`ill_input_short_v4+0x6ce
ip`ip_input+0x23b
dls`i_dls_link_rx+0x2e7
mac`mac_rx_deliver+0x5d
mac`mac_rx_soft_ring_drain+0xdf
mac`mac_soft_ring_worker+0x111
unix`thread_start+0x8
2123
unix`mach_cpu_idle+0x6
unix`cpu_idle+0xaf
unix`cpu_idle_adaptive+0x19
unix`idle+0x114
unix`thread_start+0x8
7855
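For what it's worth, here's the one-liner I've been using to confirm that all Rx delivery really lands on a single CPU (this assumes the fbt probe on mac`mac_rx_deliver fires on oi_151a4 the way the stacks above suggest):
# dtrace -n 'fbt:mac:mac_rx_deliver:entry { @[cpu] = count(); } tick-10s { exit(0); }'
If the aggregation only ever reports one CPU, the packets are all being drained by a single soft ring worker rather than being fanned out.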
This is with just 60 multicast streams and 25k pkts/s - one CPU core is
being disproportionately stressed, and as a consequence any more
stressful system operation (e.g. a ZFS cache flush) causes it to drop
packets.
I tried upgrading the bnx driver to the latest from Broadcom,
switching from the two embedded dual-port bnx NICs to a separate plug-in
quad-port igb-based NIC, and disabling port aggregation, VLANs and
everything else I could think of, but I can't get the Rx side to spread
across multiple CPUs in parallel. I even tried playing with
/kernel/drv/igb.conf and setting:
tx_ring_size = 4096;
rx_ring_size = 4096;
rx_group_number = 4;
But nothing helped, which leads me to suspect I'm hitting some design
flaw in the way the kernel handles multicast traffic...
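For reference, this is how I've been checking whether the extra rx rings actually get used (assuming dladm on oi_151a4 supports the Crossbow -H ring-usage output and the cpus link property):
# dladm show-phys -H igb0
# dladm show-linkprop -p cpus igb0
The first shows which hardware rx rings/groups are in use; the second shows any CPU binding on the link.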
I'm hopeful that some networking guru might be lurking on this list,
because I don't know where to turn at this point...
Cheers,
--
Saso
-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be