Thanks Robert, that explains it - frustrating to have spent the last few days trying to isolate the issue, but at least it has a good explanation that is fixable.
My hand is up to volunteer testing the update :)

Denis

On 10 November 2017 at 14:49, Robert Mustacchi <[email protected]> wrote:

> Hi Denis,
>
> The fundamental issue right now is that in certain VLAN configurations
> the system is not taking advantage of hardware polling as it should be.
> This means that the system falls back to high watermarks for packet
> rates that are a bit low and end up netting the performance around
> roughly what you're seeing for a single thread.
>
> As you might imagine, we're acutely aware of this problem and are in the
> process of implementing RFD 97
> (https://github.com/joyent/rfd/tree/master/rfd/0097) to address it,
> which is already seeing promising results from our current experiments.
>
> Robert
>
> On 11/9/17 19:35 , Denis Cheong wrote:
> > I am adding 10GbE to my existing SmartOS server but am experiencing
> > unusual and severe performance issues that I’m at a loss to explain.
> >
> > Over the default untagged 10GbE link, I can get >9Gbit/sec consistently
> > under all configurations.
> > As soon as I test over a VLAN, transfer rates plummet to a very
> > inconsistent 3-4Gbit/sec RX, and <1Gbit/sec TX.
> >
> > Does anybody have any ideas what might be going on here?
> >
> > Performance over default VLAN ID (SmartOS is running iperf3 -s; nb with
> > SmartOS as client and other host as server, performance is identical):
> >
> > Connecting to host 192.168.245.14, port 5201
> > local 192.168.245.21 port 56809 connected to 192.168.245.14 port 5201
> > Interval         Transfer      Bandwidth
> > 0.00-1.00   sec  1.12 GBytes   9.58 Gbits/sec
> > 1.00-2.00   sec  1.12 GBytes   9.62 Gbits/sec
> > 2.00-3.00   sec  1.12 GBytes   9.62 Gbits/sec
> > 3.00-4.00   sec  1.12 GBytes   9.60 Gbits/sec
> > 4.00-5.00   sec  1.12 GBytes   9.59 Gbits/sec
> > 5.00-6.00   sec  1.12 GBytes   9.61 Gbits/sec
> > 6.00-7.00   sec  1.12 GBytes   9.59 Gbits/sec
> > 7.00-8.00   sec  1.10 GBytes   9.47 Gbits/sec
> > 8.00-9.00   sec  1.12 GBytes   9.60 Gbits/sec
> > 9.00-10.00  sec  1.12 GBytes   9.63 Gbits/sec
> > - - - - - - - - - - - - - - - - - - - - - - - -
> > Interval         Transfer      Bandwidth
> > 0.00-10.00  sec  11.2 GBytes   9.59 Gbits/sec   sender
> > 0.00-10.00  sec  11.2 GBytes   9.59 Gbits/sec   receiver
> >
> > Performance over the same link, but over VLAN 300 (SmartOS is running
> > iperf3 -s; note wild variation from 2 - 5Gbit/sec):
> >
> > Connecting to host 192.168.245.134, port 5201
> > local 192.168.245.133 port 56786 connected to 192.168.245.134 port 5201
> > Interval         Transfer      Bandwidth
> > 0.00-1.00   sec  523 MBytes    4.39 Gbits/sec
> > 1.00-2.00   sec  481 MBytes    4.04 Gbits/sec
> > 2.00-3.00   sec  608 MBytes    5.10 Gbits/sec
> > 3.00-4.00   sec  560 MBytes    4.70 Gbits/sec
> > 4.00-5.00   sec  242 MBytes    2.03 Gbits/sec
> > 5.00-6.00   sec  592 MBytes    4.96 Gbits/sec
> > 6.00-7.00   sec  553 MBytes    4.64 Gbits/sec
> > 7.00-8.00   sec  253 MBytes    2.12 Gbits/sec
> > 8.00-9.00   sec  569 MBytes    4.77 Gbits/sec
> > 9.00-10.00  sec  507 MBytes    4.25 Gbits/sec
> > - - - - - - - - - - - - - - - - - - - - - - - -
> > Interval         Transfer      Bandwidth
> > 0.00-10.00  sec  4.77 GBytes   4.10 Gbits/sec   sender
> > 0.00-10.00  sec  4.77 GBytes   4.10 Gbits/sec   receiver
> >
> > Performance over the same link, VLAN 300, SmartOS as client, server on
> > other host (note significantly worse performance on transmit):
> >
> > Connecting to host 192.168.245.133, port 5201
> > local 192.168.245.134 port 35851 connected to 192.168.245.133 port 5201
> > Interval         Transfer      Bandwidth
> > 0.00-1.00   sec  104 MBytes    875 Mbits/sec
> > 1.00-2.00   sec  46.3 MBytes   389 Mbits/sec
> > 2.00-3.00   sec  130 MBytes    1.09 Gbits/sec
> > 3.00-4.00   sec  76.0 MBytes   638 Mbits/sec
> > 4.00-5.00   sec  97.0 MBytes   814 Mbits/sec
> > 5.00-6.00   sec  17.4 MBytes   146 Mbits/sec
> > 6.00-7.00   sec  67.6 MBytes   567 Mbits/sec
> > 7.00-8.00   sec  92.4 MBytes   775 Mbits/sec
> > 8.00-9.00   sec  79.7 MBytes   669 Mbits/sec
> > 9.00-10.00  sec  73.3 MBytes   615 Mbits/sec
> > - - - - - - - - - - - - - - - - - - - - - - - -
> > Interval         Transfer      Bandwidth
> > 0.00-10.00  sec  785 MBytes    658 Mbits/sec    sender
> > 0.00-10.00  sec  784 MBytes    658 Mbits/sec    receiver
> >
> > Other observations:
> > Multiple parallel TCP transfers (-P 6) make no difference, aggregate
> > transfer rate is identical
> > Initial real-world testing copying 6GB file to SmartOS confirmed
> > transfer rate at about 1Gbit/s (also observed through switch traffic
> > monitoring at no more than 1Gbit/s)
> > UDP tests with iperf3 seem highly problematic (VLAN or not) - but not
> > sure if this is iperf3 issue or not (I suspect iperf3 given that TCP
> > bandwidth tests seem fine):
> >   Packets received out of order on SmartOS
> >   Extreme packet loss at anything >5M (yes 5mbit) bandwidth, can go up to
> >   99%
> > iperf3 in a zone vs gz makes no difference
> > Eliminating the switch (Mikrotik CRS326-24G-2S+) and running direct
> > fiber between the two machines makes no difference.
> > MTU is 9000 (switch max MTU is 9500). Other endpoint is 9000.
> >
> > Relevant system configuration:
> > Intel S2600CO motherboard, 2 x Xeon E5-2670 CPUs
> > Intel 10GbE dual-SFP+ NIC (I have another HP 10GbE card that I was
> > originally using that exhibited the same initial performance issues - but
> > its max MTU in SmartOS is 1500 - hence switched to the Intel card)
> > Original PI 20170928T144204Z. Upgraded to PI 20171109T032417 during
> > testing with no improvement

-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
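
For anyone who wants to put a number on the "wild variation" in runs like the ones quoted above, here is a minimal sketch that pulls the per-interval rates out of iperf3's plain-text output and summarises them. The function names (interval_rates, rate_from_transfer, summarize) and the SAMPLE string are made up for illustration and are not part of iperf3. The sketch assumes the Transfer column uses binary prefixes (1 MByte = 1024 * 1024 bytes) and the Bandwidth column uses decimal ones, which is consistent with the figures in the thread (523 MBytes in one second works out to about 4.39 Gbits/sec).

```python
import re
import statistics

# Matches lines such as "0.00-1.00 sec 523 MBytes 4.39 Gbits/sec",
# with or without leading "> >" quote markers.
INTERVAL_RE = re.compile(
    r"(\d+\.\d+)-(\d+\.\d+)\s+sec\s+"
    r"([\d.]+)\s+([KMG]?)Bytes\s+"
    r"([\d.]+)\s+([KMG]?)bits/sec"
)

# Assumption: binary prefixes for bytes, decimal prefixes for bits.
BYTE_SCALE = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}
BIT_SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}


def interval_rates(text, max_interval=1.5):
    """Return reported bits/sec for each short interval line.

    Lines covering more than max_interval seconds (the sender and
    receiver summary lines) are skipped.
    """
    rates = []
    for m in INTERVAL_RE.finditer(text):
        duration = float(m.group(2)) - float(m.group(1))
        if duration <= 0 or duration > max_interval:
            continue
        rates.append(float(m.group(5)) * BIT_SCALE[m.group(6)])
    return rates


def rate_from_transfer(amount, prefix, seconds):
    """Recompute bits/sec from the Transfer column as a cross-check."""
    return amount * BYTE_SCALE[prefix] * 8 / seconds


def summarize(rates):
    """Mean, min, max and coefficient of variation of the rates."""
    mean = statistics.mean(rates)
    return {
        "mean": mean,
        "min": min(rates),
        "max": max(rates),
        "cv": statistics.pstdev(rates) / mean,
    }


# The VLAN 300 receive run from the original message.
SAMPLE = """
> > 0.00-1.00 sec 523 MBytes 4.39 Gbits/sec
> > 1.00-2.00 sec 481 MBytes 4.04 Gbits/sec
> > 2.00-3.00 sec 608 MBytes 5.10 Gbits/sec
> > 3.00-4.00 sec 560 MBytes 4.70 Gbits/sec
> > 4.00-5.00 sec 242 MBytes 2.03 Gbits/sec
> > 5.00-6.00 sec 592 MBytes 4.96 Gbits/sec
> > 6.00-7.00 sec 553 MBytes 4.64 Gbits/sec
> > 7.00-8.00 sec 253 MBytes 2.12 Gbits/sec
> > 8.00-9.00 sec 569 MBytes 4.77 Gbits/sec
> > 9.00-10.00 sec 507 MBytes 4.25 Gbits/sec
> > 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec sender
"""

if __name__ == "__main__":
    stats = summarize(interval_rates(SAMPLE))
    print({k: round(v / 1e9, 2) if k != "cv" else round(v, 2)
           for k, v in stats.items()})
```

Comparing the "cv" value between the untagged run and the VLAN runs gives a single figure for how much less steady the VLAN path is, which may be handy when retesting after an update.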
