Hi Robert,

I’m communicating with the card at the global zone level - no VNICs involved. 
At the moment I’m focused on raw NIC performance, since netperf is only showing 
5.5 Gb/s. The iSCSI throughput is lower, but it may be constrained more by 
total NIC performance right now than by anything inherent in the software 
stack. That said, if anybody has something simple to try, I’m certainly 
willing to do so.

I’ve never used DTrace before, so I’ll have to dig around to figure out what 
and how to measure. It’s probably strange to hear that somebody on this list 
doesn’t use DTrace, but I’m primarily a Linux/Mac guy. The combination of 
features in SmartOS was just too great to pass up, which is why I’ve been 
using it since OpenSolaris went belly up. It does mean a steeper ramp-up time 
for me to get things done, though, since I run into so few issues with it and 
it does almost everything that I want so well. Rough problem to have, huh? :)

I’m also getting a new Intel X520-SR2 sometime today, so I’ll be able to plug 
it into the SmartOS server to see if it makes a difference in performance. 
Results from that might add some clarity to the situation.

Regards,
John

> On Aug 14, 2017, at 2:10 PM, Robert Mustacchi <[email protected]> wrote:
> 
> On 8/13/17 20:24 , John Croix wrote:
>> Just wanted to check in to see if anybody had any recommendations for tuning 
>> parameters for 10GbE performance or iSCSI.
>> 
>> I have 2 Myricom 10GbE adapters direct connected to one another. On the 
>> SmartOS side, I have an iSCSI ZFS volume set up. On the Mac side, I’m using 
>> GlobalSAN to attach to the SmartOS volume. Jumbo frames (MTU=9000) are 
>> enabled on both sides. I’ve benchmarked my performance using netperf, and 
>> I’m seeing the following (executed from the Mac side, netperf server running 
>> on SmartOS):
>> 
>> # netperf -H 192.168.2.2 -t TCP_STREAM -C -c -l 60  -- -s 512K -S 512K
>> MIGRATED TCP STREAM TEST from (null) (0.0.0.0) port 0 AF_INET to (null) () port 0 AF_INET
>> Recv   Send    Send                          Utilization       Service Demand
>> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
>> Size   Size    Size     Time     Throughput  local    remote   local   remote
>> bytes  bytes   bytes    secs.    10^6bits/s  % O      % ?      us/KB   us/KB
>> 
>> 524744 524288 524288    60.00      5512.56   6.35     10.42    1.132   -0.310
>> 
>> I’ve also created a very large file of 0’s, and am using “dd” to copy them 
>> over. Here’s what I’m seeing when running that on the Mac:
>> 
>> # time dd if=junk.zero of=/Volumes/remote/junk.zero bs=1048576
>> 43158+1 records in
>> 43158+1 records out
>> 45254967296 bytes transferred in 92.342684 secs (490076369 bytes/sec)
>> 
>> real 1m32.382s
>> user 0m0.041s
>> sys  0m30.624s
>> 
>> I’ve followed a few tuning guides on the Mac, which actually brought the 
>> numbers up to the levels I’m showing here. I’m now looking for things on the 
>> SmartOS side that can help. BTW, I did try the suggestions here 
>> (https://community.emc.com/docs/DOC-39156), but changing those properties 
>> didn’t seem to make a difference to any of my numbers.
>> 
>> According to my Mac, I have good throughput to the Myricom card itself (a 
>> value of 1280 corresponds to 10Gb/sec), so I don’t think that there’s an 
>> issue between the Mac and my ethernet card. The card is in a Mercury Helios 
>> external cage, connected via Thunderbolt 2 (20Gbs top speed). The card is a 
>> Myricom 10G-PCIE2-8B2-2S NIC.
>> 
>> # sysctl net.myri10ge | grep dma
>> net.myri10ge.en13.dma_read_bw_MBs: 1436
>> net.myri10ge.en13.dma_write_bw_MBs: 1456
>> net.myri10ge.en13.dma_read_write_bw_MBs: 2610
>> net.myri10ge.en12.dma_read_bw_MBs: 1436
>> net.myri10ge.en12.dma_write_bw_MBs: 1456
>> net.myri10ge.en12.dma_read_write_bw_MBs: 2610
>> 
>> Finally, the SmartOS system itself is a SuperMicro X8DTE-F running 2 Xeon 
>> L5630’s (16 cores total) with 96GB of ECC memory and three 3TB hard 
>> drives, in a 3-way mirror, that the iSCSI volume is on. Synchronization is 
>> disabled:
>> 
>> [root@smartos ~]# zfs get sync zones/zpool/iscsi-1
>> NAME                 PROPERTY  VALUE     SOURCE
>> zones/zpool/iscsi-1  sync      disabled  local
>> 
>> Sorry for the long post, but I’m trying to supply any pertinent 
>> information up front without people having to ask for it. Any help in 
>> boosting these numbers would be appreciated.
> 
> In terms of investigating this, I have a couple of different questions.
> I guess, in general, I'd first focus on understanding the upper bound.
> When you're doing the streaming TCP tests, are those going to a VNIC, or
> to an interface that's been plumbed up in the GZ? Something else? In
> general, we haven't seen that much tuning is needed across Intel or other
> vendors' cards to drive 10 GbE perf. That said, VLANs can ultimately limit
> perf, among some other factors.
> 
> I'm not sure how helpful this is, but hopefully it gives you some place to
> start looking. If you're instead focused on iSCSI, I'd start by
> characterizing the latency of operations by op type with DTrace so we
> can get a better understanding of the overall system perf.
> 
> Robert
> 



-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com