Hi.

I’ve got a GPFS-to-GPFS AFM cache/home (IW) relationship set up over a really 
long distance: about 180 ms of latency between the two clusters and around 
13,000 km of optical path. Fortunately for me, I’ve actually got near the 
theoretical maximum IO over the NICs between the clusters, and I’m iperf’ing at 
around 8.9 to 9.2 Gbit/sec over a 10GbE circuit, with MTU 9000 all the way 
through.

Anyway, I’m finding my AFM traffic to be dragging its feet and I don’t really 
understand why that might be. As I said above, I’ve verified the link and 
transport capability with iperf and CERN’s FDT, both to near 10 Gbit/sec.
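
For reference, this is roughly the sort of thing I was running to verify the 
path (the home-cluster address here is a placeholder, and the window size and 
stream count are just what I happened to test with):

# single long-lived stream with a big window
iperf -c 192.0.2.10 -t 60 -w 256M -i 10
# parallel streams, to see what multiple flows buy us at this RTT
iperf -c 192.0.2.10 -t 60 -P 4 -i 10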

I also verified the disk IO on the clusters on both sides; in IOzone and IOR 
tests they both seem easily capable of multiple GB/sec of throughput.
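
Roughly what those disk checks looked like (the paths, sizes and process 
counts are just examples that fit our test nodes, not a recommendation):

# IOR: file-per-process, sequential write then read
mpirun -np 16 ior -w -r -F -t 1m -b 16g -o /gpfs/fs1/iortest
# IOzone: throughput mode, 8 threads, 1 MiB records, 16 GiB per file
iozone -i 0 -i 1 -r 1m -s 16g -t 8 -F /gpfs/fs1/ioz.{1..8}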

So – my questions:


1. Are there very specific tunings AFM needs for high-latency/long-distance IO?

2. Are there very specific NIC/TCP-stack tunings (beyond the kind of thing we 
already have in place) that benefit AFM over really long distances and high 
latency?

3. On the “cache” side we are seeing really lazy/sticky “ls -als” in the home 
mount; it sometimes takes 20 to 30 seconds before the command line reports 
back with a long listing of files. Any ideas why it would take that long to 
get a response from “home”? (Rough timing sketch below.)
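
To put a rough number on question 3: at ~180 ms RTT, anything that does one 
round trip per directory entry (lookup revalidation from the cache, for 
instance) only needs 100 to 150 entries to add up to 20-30 seconds. The crude 
check I’ve been doing (the path is just an example on our cache fileset):

# time the long listing and count the entries in the same directory
time ls -als /gpfs/cache/somedir > /dev/null
ls -A /gpfs/cache/somedir | wc -l
# 0.18 s per entry multiplied by the entry count gives a ballpark wall time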

We’ve got our TCP stack set up fairly aggressively on all hosts that 
participate in these two clusters:

ethtool -C enp2s0f0 adaptive-rx off
ifconfig enp2s0f0 txqueuelen 10000
sysctl -w net.core.rmem_max=536870912
sysctl -w net.core.wmem_max=536870912
sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456"
sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456"
sysctl -w net.core.netdev_max_backlog=250000
sysctl -w net.ipv4.tcp_congestion_control=htcp
sysctl -w net.ipv4.tcp_mtu_probing=1
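
(For what it’s worth, this is how I’ve been sanity-checking that those 
settings actually stick on each host, and watching window/RTT on live 
connections to the other cluster; the interface name is ours, the address is 
a placeholder.)

# confirm the sysctls and NIC settings took effect
sysctl net.ipv4.tcp_congestion_control net.core.rmem_max net.ipv4.tcp_rmem
ethtool -c enp2s0f0 | grep -i adaptive
ip link show enp2s0f0          # txqueuelen shows up as "qlen 10000"
# cwnd/rtt on the sockets towards the home cluster
ss -tin dst 192.0.2.10 | grep -E 'cwnd|rtt'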

I modified a couple of small things on the AFM “cache” side to see if it would 
make a difference, such as:

mmchconfig afmNumWriteThreads=4
mmchconfig afmNumReadThreads=4

But no difference so far.
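
In case it matters, this is how I’ve been watching the cache-side gateway 
while testing (“fs1” and “cachefset” are placeholders for our filesystem and 
fileset names, and the mmfsadm dump is of course the unsupported debug 
interface):

# AFM fileset state, gateway node and queue length, from the cache cluster
mmafmctl fs1 getstate -j cachefset
# internal AFM queue/counter dump on the gateway node
mmfsadm dump afm | head -40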

Thoughts would be appreciated. I’ve done this before over much shorter 
distances (30 km) and flattened a 10GbE wire without really tuning anything. 
Are the large amounts of in-flight data and the long time to acknowledgement 
going to hurt here? I really thought AFM might be well designed for exactly 
this kind of work at long distance *and* high throughput, so I must be 
missing something!

-jc


