Hello,

We use AFM prefetch to migrate data between two clusters (using NFS). This
works fine with large files, say 1+GB each. But we have millions of smaller
files, about 1MB each. For these I see just ~150MB/s, compared to the
1000+MB/s we get for larger files.

I assume that we need more parallelism. Does prefetch pull just one file at
a time? Each file needs some, possibly many, metadata operations plus a single
or just a few reads and writes. Doing this sequentially adds up all the
latencies of NFS+GPFS. That is my explanation. With larger files, GPFS prefetch
on home will help.
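
To put rough numbers on that explanation, here is a back-of-the-envelope
sketch. It assumes files really are pulled strictly one after another (which
is exactly what I am asking about) and that each file costs a fixed overhead
plus its size divided by the streaming rate. The function names and the model
itself are only illustrative, nothing here is taken from AFM. With 1MB files,
~150MB/s observed and ~1000MB/s streaming, the implied fixed cost would be
roughly 5.7 ms per file.

```python
# Simple sequential-transfer model (an assumption, not measured AFM behavior):
#   time_per_file = fixed overhead (metadata ops, NFS round trips)
#                   + file_size / streaming_rate

def sequential_throughput_mb_s(file_size_mb, per_file_overhead_s, stream_rate_mb_s):
    """Throughput when files are transferred strictly one at a time."""
    time_per_file = per_file_overhead_s + file_size_mb / stream_rate_mb_s
    return file_size_mb / time_per_file

def implied_overhead_s(file_size_mb, observed_mb_s, stream_rate_mb_s):
    """Fixed per-file cost that would explain an observed throughput."""
    return file_size_mb / observed_mb_s - file_size_mb / stream_rate_mb_s

# Numbers from this mail: ~1MB files at ~150MB/s, large files at ~1000MB/s.
overhead = implied_overhead_s(1.0, 150.0, 1000.0)
print(f"implied per-file overhead: {overhead * 1000:.1f} ms")

# The same overhead barely matters for a 1GB (1024MB) file.
print(f"1GB file: {sequential_throughput_mb_s(1024.0, overhead, 1000.0):.0f} MB/s")
```

If this model is anywhere near right, the large files hide the per-file cost
almost completely, while for 1MB files it dominates.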

Please can anybody comment: Is this right, does AFM prefetch handle one file at 
a time in a sequential manner? And is there any way to change this behavior? Or 
am I wrong and I need to look elsewhere to get better performance for prefetch 
of many smaller files?

We will migrate several filesets in parallel, but with individual filesets
up to 350TB in size, 150MB/s still isn’t fun. Also, just about 150 files/s
looks poor.
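
For scale, a quick estimate of what these rates mean for one 350TB fileset.
This is only arithmetic on the numbers above, assuming decimal units
(1TB = 10^6 MB) and a constant rate for the whole transfer; the helper name
is just for illustration.

```python
SECONDS_PER_DAY = 86400

def migration_days(fileset_tb, rate_mb_s):
    """Days needed to move a fileset at a constant rate (decimal units)."""
    total_mb = fileset_tb * 1_000_000  # 1 TB = 10^6 MB
    return total_mb / rate_mb_s / SECONDS_PER_DAY

for rate in (150, 1000):
    print(f"{rate} MB/s: {migration_days(350, rate):.1f} days")
# prints:
# 150 MB/s: 27.0 days
# 1000 MB/s: 4.1 days
```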

The setup is quite new, hence there may be other places to look at.
It’s all RHEL7 and Spectrum Scale 4.2.2-3 on the AFM cache.

Thank you,

Heiner
--
Paul Scherrer Institut
Science IT
Heiner Billich
WHGA 106
CH 5232  Villigen PSI
056 310 36 02
https://www.psi.ch
 
    

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
