Dear Spectrum Scale mailing list,

I'm part of IBM Lab Services. Several customers are currently asking me
to optimize similar workloads.

The task is to tune a Spectrum Scale system (comprising ESS and CES
protocol nodes) for the following workload:
        A single Linux NFS client mounts an NFS export, extracts a flat tar
archive with lots of ~5KB files.
        I'm measuring the speed at which those 5KB files are written (`time
tar xf archive.tar`).
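
For anyone who wants to reproduce the measurement, here is a minimal sketch of the test in Python. It builds a flat archive of ~5KB files and times the extraction. The file count, the file names and the use of a temporary directory as target are assumptions for illustration only; in a real run the target would be the GPFS path or the NFS mount under test.

```python
import io
import os
import tarfile
import tempfile
import time


def make_flat_archive(path, n_files=1000, size=5 * 1024):
    """Write a flat tar archive holding n_files files of `size` bytes each."""
    payload = os.urandom(size)
    with tarfile.open(path, "w") as tar:
        for i in range(n_files):
            info = tarfile.TarInfo(name=f"file_{i:06d}.dat")
            info.size = size
            tar.addfile(info, io.BytesIO(payload))


def timed_extract(archive, target_dir):
    """Extract the archive into target_dir and return (file_count, seconds)."""
    start = time.perf_counter()
    with tarfile.open(archive) as tar:
        members = tar.getmembers()
        tar.extractall(target_dir)
    elapsed = time.perf_counter() - start
    return len(members), elapsed


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the file system under
    # test; replace `target` with a GPFS path or an NFS mount point.
    with tempfile.TemporaryDirectory() as work:
        archive = os.path.join(work, "archive.tar")
        target = os.path.join(work, "out")
        os.mkdir(target)
        make_flat_archive(archive)
        count, seconds = timed_extract(archive, target)
        print(f"{count} files in {seconds:.2f} s = {count / seconds:.0f} files/s")
```

The archive should be created outside the timed section and ideally on a different file system, so that only the small-file writes are measured.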

I do understand that Spectrum Scale is not designed for such a workload
(single client, single thread, small files, single directory), and that
such a benchmark is not appropriate for benchmarking the system.
Yet I find myself explaining the performance of such scenarios (git
clone, ...) quite frequently, as customers insist that optimizing this
scenario matters to individual users because they see it as task duration.
I want to make sure that I have optimized the system as much as possible
for the given workload, and that I have not overlooked something obvious.


When writing to GPFS directly I'm able to write ~1800 files / second in a
test setup.
This is roughly the same on the protocol nodes (NSD client), as well as on
the ESS IO nodes (NSD server).
When writing to the NFS export on the protocol node itself (to avoid any
network effects) I'm only able to write ~230 files / second.
Writing to the NFS export from another node (now including network latency)
gives me ~220 files / second.
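
Because the workload is a single thread writing one file after another, the rates above translate directly into a per-file latency. The short calculation below is only a sketch of that arithmetic; the labels are mine, the rates are the measured values quoted above.

```python
def per_file_latency_ms(files_per_second):
    """Average time per file in milliseconds for a strictly serial workload."""
    return 1000.0 / files_per_second


# Measured rates from the test setup (files per second).
rates = {
    "GPFS direct": 1800,
    "NFS export, local": 230,
    "NFS export, remote": 220,
}

latency = {name: per_file_latency_ms(rate) for name, rate in rates.items()}
for name, ms in latency.items():
    print(f"{name:20s} {ms:5.2f} ms per file")

# Extra time per file attributable to the NFS layer and to the network.
nfs_overhead = latency["NFS export, local"] - latency["GPFS direct"]
network_overhead = latency["NFS export, remote"] - latency["NFS export, local"]
print(f"NFS layer adds ~{nfs_overhead:.2f} ms, network adds ~{network_overhead:.2f} ms")
```

In other words, the NFS layer adds roughly 3.8 ms per file, while the network adds only about 0.2 ms on top of that.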


Adding NFS-Ganesha to the software stack alone seems to cause a huge
performance degradation (roughly a factor of eight, ~1800 vs. ~230 files /
second). I wonder what can be done to minimize the impact.


- Ganesha doesn't seem to support the 'async' or 'no_wdelay' options...
is anything equivalent available?
- Is there an expected advantage of using the network-latency tuned
profile, as opposed to the ESS default throughput-performance profile?
- Are there other relevant kernel parameters?
- Is there an expected advantage of raising the number of threads (NSD
server (nsd*WorkerThreads) / NSD client (workerThreads) / Ganesha
(NB_WORKER)) for the given workload (single client, single thread, small
files)?
- Are there other relevant GPFS params?
- The impact of sync replication, disk latency, etc. is understood.
- I'm aware that 'the real thing' would be to work with larger files in a
multithreaded manner from multiple nodes - and that this scenario will
scale quite well.
  I just want to ensure that I'm not missing something obvious when
reiterating that message to customers.

Any help would be greatly appreciated - thanks very much in advance!
Alexander Saupp
IBM Germany


IBM Systems, Storage Platform, EMEA Storage Competence Center
Phone:  +49 7034-643-1512
Mobile: +49-172 7251072
Email:  [email protected]
IBM Deutschland GmbH, Am Weiher 24, 65451 Kelsterbach, Germany

IBM Deutschland GmbH / Chairman of the Supervisory Board: Martin Jetter
Management: Matthias Hartmann (Chairman), Norbert Janzen, Stefan Lutz,
Nicole Reimer, Dr. Klaus Seifert, Wolfgang Wendt
Registered office: Ehningen / Registration court: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940

