We have Lustre 1.6.7 configured with 64 OSTs. I am testing its performance with IOR, a file system benchmark.
When I run IOR under MPI so that all processes write to a single shared file, performance does not scale. I tested with 1, 2, and 4 processes, and throughput stays constant at about 230 MiB/s. When each process writes to its own file, throughput improves greatly, reaching about 475 MiB/s. Note that all processes are spawned on a single node.

Here is the output.

Writing to a shared file:

> Command line used: ./IOR -a POSIX -b 2g -e -t 32m -w -o /fastfs/gabriel/ss_64/km_ior.out
> Machine: Linux deimos102
>
> Summary:
> api = POSIX
> test filename = /fastfs/gabriel/ss_64/km_ior.out
> access = single-shared-file
> ordering in a file = sequential offsets
> ordering inter file= no tasks offsets
> clients = 4 (4 per node)
> repetitions = 1
> xfersize = 32 MiB
> blocksize = 2 GiB
> aggregate filesize = 8 GiB
>
> Operation  Max (MiB)  Min (MiB)  Mean (MiB)  Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)  Std Dev  Mean (s)
> ---------  ---------  ---------  ----------  -------  ---------  ---------  ----------  -------  --------
> write         233.61     233.61      233.61     0.00       7.30       7.30        7.30     0.00  35.06771  EXCEL
>
> Max Write: 233.61 MiB/sec (244.95 MB/sec)

Writing to separate files:

> Command line used: ./IOR -a POSIX -b 2g -e -t 32m -w -o /fastfs/gabriel/ss_64/km_ior.out -F
> Machine: Linux deimos102
>
> Summary:
> api = POSIX
> test filename = /fastfs/gabriel/ss_64/km_ior.out
> access = file-per-process
> ordering in a file = sequential offsets
> ordering inter file= no tasks offsets
> clients = 4 (4 per node)
> repetitions = 1
> xfersize = 32 MiB
> blocksize = 2 GiB
> aggregate filesize = 8 GiB
>
> Operation  Max (MiB)  Min (MiB)  Mean (MiB)  Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)  Std Dev  Mean (s)
> ---------  ---------  ---------  ----------  -------  ---------  ---------  ----------  -------  --------
> write         475.95     475.95      475.95     0.00      14.87      14.87       14.87     0.00  17.21191  EXCEL
>
> Max Write: 475.95 MiB/sec (499.07 MB/sec)

I am trying to understand where the bottleneck is when processes write to a shared file.
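As a sanity check on the figures above, here is a minimal Python sketch that recomputes the reported rates from the aggregate file size, the transfer size, and the mean time. It assumes IOR derives its write rate as aggregate bytes divided by mean elapsed seconds; the helper name `summarize` is my own and is not part of IOR.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def summarize(aggregate_bytes, transfer_bytes, mean_seconds):
    """Return (MiB/s, MB/s, transfers/s) for one write phase.

    Assumption: rate = aggregate size / mean elapsed time.
    """
    mib_per_s = aggregate_bytes / MIB / mean_seconds
    mb_per_s = aggregate_bytes / 1e6 / mean_seconds
    ops_per_s = (aggregate_bytes / transfer_bytes) / mean_seconds
    return mib_per_s, mb_per_s, ops_per_s


# Values taken from the IOR summaries: 4 clients x 2 GiB blocks, 32 MiB transfers.
aggregate = 4 * 2 * GIB
transfer = 32 * MIB

shared = summarize(aggregate, transfer, 35.06771)    # single-shared-file run
per_file = summarize(aggregate, transfer, 17.21191)  # file-per-process run

for label, (mib, mb, ops) in (("shared file", shared), ("file per process", per_file)):
    print(f"{label:>16}: {mib:.2f} MiB/s, {mb:.2f} MB/s, {ops:.2f} transfers/s")

# How much faster the file-per-process run is than the shared-file run.
speedup = per_file[0] / shared[0]
print(f"file-per-process / shared-file = {speedup:.2f}x")
```

Under that assumption the recomputed rates agree with the reported ones to within rounding, and the file-per-process run comes out roughly twice as fast as the shared-file run for the same 8 GiB of data.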
Your help is appreciated.

--
Kshitij Mehta
PhD candidate
Parallel Software Technologies Lab (pstl.cs.uh.edu)
Dept. of Computer Science
University of Houston
Houston, Texas, USA
