Hi, guys. We're doing research on enabling HPC applications on cloud
filesystems. For comparison purposes, we're using a PVFS2 filesystem as an
example of a mature HPC filesystem. However, we're getting unusually bad
performance from our cluster on the MPI applications we're testing, which
write to PVFS2 through MPICH2, and we'd like to know if there are any
configuration parameters we might be forgetting to tweak.

We've tried two applications: LANL's mpi_io_test
(http://institutes.lanl.gov/data/software/src/mpi-io/README_20.pdf) and the
IOR benchmark. Both run through the MPICH2 library, compiled with PVFS2
support. In other words, we're connecting to PVFS2 via MPI-IO rather than
through the kernel module. We're using version 2.8.1 of PVFS2, configured
with a stripe size of 1 MB. The mpi_io_test program writes 1 MB objects.
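
For concreteness, here's a rough sketch (not our actual benchmark code; the
file path and hint value are just placeholders) of the access pattern we
believe we're generating: one contiguous 1 MB write per rank, with the
standard "striping_unit" MPI-IO hint set to match the PVFS2 stripe size, and
the "pvfs2:" filename prefix so ROMIO talks to PVFS2 directly rather than
through the kernel module:

/* Sketch only: one 1 MB write per rank through MPI-IO, with the stripe-size
 * hint set to 1 MB.  Error checking omitted for brevity. */
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

#define BLOCK_SIZE (1024 * 1024)          /* 1 MB per rank, = stripe size */

int main(int argc, char **argv)
{
    int rank;
    char *buf;
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(BLOCK_SIZE);
    memset(buf, rank & 0xff, BLOCK_SIZE);

    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_unit", "1048576");   /* match PVFS2 stripe */

    /* "pvfs2:" prefix selects ROMIO's PVFS2 driver; path is a placeholder */
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs2/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* contiguous, non-overlapping 1 MB block per rank */
    MPI_File_write_at(fh, (MPI_Offset)rank * BLOCK_SIZE, buf, BLOCK_SIZE,
                      MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    free(buf);
    MPI_Finalize();
    return 0;
}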

The cluster has a 1 GigE (Gigabit Ethernet) network. We vary the number of
servers/clients from 1 to 40. Every server handles both metadata and data,
and clients and servers share the same nodes (this makes the setup more
comparable to the typical cloud computing situation).

The performance we're seeing is on the order of 15 to 20 MB/s per node. This
is much lower than what we get from network tools such as crisscross or
iperf, which report 45-50 MB/s between most pairs of nodes -- still low for
GigE, but well above 20 MB/s.
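
For comparison, something like the following would let us measure what plain
MPI messages of the same 1 MB size achieve between a pair of nodes,
independent of the filesystem (a minimal sketch; the message size and
iteration count are arbitrary choices):

/* Sketch: approximate one-way bandwidth of 1 MB MPI messages between ranks
 * 0 and 1.  Run with one rank per node to exercise the GigE link. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

#define MSG_SIZE (1024 * 1024)   /* 1 MB, same size as our I/O requests */
#define ITERS    100

int main(int argc, char **argv)
{
    int rank, i;
    char *buf = malloc(MSG_SIZE);
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, 0, MSG_SIZE);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0)
            MPI_Send(buf, MSG_SIZE, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, MSG_SIZE, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    MPI_Barrier(MPI_COMM_WORLD);      /* make sure all messages have landed */
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("approx. one-way bandwidth: %.1f MB/s\n",
               (double)ITERS * MSG_SIZE / (t1 - t0) / (1024.0 * 1024.0));

    free(buf);
    MPI_Finalize();
    return 0;
}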

It doesn't seem to be a disk speed issue: we've tested with TroveMethod set
to null-aio and metadata syncing turned off, and we still get the same
results. So we suspect a networking issue with the way we're sending MPI
messages -- perhaps they're too small?
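
In case it's relevant, this is the sort of thing we can do to see which hints
ROMIO is actually applying to an open file (cb_buffer_size, romio_cb_write,
striping_unit, and so on -- we're assuming those are the knobs that matter
here, which is exactly what we're unsure about):

/* Sketch: dump the hints in effect on an already-open MPI file handle.
 * Assumes "fh" was opened with MPI_File_open as in the earlier snippet. */
#include <stdio.h>
#include <mpi.h>

static void dump_hints(MPI_File fh)
{
    MPI_Info info;
    int nkeys, i, flag;
    char key[MPI_MAX_INFO_KEY], value[256];

    MPI_File_get_info(fh, &info);        /* copy of the hints in effect */
    MPI_Info_get_nkeys(info, &nkeys);
    for (i = 0; i < nkeys; i++) {
        MPI_Info_get_nthkey(info, i, key);
        MPI_Info_get(info, key, sizeof(value) - 1, value, &flag);
        if (flag)
            printf("hint %-24s = %s\n", key, value);
    }
    MPI_Info_free(&info);
}

If the buffer or request sizes reported there turn out to be much smaller
than 1 MB, that would at least be consistent with the "messages too small"
theory.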

Any advice on parameters we should look into tweaking on the MPI side to get
performance closer to that of the raw network? Any help is appreciated.
Thanks.

~Milo and Esteban
