On Wed, 25 Nov 2020, Dave Love via users wrote:

> The perf test says romio performs a bit better.  Also -- from overall
> time -- it's faster on IMB-IO (which I haven't looked at in detail, and
> ran with suboptimal striping).
>
> I take that back.  I can't reproduce a significant difference for total
> IMB-IO runtime, with both run in parallel on 16 ranks, using either the
> system default of a single 1MB stripe or using eight stripes.  I haven't
> teased out figures for different operations yet.  That must have been
> done elsewhere, but I've never seen figures.

But remember that IMB-IO doesn't cover everything.  For example, hdf5's
t_bigio parallel test appears to be a pathological case, where OMPIO is
two orders of magnitude slower than ROMIO on a Lustre filesystem
(wall-clock time, 6 ranks):

- OMPI's default MPI-IO implementation on Lustre (ROMIO): 21 seconds
- OMPI's alternative MPI-IO implementation on Lustre (OMPIO): 2554 seconds
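
Those two figures are the "real" times from the runs at the end of this
message.  As a quick check of the "two orders of magnitude" claim, here
is a minimal sketch of the arithmetic; the helper name to_seconds is just
something made up for illustration:

```shell
# Convert a bash `time` value such as "42m34.103s" to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

romio=$(to_seconds 0m21.141s)    # "real" time of the default (ROMIO) run
ompio=$(to_seconds 42m34.103s)   # "real" time of the OMPIO run

# Slowdown factor, rounded to the nearest integer.
ratio=$(awk -v a="$ompio" -v b="$romio" 'BEGIN { printf "%.0f\n", a / b }')
echo "ROMIO ${romio}s, OMPIO ${ompio}s, slowdown ${ratio}x"
```

That works out to a factor of roughly 120 in wall-clock time.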

End users seem to have the choice of:

- use Open MPI 4.x with ROMIO and have some things broken
- use Open MPI 4.x with OMPIO and have some things slow
- use Open MPI 3.x and have everything work

My concern is that Open MPI 3.x is near, or at, end of life.

Mark


t_bigio runs on CentOS 7 (ppc64le), gcc 4.8.5, Open MPI 4.0.5, HDF5 1.10.7,
Lustre 2.12.5, first with the default MPI-IO component (ROMIO) and then
with OMPIO:

[login testpar]$ time mpirun -np 6 ./t_bigio

Testing Dataset1 write by ROW

Testing Dataset2 write by COL

Testing Dataset3 write select ALL proc 0, NONE others

Testing Dataset4 write point selection

Read Testing Dataset1 by COL

Read Testing Dataset2 by ROW

Read Testing Dataset3 read select ALL proc 0, NONE others

Read Testing Dataset4 with Point selection
***Express test mode on.  Several tests are skipped

real    0m21.141s
user    2m0.318s
sys     0m3.289s


[login testpar]$ export OMPI_MCA_io=ompio
[login testpar]$ time mpirun -np 6 ./t_bigio

Testing Dataset1 write by ROW

Testing Dataset2 write by COL

Testing Dataset3 write select ALL proc 0, NONE others

Testing Dataset4 write point selection

Read Testing Dataset1 by COL

Read Testing Dataset2 by ROW

Read Testing Dataset3 read select ALL proc 0, NONE others

Read Testing Dataset4 with Point selection
***Express test mode on.  Several tests are skipped

real    42m34.103s
user    213m22.925s
sys     8m6.742s
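
Note that the export above stays in effect for the rest of the shell
session.  To compare the two components without changing the
environment, the variable can be set per command instead.  A rough
sketch follows; run_with_io and DRY_RUN are hypothetical names, and
"romio321" is my assumption for the ROMIO component name in 4.0.x
(check with: ompi_info | grep "MCA io").

```shell
# Run a command under a chosen MPI-IO component without exporting
# anything.  With DRY_RUN=1 (the default here) the command line is only
# printed, so the sketch can be tried on a machine without Open MPI.
run_with_io() {
    io="$1"; shift
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "OMPI_MCA_io=$io mpirun -np 6 $*"
    else
        # The assignment applies to this one mpirun invocation only.
        OMPI_MCA_io="$io" mpirun -np 6 "$@"
    fi
}

run_with_io romio321 ./t_bigio   # component name is an assumption
run_with_io ompio ./t_bigio
```

Setting DRY_RUN=0 runs the commands for real; "mpirun --mca io ompio"
is the equivalent command-line form.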
