I wrote:

> The perf test says romio performs a bit better.  Also -- from overall
> time -- it's faster on IMB-IO (which I haven't looked at in detail, and
> ran with suboptimal striping).

I take that back. I can't reproduce a significant difference in total IMB-IO runtime, with both run in parallel on 16 ranks, using either the system default of a single 1MB stripe or eight stripes. I haven't teased out figures for the individual operations yet. That comparison must have been done elsewhere, but I've never seen figures for it.