"Kong, Fande" <[email protected]> writes: >> Given the way I/O is structured on big machines, we believe the multiple >> file route is a huge mistake. Also, all our measurements >> say that sending some data on the network is not noticeable given the disk >> access costs. >> > > I have slightly different things here. We tried the serial output, it looks > really slow for large-scale problems, and the first processor often runs > out of memory because of gathering all data from other processor cores. The > parallel IO runs smoothly and much faster than I excepted. We have done > experiments with ten thousands of cores for a problem with 1 billion of > unknowns. I did not see any concern so far.
I think there are two different issues here.

Writing a separate file per MPI rank (often also per time step, etc.) creates a filesystem *metadata* bottleneck: it's the open() and close() that are more painful than the write() when you have lots of files. (You'd also want to be careful about your naming convention, because merely running "ls" on a directory with many files is usually quite painful.)

MPI-IO collectives offer a solution -- each rank writes its part of a single shared file efficiently through the parallel file system. MPI-IO was introduced in MPI-2 (standardized in 1997), and PETSc has thus far avoided a hard dependency on it because some implementations were very slow to adopt it. In my opinion, any IO in PETSc that is intended to be highly scalable should use MPI-IO.
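
For what it's worth, the collective path is not much code. Here is a minimal sketch (the file name "solution.bin", the local size, and the fixed-size contiguous block per rank are just assumptions for illustration; error checking is omitted) in which every rank writes its block into one shared file with a single collective call:

  /* Each rank writes nlocal doubles at a rank-dependent offset of one
   * shared file, so there is exactly one file regardless of the number
   * of ranks. */
  #include <mpi.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
    MPI_File   fh;
    int        rank;
    const int  nlocal = 1000;   /* doubles per rank (assumed, uniform) */
    double    *buf;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++) buf[i] = (double)rank;

    offset = (MPI_Offset)rank * nlocal * sizeof(double);

    /* All ranks open the SAME file. */
    MPI_File_open(MPI_COMM_WORLD, "solution.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Collective write: every rank participates, which lets the MPI-IO
     * layer aggregate many small requests into large, aligned ones. */
    MPI_File_write_at_all(fh, offset, buf, nlocal, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
  }

For irregular distributions you would describe the layout with an MPI datatype and MPI_File_set_view rather than computing byte offsets by hand, but the structure is the same: one file, collective writes.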
