On Apr 15, 2009, at 5:06 AM, Jovana Knezevic wrote:
Yes, sure, what you say makes sense. On the other hand, it seems I will have to open the input file the "traditional" way n times, once per process, since all of my processes have to collect their data from it anyway (each parsing it from beginning to end), don't you think? I wanted to take advantage of MPI to read the data (in each process) from one file... Or have I misunderstood something?
The idea behind MPI I/O is that it can be done in parallel. It usually works best when you have an underlying parallel filesystem. In such cases (typically paired with very large input data), you can exploit the parallelism of the underlying filesystem to efficiently get just the necessary data to each MPI process.
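To make that concrete, here is a rough sketch of a collective read in which each process fetches only its own slice of the file. It assumes Python with the mpi4py bindings (my choice for brevity, not something from your setup), and byte_range is a hypothetical helper that splits the file into contiguous, nearly equal byte ranges. Real input usually needs the split aligned to record or line boundaries, which this ignores.

```python
import sys


def byte_range(file_size, rank, nprocs):
    """Return (offset, count) of the contiguous byte slice owned by `rank`.

    Hypothetical helper: an even split by bytes, ignoring record boundaries.
    """
    base, extra = divmod(file_size, nprocs)
    # The first `extra` ranks each take one additional byte.
    count = base + (1 if rank < extra else 0)
    offset = rank * base + min(rank, extra)
    return offset, count


def main(path):
    # Assumption: mpi4py is installed; imported here so the helper above
    # can be used and checked without an MPI installation.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    # Every process opens the same file collectively, read-only.
    fh = MPI.File.Open(comm, path, MPI.MODE_RDONLY)
    offset, count = byte_range(fh.Get_size(), rank, nprocs)

    # Collective read: each rank receives only its own slice.
    buf = bytearray(count)
    fh.Read_at_all(offset, buf)
    fh.Close()

    print(f"rank {rank} read {count} bytes at offset {offset}")


# Only runs when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

You would launch it with something like "mpirun -n 4 python sketch.py input.dat" (both file names are placeholders). Whether this beats plain reads depends on the filesystem underneath.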
If your input data isn't that large, or if you don't have a parallel filesystem (e.g., you're just using NFS), such schemes can actually be slower. It may even be better to have a single process, such as MPI_COMM_WORLD rank 0, read in the entire file and then MPI_BCAST / MPI_SCATTER / etc. the relevant data to each MPI process as necessary.
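A minimal sketch of the rank-0 approach, again assuming Python with mpi4py rather than anything from your code. It shows both variants: a broadcast when every process needs the whole file (which sounds like your case), and a scatter when each process needs only a share. split_lines is a hypothetical helper, and treating the input as text lines is an assumption.

```python
import sys


def split_lines(lines, nprocs):
    """Split `lines` into `nprocs` contiguous, nearly equal lists.

    Hypothetical helper used to prepare the scatter below.
    """
    base, extra = divmod(len(lines), nprocs)
    chunks, start = [], 0
    for rank in range(nprocs):
        # The first `extra` ranks each take one additional line.
        stop = start + base + (1 if rank < extra else 0)
        chunks.append(lines[start:stop])
        start = stop
    return chunks


def main(path):
    # Assumption: mpi4py is installed; imported here so the helper above
    # can be used and checked without an MPI installation.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    # Only rank 0 touches the filesystem.
    lines = None
    if rank == 0:
        with open(path) as f:
            lines = f.read().splitlines()

    # Variant A: every process needs the whole file.
    everything = comm.bcast(lines, root=0)

    # Variant B: each process needs only its own share.
    chunks = split_lines(everything, nprocs) if rank == 0 else None
    mine = comm.scatter(chunks, root=0)

    print(f"rank {rank}: {len(everything)} lines total, {len(mine)} in my share")


# Only runs when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

In a real program you would pick one variant. The lowercase bcast and scatter calls pickle Python objects, which is convenient for small inputs; for large arrays the buffer-based Bcast and Scatterv calls are the usual choice.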
It's up to you to decide which is best for your application; it really depends on the requirements of what you are doing, your local setup, etc.
-- Jeff Squyres Cisco Systems