Thanks for the advice, Edgar.  This appears to help but does not
eliminate the problem.  Here is what I observe (across roughly 10 trials)
when using '-mca io romio314':

- no failures using 40 processes across 2 nodes (each node has 20 cores)
- no failures when using the non-collective 'MPI_File_write_at' (see the
sketch after this list)
- same type of failure when using 2 processes across 2 nodes (i.e.,
writing only 2 bytes) *and* using the collective 'MPI_File_write_at_all'
- writing to the 'sync'/'hard,intr' file system (see original email) no
longer reports an error.  The trials give the same results as for an
async mount.
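
For reference, the non-collective variant from those trials is just the
test program from my original email (quoted below) with the collective
call swapped for the independent one.  A minimal sketch, error checking
omitted:

#include <mpi.h>

int main(int argc, char* argv[])
{
   MPI_Init(&argc, &argv);
   int procID;
   MPI_Comm_rank(MPI_COMM_WORLD, &procID);

   MPI_File fh;
   MPI_Status status;
   char c = (char)('0' + procID);  // one byte per rank, as in the test

   MPI_File_open(MPI_COMM_WORLD, "dataio",
                 MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
   // independent (non-collective) write at this rank's byte offset
   MPI_File_write_at(fh, (MPI_Offset)procID, &c, 1, MPI_CHAR, &status);
   MPI_File_close(&fh);

   MPI_Finalize();
   return 0;
}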

So it mostly works except in an unusual case.  I'd be happy to help test
a nightly snapshot---let me know.

Stephen

On 10/12/2017 07:36 PM, Edgar Gabriel wrote:
> Try switching to the romio314 component for now.  There is an issue
> with NFS and OMPIO that I am aware of and am working on, which might
> trigger this behavior (although collective I/O should actually work
> even in that case).
>
> Try setting something like
>
> mpirun --mca io romio314 ...
>
> Thanks
>
> Edgar
>
>
> On 10/12/2017 8:26 PM, Stephen Guzik wrote:
>> Hi,
>>
>> I'm having trouble with parallel I/O to a file system mounted with NFS
>> over an InfiniBand network.  In my test code, I simply write 1 byte per
>> process to the same file.  When using two nodes, some bytes are not
>> written (the unwritten bytes read back as all zeros).  Usually at least
>> some data from each node is written; it appears to be all of the data
>> from one node and only part of it from the other.
>>
>> This used to work fine but broke when the cluster was upgraded from
>> Debian 8 to Debian 9.  I suspect an issue with NFS and not with
>> OpenMPI.  However, if anyone can suggest a work-around or ways to get
>> more information, I would appreciate it.  In the sole case where the
>> file system is exported with 'sync' and mounted with 'hard,intr', I get
>> the error:
>> [node1:14823] mca_sharedfp_individual_file_open: Error during datafile
>> file open
>> MPI_ERR_FILE: invalid file
>> [node2:14593] (same)
>>
>> ----------
>>
>> Some additional info:
>> - tested versions 1.8.8, 2.1.1, and 3.0.0, both self-compiled/packaged
>> and vendor-supplied builds.  All show the same behavior.
>> - all write methods (individual or collective) fail similarly.
>> - exporting the file system to two workstations over Ethernet and
>> running the job across those two workstations seems to work fine.
>> - on a single node, everything works as expected in all cases.  In the
>> case described above where I get an error, the error is only observed
>> with processes on two nodes.
>> - code follows.
>>
>> Thanks,
>> Stephen Guzik
>>
>> ----------
>>
>> #include <iostream>
>>
>> #include <mpi.h>
>>
>> int main(int argc, const char* argv[])
>> {
>>    MPI_File fh;
>>    MPI_Status status;
>>
>>    int mpierr;
>>    char mpistr[MPI_MAX_ERROR_STRING];
>>    int mpilen;
>>    int numProc;
>>    int procID;
>>    MPI_Init(&argc, const_cast<char***>(&argv));
>>    MPI_Comm_size(MPI_COMM_WORLD, &numProc);
>>    MPI_Comm_rank(MPI_COMM_WORLD, &procID);
>>
>>    // Each process writes one byte: its rank as an ASCII character
>>    const int filesize = numProc;
>>    const int bufsize = filesize/numProc;  // 1 byte per process
>>    char *buf = new char[bufsize];
>>    buf[0] = (char)(48 + procID);  // 48 == '0'
>>    int numChars = bufsize/sizeof(char);
>>
>>    mpierr = MPI_File_open(MPI_COMM_WORLD, "dataio",
>>                           MPI_MODE_CREATE | MPI_MODE_WRONLY,
>>                           MPI_INFO_NULL, &fh);
>>    if (mpierr != MPI_SUCCESS)
>>      {
>>        MPI_Error_string(mpierr, mpistr, &mpilen);
>>        std::cout << "Error: " << mpistr << std::endl;
>>      }
>>    mpierr = MPI_File_write_at_all(fh, (MPI_Offset)(procID*bufsize), buf,
>>                                   numChars, MPI_CHAR, &status);
>>    if (mpierr != MPI_SUCCESS)
>>      {
>>        MPI_Error_string(mpierr, mpistr, &mpilen);
>>        std::cout << "Error: " << mpistr << std::endl;
>>      }
>>    mpierr = MPI_File_close(&fh);
>>    if (mpierr != MPI_SUCCESS)
>>      {
>>        MPI_Error_string(mpierr, mpistr, &mpilen);
>>        std::cout << "Error: " << mpistr << std::endl;
>>      }
>>
>>    delete[] buf;
>>    MPI_Finalize();
>>    return 0;
>> }
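>>
>> A simple way to see which bytes were left unwritten (the zeros
>> mentioned above) is to scan 'dataio' serially after the run; a minimal
>> sketch of such a check, separate from the MPI test itself:
>>
>> #include <fstream>
>> #include <iostream>
>>
>> int main()
>> {
>>    // Report any offset in "dataio" that still holds a zero byte,
>>    // i.e., a position that was never written by the MPI test above.
>>    std::ifstream file("dataio", std::ios::binary);
>>    char c;
>>    long offset = 0;
>>    while (file.get(c))
>>      {
>>        if (c == '\0')
>>          {
>>            std::cout << "zero byte at offset " << offset << std::endl;
>>          }
>>        ++offset;
>>      }
>>    return 0;
>> }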
>>

