Dear FORUM members,

Does anyone use the MPIPOSIX driver? We are thinking about retiring the feature in the near future (1.8.13?).
Thank you!
Elena

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal
The HDF Group
http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Jan 3, 2014, at 10:50 AM, Mohamad Chaarawi <[email protected]> wrote:

> Hi Boyan,
>
> Thank you for reporting this. I’ll definitely look into it, but I wanted to
> give you a quick reply before doing so.
>
> You should not use the MPIPOSIX driver for parallel I/O (unless there is a
> very compelling reason to do so); use the MPIO driver instead, as it gives
> much better performance. I think you are just reporting the make check
> failure you were getting, but I wanted to make sure you use the better
> option.
>
> Did you see this issue only with the Intel compilers? That is, have you
> tried other compilers and seen the same error?
>
> You are correct about the bug in t_dset.c where comm_size should be
> comm_rank. Fortunately, neither the size nor the rank is used in this
> function, so I will just take them out for the next release.
>
> Thanks,
> Mohamad
>
> From: Hdf-forum [mailto:[email protected]] On Behalf Of Boyan Bejanov
> Sent: Thursday, January 02, 2014 6:11 PM
> To: [email protected]
> Cc: Boyan Bejanov
> Subject: [Hdf-forum] Problem with MPIPOSIX driver
>
> Hello,
>
> I am trying to build parallel HDF5 1.8.12 on RHEL 6.4 using the Intel
> compilers (C icc and Fortran ifort) and Intel MPI 4.1.0 (both included in
> Intel Cluster Studio XE 2013). Everything configured and compiled okay, but
> I am getting failures when running “make check”, specifically in
> “testpar/testphdf5”. The error message is the following:
>
> ...
> Testing -- test cause for broken collective io (nocolcause)
> Testing -- test cause for broken collective io (nocolcause)
> Testing -- test cause for broken collective io (nocolcause)
> Testing -- test cause for broken collective io (nocolcause)
> Testing -- test cause for broken collective io (nocolcause)
> Testing -- test cause for broken collective io (nocolcause)
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> Fatal error in PMPI_Barrier: Invalid communicator, error stack:
> PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
> PMPI_Barrier(903): Invalid communicator
> ...
>
> I have managed to narrow it down: this error is generated when the MPIPOSIX
> driver is used (I think it is in the call to H5Pset_fapl_mpiposix() itself).
> The “nocolcause” test is in the function “no_collective_cause_tests” in the
> file “testpar/t_dset.c”. When I comment out the two lines in this function
> that set the “TEST_SET_MPIPOSIX” flag, thus skipping the MPIPOSIX test,
> everything else checks successfully.
>
> I am very new to HDF5, and to parallel I/O in general, so any help with
> this issue will be much appreciated.
>
> Thanks,
> Boyan
>
> PS: While poking around, I found an inconsequential bug in testpar/t_dset.c
> on line 3681 (in the same function): the line is
>     MPI_Comm_size(MPI_COMM_WORLD, &mpi_rank);
> while it should be
>     MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);
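For readers who, like Boyan, still have H5Pset_fapl_mpiposix() in their code, a minimal sketch of Mohamad's suggestion (switching to the MPIO driver) might look like the following. The file name example.h5 and the choice of MPI_COMM_WORLD with MPI_INFO_NULL are placeholders for illustration, not details taken from the thread:

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* File access property list selecting the MPI-IO driver,
           the recommended replacement for the MPIPOSIX driver. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        /* Older 1.8.x code would have called instead:
           H5Pset_fapl_mpiposix(fapl, MPI_COMM_WORLD, 0); */

        /* Create the file collectively; all ranks must make this call. */
        hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        H5Fclose(file);
        H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }

Build it against a parallel HDF5 installation with the MPI compiler wrapper (for example, mpicc or h5pcc); datasets opened through this file can then use the collective transfer property lists exercised by testpar/testphdf5.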
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
