Hi Boyan,

Thank you for reporting this. I’ll definitely look into it, but I wanted to get a quick reply out before doing so.
You should not use the MPIPOSIX driver for parallel I/O (unless there is a very compelling reason to do so); you should be using the MPIO driver, as you will get much better performance. I think you are just reporting the make check failure you were getting here, but I wanted to make sure you use the better option.

Did you see this issue only when using the Intel compilers, i.e., have you tried other compilers and seen the same error?

You are correct about the bug in t_dset.c with comm_size needing to be comm_rank. Fortunately, neither the size nor the rank is used in this function, so I will just take them out for the next release.

Thanks,
Mohamad

From: Hdf-forum [mailto:[email protected]] On Behalf Of Boyan Bejanov
Sent: Thursday, January 02, 2014 6:11 PM
To: [email protected]
Cc: Boyan Bejanov
Subject: [Hdf-forum] Problem with MPIPOSIX driver

Hello,

I am trying to build parallel HDF5 1.8.12 on RHEL 6.4 using the Intel compilers (icc for C and ifort for Fortran) and Intel MPI 4.1.0 (both included in Intel Cluster Studio XE 2013). Everything configured and compiled okay, but I am getting failures when running “make check”, specifically in “testpar/testphdf5”. The error message is the following:

...
Testing -- test cause for broken collective io (nocolcause)
Testing -- test cause for broken collective io (nocolcause)
Testing -- test cause for broken collective io (nocolcause)
Testing -- test cause for broken collective io (nocolcause)
Testing -- test cause for broken collective io (nocolcause)
Testing -- test cause for broken collective io (nocolcause)
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
Fatal error in PMPI_Barrier: Invalid communicator, error stack:
PMPI_Barrier(949): MPI_Barrier(comm=0x0) failed
PMPI_Barrier(903): Invalid communicator
...

I have managed to narrow it down: this error is generated when the MPIPOSIX driver is used (I think it is in the call to H5Pset_fapl_mpiposix() itself). The “nocolcause” test is in the function “no_collective_cause_tests” in file “testpar/t_dset.c”. When I comment out the two lines in this function that set the “TEST_SET_MPIPOSIX” flag, thus skipping the MPIPOSIX test, everything else checks successfully.

I am very new to HDF5, and to parallel I/O in general, so any help with this issue will be much appreciated.

Thanks,
Boyan

PS: While poking around in it, I found an inconsequential bug in testpar/t_dset.c on line 3681 (in the same function): the line is

MPI_Comm_size(MPI_COMM_WORLD, &mpi_rank);

while it should be

MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);
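For anyone hitting the same make check failure, the practical takeaway from the reply above is to select the MPIO driver with H5Pset_fapl_mpio() instead of H5Pset_fapl_mpiposix(). Below is a minimal sketch of that usage; the file name example.h5 and the overall program structure are illustrative only and not taken from the test suite. It also calls MPI_Comm_rank() as in the corrected line from the PS.

#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    int mpi_rank;

    MPI_Init(&argc, &argv);
    /* Rank query, matching the corrected call noted in the PS above. */
    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);

    /* Select the MPIO driver for parallel I/O (recommended over MPIPOSIX). */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    /* "example.h5" is a placeholder file name for illustration only. */
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... collective dataset creation and I/O would go here ... */

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}

Built against a parallel HDF5 installation (for example with the h5pcc wrapper) and run under mpiexec, this opens the file collectively on all ranks through the MPIO driver.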
