It didn’t work any better with XFS, as it happens. Must be something else. I’m going to test some more and see if I can narrow it down, since it seems to me that it did work with a different compiler.
--
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |  Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'

> On Feb 18, 2019, at 12:23 PM, Gabriel, Edgar <egabr...@central.uh.edu> wrote:
>
> While I was working on something else, I let the tests run with Open MPI
> master (which is, for parallel I/O, equivalent to the upcoming v4.0.1
> release), and here is what I found for the HDF5 1.10.4 tests on my local
> desktop:
>
> In the testpar directory, there is in fact one test that fails for both
> ompio and romio321, in exactly the same manner. I used 6 processes as you
> did (although I used mpirun directly instead of srun...). Of the 13 tests
> in the testpar directory, 12 pass correctly (t_bigio, t_cache,
> t_cache_image, testphdf5, t_filters_parallel, t_init_term, t_mpi,
> t_pflush2, t_pread, t_prestart, t_pshutdown, t_shapesame).
>
> The one test that officially fails (t_pflush1) actually reports that it
> passed, but then throws a message indicating that MPI_Abort has been
> called, for both ompio and romio. I will try to investigate this test to
> see what is going on.
>
> That being said, your report shows an issue in t_mpi, which passes without
> problems for me. This was, however, not GPFS; this was an XFS local file
> system. Running the tests on GPFS is on my todo list as well.
>
> Thanks
> Edgar
>
>> -----Original Message-----
>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
>> Gabriel, Edgar
>> Sent: Sunday, February 17, 2019 10:34 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3
>>
>> I will also run our testsuite and the HDF5 testsuite on GPFS; I have had
>> access to a GPFS file system since recently, and will report back on
>> that, but it will take a few days.
>>
>> Thanks
>> Edgar
>>
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
>>> Ryan Novosielski
>>> Sent: Sunday, February 17, 2019 2:37 AM
>>> To: users@lists.open-mpi.org
>>> Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3
>>>
>>> This is on GPFS. I'll try it on XFS to see if it makes any difference.
>>>
>>> On 2/16/19 11:57 PM, Gilles Gouaillardet wrote:
>>>> Ryan,
>>>>
>>>> What filesystem are you running on?
>>>>
>>>> Open MPI defaults to the ompio component, except on the Lustre
>>>> filesystem, where ROMIO is used. If the issue is related to ROMIO,
>>>> that would explain why you did not see any difference; in that case,
>>>> you might want to try another filesystem (a local filesystem or NFS,
>>>> for example).
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On Sun, Feb 17, 2019 at 3:08 AM Ryan Novosielski
>>>> <novos...@rutgers.edu> wrote:
>>>>>
>>>>> I verified that it makes it through to a bash prompt, but I'm a
>>>>> little less confident that something "make test" does doesn't clear
>>>>> it. Any recommendation for a way to verify?
>>>>>
>>>>> In any case, no change, unfortunately.
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>> On Feb 16, 2019, at 08:13, Gabriel, Edgar <egabr...@central.uh.edu> wrote:
>>>>>>
>>>>>> What file system are you running on?
>>>>>>
>>>>>> I will look into this, but it might be later next week. I just
>>>>>> wanted to emphasize that we regularly run the parallel hdf5 tests
>>>>>> with ompio, and I am not aware of any outstanding items that do
>>>>>> not work (and are supposed to work). That being said, I run the
>>>>>> tests manually, and not the 'make test' commands. I will have to
>>>>>> check which tests are being run by that.
>>>>>>
>>>>>> Edgar
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
>>>>>>> Gilles Gouaillardet
>>>>>>> Sent: Saturday, February 16, 2019 1:49 AM
>>>>>>> To: Open MPI Users <users@lists.open-mpi.org>
>>>>>>> Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3
>>>>>>>
>>>>>>> Ryan,
>>>>>>>
>>>>>>> Can you
>>>>>>>
>>>>>>> export OMPI_MCA_io=^ompio
>>>>>>>
>>>>>>> and try again, after making sure this environment variable is
>>>>>>> passed by srun to the MPI tasks?
>>>>>>>
>>>>>>> We have identified and fixed several issues specific to the
>>>>>>> (default) ompio component, so that could be a valid workaround
>>>>>>> until the next release.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Gilles
>>>>>>>
>>>>>>> Ryan Novosielski <novos...@rutgers.edu> wrote:
>>>>>>>> Hi there,
>>>>>>>>
>>>>>>>> Honestly, I don't know which piece of this puzzle to look at or
>>>>>>>> how to get more information for troubleshooting. I successfully
>>>>>>>> built HDF5 1.10.4 with the RHEL system GCC 4.8.5 and OpenMPI
>>>>>>>> 3.1.3. Running "make check" in HDF5 fails at the point below; I
>>>>>>>> am using a value of
>>>>>>>> RUNPARALLEL='srun --mpi=pmi2 -p main -t 1:00:00 -n6 -N1'
>>>>>>>> and have a SLURM installation that is otherwise properly
>>>>>>>> configured.
>>>>>>>>
>>>>>>>> Thanks for any help you can provide.
>>>>>>>>
>>>>>>>> make[4]: Entering directory `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
>>>>>>>> ============================
>>>>>>>> Testing  t_mpi
>>>>>>>> ============================
>>>>>>>> t_mpi  Test Log
>>>>>>>> ============================
>>>>>>>> srun: job 84126610 queued and waiting for resources
>>>>>>>> srun: job 84126610 has been allocated resources
>>>>>>>> srun: error: slepner023: tasks 0-5: Alarm clock
>>>>>>>> 0.01user 0.00system 20:03.95elapsed 0%CPU (0avgtext+0avgdata 5152maxresident)k
>>>>>>>> 0inputs+0outputs (0major+1529minor)pagefaults 0swaps
>>>>>>>> make[4]: *** [t_mpi.chkexe_] Error 1
>>>>>>>> make[4]: Leaving directory `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
>>>>>>>> make[3]: *** [build-check-p] Error 1
>>>>>>>> make[3]: Leaving directory `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
>>>>>>>> make[2]: *** [test] Error 2
>>>>>>>> make[2]: Leaving directory `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
>>>>>>>> make[1]: *** [check-am] Error 2
>>>>>>>> make[1]: Leaving directory `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
>>>>>>>> make: *** [check-recursive] Error 1
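For reference, Gilles's suggested workaround can be sketched end to end as below. This is a sketch, not a tested recipe: the `--export=ALL` flag (which simply makes srun's default environment propagation explicit) is an assumption to verify against the local SLURM version, and the partition/time values are copied from Ryan's RUNPARALLEL setting.

```shell
# Disable the (default) ompio MPI-IO component, as Gilles suggests,
# so Open MPI falls back to ROMIO.
export OMPI_MCA_io=^ompio

# Confirm the variable is visible in the environment; under SLURM you
# would check the task environment instead, e.g.
#   srun --mpi=pmi2 -p main -t 1:00:00 -n6 -N1 env | grep OMPI_MCA_io
env | grep '^OMPI_MCA_io='

# Point HDF5's parallel test launcher at srun, explicitly exporting
# the caller's environment to the MPI tasks.
export RUNPARALLEL='srun --export=ALL --mpi=pmi2 -p main -t 1:00:00 -n6 -N1'
```

With these set, re-running "make check" in the testpar directory should exercise ROMIO rather than OMPIO.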
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users