I'm afraid I have no idea, but hopefully someone else here with HDF5 experience can chime in?
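One thing that might help narrow it down in the meantime: the data errors you quoted
from the HDF5 parallel make check are all at offsets just under the 2 GB mark, so it
could be worth checking whether plain MPI-IO (no HDF5 at all) round-trips data past
2 GiB on that filesystem. Below is a minimal sketch in the spirit of Sample_mpio.c --
the file name, block size, and offsets are arbitrary choices for illustration, not
taken from any test suite:

/* big_offset_io.c - minimal MPI-IO check at offsets beyond 2 GiB.
 * Note: on a filesystem without sparse-file support (e.g. HFS+) the
 * test file will actually occupy a bit over 2 GiB on disk. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK (1024 * 1024)              /* 1 MiB written/read per rank */

int main(int argc, char **argv)
{
    int rank, nprocs, i, rc, nerrs = 0;
    char *wbuf, *rbuf;
    char fname[] = "bigtest.data";       /* illustrative file name */
    MPI_File fh;
    MPI_Offset base = (MPI_Offset)2048 * 1024 * 1024;   /* 2 GiB */
    MPI_Offset off;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    wbuf = malloc(BLOCK);
    rbuf = malloc(BLOCK);
    memset(wbuf, rank + 1, BLOCK);       /* per-rank fill pattern */

    rc = MPI_File_open(MPI_COMM_WORLD, fname,
                       MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    if (rc != MPI_SUCCESS) {
        fprintf(stderr, "rank %d: MPI_File_open failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* each rank writes its own 1 MiB block starting at the 2 GiB mark */
    off = base + (MPI_Offset)rank * BLOCK;
    MPI_File_write_at_all(fh, off, wbuf, BLOCK, MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_sync(fh);
    MPI_Barrier(MPI_COMM_WORLD);

    /* read the same block back and verify the pattern */
    MPI_File_read_at_all(fh, off, rbuf, BLOCK, MPI_BYTE, MPI_STATUS_IGNORE);
    for (i = 0; i < BLOCK; i++)
        if (rbuf[i] != (char)(rank + 1))
            nerrs++;

    printf("rank %d of %d: %d verification errors at offset %lld\n",
           rank, nprocs, nerrs, (long long)off);

    MPI_File_close(&fh);
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0)
        MPI_File_delete(fname, MPI_INFO_NULL);
    free(wbuf);
    free(rbuf);
    MPI_Finalize();
    return nerrs ? 1 : 0;
}

Something like "mpicc big_offset_io.c -o big_offset_io && mpirun -np 2 ./big_offset_io"
should do it. If that also reports mismatches with two or more ranks, the problem is in
the MPI-IO layer (or the filesystem) rather than in HDF5; if it comes back clean, the
HDF5 parallel tests themselves would be the next place to look.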
On Jan 17, 2014, at 9:03 AM, Ronald Cohen <rhco...@lbl.gov> wrote:

> Still a timely response, thank you. The particular problem I noted hasn't
> recurred; for reasons I will explain shortly I had to rebuild openmpi
> again, and this time Sample_mpio.c compiled and ran successfully from the
> start.
>
> But now my problem is getting parallel HDF5 to run. My first attempt to
> build HDF5 failed at the link stage because of unsatisfied externals from
> openmpi, and I deduced the problem was having built openmpi with
> --disable-static. So I rebuilt with --enable-static and --disable-dlopen
> (emulating a successful openmpi + hdf5 combination I had built on Snow
> Leopard). Once again openmpi passed its make checks, and as noted above
> the Sample_mpio.c test compiled and ran fine. The parallel hdf5 configure
> and make steps also ran successfully. But when I ran make check for hdf5,
> the serial tests passed and none of the parallel tests did -- over a
> million test failures! Error messages like:
>
> Proc 0: *** MPIO File size range test...
> --------------------------------
> MPI_Offset is signed 8 bytes integeral type
> MPIO GB file write test MPItest.h5
> MPIO GB file write test MPItest.h5
> MPIO GB file write test MPItest.h5
> MPIO GB file write test MPItest.h5
> MPIO GB file write test MPItest.h5
> MPIO GB file write test MPItest.h5
> MPIO GB file read test MPItest.h5
> MPIO GB file read test MPItest.h5
> MPIO GB file read test MPItest.h5
> MPIO GB file read test MPItest.h5
> proc 3: found data error at [2141192192+0], expect -6, got 5
> proc 3: found data error at [2141192192+1], expect -6, got 5
>
> And the specific errors reported -- which processor, which location, and
> the total number of errors -- change if I rerun make check.
>
> I've sent configure, make, and make check logs to the HDF5 help desk but
> haven't gotten a response.
>
> I am now configuring openmpi (still 1.7.4rc1) with:
>
> ./configure --prefix=/usr/local/openmpi CC=gcc CXX=g++ FC=gfortran
> F77=gfortran --enable-static --with-pic --disable-dlopen
> --enable-mpirun-prefix-by-default
>
> and configuring HDF5 (version 1.8.12) with:
>
> configure --prefix=/usr/local/hdf5/par CC=mpicc CFLAGS=-fPIC FC=mpif90
> FCFLAGS=-fPIC CXX=mpicxx CXXFLAGS=-fPIC --enable-parallel --enable-fortran
>
> This is the combination that worked for me on Snow Leopard (though with
> earlier versions of both openmpi and hdf5).
>
> If it matters, gcc is the stock one that comes with Mavericks' Xcode, and
> gfortran is 4.9.0.
>
> (I just noticed that the mpi fortran wrapper is now mpifort, but I also
> see that mpif90 is still there and is just a link to mpifort.)
>
> Any suggestions?
>
>
> On Fri, Jan 17, 2014 at 8:14 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Sorry for the delayed response - just getting back from travel. I don't
> know why you would get that behavior other than a race condition. Afraid
> that code path is foreign to me, but perhaps one of the folks in the
> MPI-IO area can respond.
>
>
> On Jan 15, 2014, at 4:26 PM, Ronald Cohen <rhco...@lbl.gov> wrote:
>
>> Update: I reconfigured with enable_io_romio=yes, and this time --
>> mostly -- the test using Sample_mpio.c passes.
>> Oddly, the very first time I tried it I got errors:
>>
>> % mpirun -np 2 sampleio
>> Proc 1: hostname=Ron-Cohen-MBP.local
>> Testing simple C MPIO program with 2 processes accessing file ./mpitest.data
>> (Filename can be specified via program argument)
>> Proc 0: hostname=Ron-Cohen-MBP.local
>> Proc 1: read data[0:1] got 0, expect 1
>> Proc 1: read data[0:2] got 0, expect 2
>> Proc 1: read data[0:3] got 0, expect 3
>> Proc 1: read data[0:4] got 0, expect 4
>> Proc 1: read data[0:5] got 0, expect 5
>> Proc 1: read data[0:6] got 0, expect 6
>> Proc 1: read data[0:7] got 0, expect 7
>> Proc 1: read data[0:8] got 0, expect 8
>> Proc 1: read data[0:9] got 0, expect 9
>> Proc 1: read data[1:0] got 0, expect 10
>> Proc 1: read data[1:1] got 0, expect 11
>> Proc 1: read data[1:2] got 0, expect 12
>> Proc 1: read data[1:3] got 0, expect 13
>> Proc 1: read data[1:4] got 0, expect 14
>> Proc 1: read data[1:5] got 0, expect 15
>> Proc 1: read data[1:6] got 0, expect 16
>> Proc 1: read data[1:7] got 0, expect 17
>> Proc 1: read data[1:8] got 0, expect 18
>> Proc 1: read data[1:9] got 0, expect 19
>> --------------------------------------------------------------------------
>> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
>> with errorcode 1.
>>
>> But when I reran the same mpirun command, the test was successful. And
>> after deleting the executable, recompiling, and running the same mpirun
>> command again, the test was also successful. Can someone explain that?
>>
>>
>> On Wed, Jan 15, 2014 at 1:16 PM, Ronald Cohen <rhco...@lbl.gov> wrote:
>> Aha. I guess I didn't know what the io-romio option does. If you look
>> at my config.log you will see my configure line included
>> --disable-io-romio. Guess I should change --disable to --enable.
>>
>> You seem to imply that the nightly build is stable enough that I should
>> probably switch to it rather than 1.7.4rc1. Am I reading between the
>> lines correctly?
>>
>>
>> On Wed, Jan 15, 2014 at 10:56 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Oh, a word of caution on those config params - you might need to check
>> to ensure I don't disable romio in them. I don't normally build it, as
>> I don't use it. Since that is what you are trying to use, just change
>> the "no" to "yes" (or delete that line altogether) and it will build.
>>
>>
>> On Wed, Jan 15, 2014 at 10:53 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> You can find my configure options in the OMPI distribution at
>> contrib/platform/intel/bend/mac. You are welcome to use them - just
>> configure --with-platform=intel/bend/mac
>>
>> I work on the developer's trunk, of course, but also run the head of the
>> 1.7.4 branch (essentially the nightly tarball) on a fairly regular basis.
>>
>> As for the opal_bitmap test: it wouldn't surprise me if that one was
>> stale. I can check on it later tonight, but I'd suspect that the test
>> is bad, as we use that class in the code base and haven't seen an issue.
>>
>>
>> On Wed, Jan 15, 2014 at 10:49 AM, Ronald Cohen <rhco...@lbl.gov> wrote:
>> Ralph,
>>
>> I just sent out another post with the C file attached.
>>
>> If you can get that to work -- and even if you can't -- can you tell me
>> what configure options you use, and what version of open-mpi? Thanks.
>>
>> Ron
>>
>>
>> On Wed, Jan 15, 2014 at 10:36 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> BTW: could you send me your sample test code?
>>
>>
>> On Wed, Jan 15, 2014 at 10:34 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> I regularly build on Mavericks and run without problem, though I haven't
>> tried a parallel IO app. I'll give yours a try later, when I get back to
>> my Mac.
>>
>>
>> On Wed, Jan 15, 2014 at 10:04 AM, Ronald Cohen <rhco...@lbl.gov> wrote:
>> I have been struggling to get a usable build of openmpi on Mac OS X
>> Mavericks (10.9.1). I can get openmpi to configure and build without
>> error, but have problems after that which depend on the openmpi version.
>>
>> With 1.6.5, make check fails the opal_datatype_test, ddt_test, and
>> ddt_raw tests. The various atomic_* tests pass. See checklogs_1.6.5,
>> attached as a .gz file.
>>
>> Following suggestions from openmpi discussions I tried openmpi version
>> 1.7.4rc1. In this case make check indicates all tests passed. But when
>> I proceeded to try to build a parallel code (parallel HDF5), it failed.
>> Following an email exchange with the HDF5 support people, they suggested
>> I compile and run the attached bit of simple code, Sample_mpio.c (which
>> they supplied), which does not use any HDF5 but just attempts a parallel
>> write to a file and a parallel read. That test failed when requesting
>> more than 1 processor -- which they say indicates a failure of the
>> openmpi installation. The error message was:
>>
>> MPI_INIT: argc 1
>> MPI_INIT: argc 1
>> Testing simple C MPIO program with 2 processes accessing file ./mpitest.data
>> (Filename can be specified via program argument)
>> Proc 0: hostname=Ron-Cohen-MBP.local
>> Proc 1: hostname=Ron-Cohen-MBP.local
>> MPI_BARRIER[0]: comm MPI_COMM_WORLD
>> MPI_BARRIER[1]: comm MPI_COMM_WORLD
>> Proc 0: MPI_File_open with MPI_MODE_EXCL failed (MPI_ERR_FILE: invalid file)
>> MPI_ABORT[0]: comm MPI_COMM_WORLD errorcode 1
>> MPI_BCAST[1]: buffer 7fff5a483048 count 1 datatype MPI_INT root 0 comm
>> MPI_COMM_WORLD
>>
>> I then went back to my openmpi directories and tried running some of the
>> individual tests in the test and examples directories. In particular, in
>> test/class I found one test that seems not to be run as part of make
>> check and which failed, even with one processor: opal_bitmap. Not sure
>> if this is because 1.7.4rc1 is incomplete, or there is something wrong
>> with the installation, or maybe a 32 vs 64 bit thing? The error message
>> is:
>>
>> mpirun detected that one or more processes exited with non-zero status,
>> thus causing the job to be terminated. The first process to do so was:
>>
>>   Process name: [[48805,1],0]
>>   Exit code: 255
>>
>> Any suggestions?
>>
>> More generally, has anyone out there gotten an openmpi build on
>> Mavericks to work with sufficient success that they can get the attached
>> Sample_mpio.c (or better yet, parallel HDF5) to build?
>>
>> Details: Running Mac OS X 10.9.1 on a mid-2009 MacBook Pro with 4 GB of
>> memory; tried openmpi 1.6.5 and 1.7.4rc1. Built openmpi against the
>> stock gcc that comes with Xcode 5.0.2, and gfortran 4.9.0.
>>
>> Files attached: config.log.gz, openmpialllog.gz (output of running
>> ompi_info --all), and checklog2.gz (output of make check in the top
>> openmpi directory).
>>
>> I am not attaching logs of make and install since those seem to have
>> been successful, but can generate those if that would be helpful.

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users