For giggles, try using MPI_STATUS_IGNORE (assuming you don't need to look at the status at all). See if that works for you.
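Something along these lines is what I mean (a quick, untested sketch -- run with at least 2 ranks):

   program recv_ignore
   implicit none
   include 'mpif.h'
   integer :: rank, sendval, buf, mpierr
   call MPI_INIT(mpierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, mpierr)
   sendval = 42
   if (rank == 0) then
      call MPI_SEND(sendval, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, mpierr)
   else if (rank == 1) then
      ! no status array at all: pass MPI_STATUS_IGNORE instead
      call MPI_RECV(buf, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, &
                    MPI_STATUS_IGNORE, mpierr)
      print *, 'rank', rank, 'received', buf
   end if
   call MPI_FINALIZE(mpierr)
   end program recv_ignore

If "rank" survives that, it points even more strongly at the status handling.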
Meaning: I wonder if we're computing the status size for Fortran incorrectly in the -i8 case...

On Oct 31, 2013, at 1:58 PM, Jim Parker <jimparker96...@gmail.com> wrote:

> Some additional info that may jog some solutions. Calls to MPI_SEND do not
> cause memory corruption; only calls to MPI_RECV do. Since the main
> difference is that MPI_RECV needs a "status" array and MPI_SEND does not,
> that seems to indicate to me that something is wrong with status.
>
> Also, I can run a C version of the helloWorld program with no errors.
> However, its int types are only 4 bytes. To send 8-byte integers, I define
> tempInt as long int and pass MPI_LONG as the type.
>
> @Jeff,
> I got a copy of the openmpi config.log. See attached.
>
> Cheers,
> --Jim
>
> On Wed, Oct 30, 2013 at 10:55 PM, Jim Parker <jimparker96...@gmail.com> wrote:
> Ok, all, where to begin...
>
> Perhaps I should start with the most pressing issue for me: I need 64-bit
> indexing.
>
> @Martin,
> You indicated that even if I get this up and running, the MPI library
> still uses signed 32-bit ints to count (your term), or index (my term),
> the recv-buffer lengths. More concretely, in a call to
>
>    MPI_Allgatherv(sendbuf, count, MPI_INTEGER, recvbuf, recvcounts,
>                   displs, MPI_INTEGER, MPI_COMM_WORLD, mpierr)
>
> count, recvcounts, and displs must be 32-bit integers, not 64-bit.
> Actually, all I need is displs to hold 64-bit values...
> If this is true, then compiling OpenMPI this way is not a solution. I'll
> have to restructure my code to gather the data in chunks of fewer than
> 2^31 elements...
> Not that it matters, but I'm not using DIRAC; mine is a custom code for
> circuit analysis.
>
> @Jeff,
> Interesting -- your runtime behavior shows a different error than mine.
> You have problems with the passed variable tempInt, which would make sense
> for the reasons you gave. However, my problem is that the local variable
> "rank" gets overwritten by a memory corruption after MPI_RECV is called.
>
> Re: config.log -- I will try to have the admin guy recompile tomorrow and
> see if I can get the log for you.
>
> BTW, I'm using the gcc 4.7.2 compiler suite on a Rocks 5.4 HPC cluster,
> with the options -m64 and -fdefault-integer-8.
>
> Cheers,
> --Jim
>
> On Wed, Oct 30, 2013 at 7:36 PM, Martin Siegert <sieg...@sfu.ca> wrote:
> Hi Jim,
>
> I have quite a bit of experience with compiling openmpi for dirac.
> Here is what I use to configure openmpi:
>
> ./configure --prefix=$instdir \
>             --disable-silent-rules \
>             --enable-mpirun-prefix-by-default \
>             --with-threads=posix \
>             --enable-cxx-exceptions \
>             --with-tm=$torquedir \
>             --with-wrapper-ldflags="-Wl,-rpath,${instdir}/lib" \
>             --with-openib \
>             --with-hwloc=$hwlocdir \
>             CC=gcc \
>             CXX=g++ \
>             FC="$FC" \
>             F77="$FC" \
>             CFLAGS="-O3" \
>             CXXFLAGS="-O3" \
>             FFLAGS="-O3 $I8FLAG" \
>             FCFLAGS="-O3 $I8FLAG"
>
> You need to set FC to either ifort or gfortran (those are the two
> compilers I have used) and set I8FLAG to -fdefault-integer-8 for gfortran
> or -i8 for ifort.
> Set torquedir to the directory where torque is installed ($torquedir/lib
> must contain libtorque.so) if you are running jobs under torque;
> otherwise, remove the --with-tm=... line.
> Set hwlocdir to the directory where you have hwloc installed. You may not
> need the --with-hwloc=... option because openmpi comes with a hwloc
> version (I have no experience with that because we install hwloc
> independently).
> Set instdir to the directory where you want to install openmpi.
> You may or may not need the --with-openib option, depending on whether
> you have an Infiniband interconnect.
>
> After configure/make/make install, the version compiled this way can be
> used with dirac without changing the dirac source code.
> (There is one caveat: you should make sure that all "count" variables in
> MPI calls in dirac are smaller than 2^31 - 1. I have run into a few cases
> where that is not true; this problem can be overcome by replacing the
> MPI_Allreduce calls in dirac with a wrapper that calls MPI_Allreduce
> repeatedly -- see the sketch below.)
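> A minimal sketch of what I mean by such a wrapper (untested and
> illustrative only; it assumes MPI_INTEGER data and MPI_SUM -- a general
> version would pass the datatype and op through):
>
>    subroutine big_allreduce_sum(buf, n, comm, ierr)
>      implicit none
>      include 'mpif.h'
>      integer, intent(in)    :: n, comm   ! default INTEGERs (8 bytes with I8FLAG)
>      integer, intent(inout) :: buf(n)
>      integer, intent(out)   :: ierr
>      integer, parameter :: chunk = 2**30 ! stays safely below 2**31 - 1
>      integer :: done, todo
>      done = 0
>      do while (done < n)
>         todo = min(chunk, n - done)
>         ! reduce one chunk in place; repeat until the whole buffer is done
>         call MPI_ALLREDUCE(MPI_IN_PLACE, buf(done+1), todo, &
>                            MPI_INTEGER, MPI_SUM, comm, ierr)
>         if (ierr /= MPI_SUCCESS) return
>         done = done + todo
>      end do
>    end subroutine big_allreduce_sum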
> This is what I use to setup dirac:
>
> export PATH=$instdir/bin:$PATH
> ./setup --prefix=$diracinstdir \
>         --fc=mpif90 \
>         --cc=mpicc \
>         --int64 \
>         --explicit-libs="-lmkl_intel_ilp64 -lmkl_sequential -lmkl_core"
>
> where $instdir is the directory where you installed openmpi, from above.
>
> I would never use an openmpi version compiled this way for anything other
> than dirac, though. I am not saying that it cannot work (at a minimum you
> need to compile your Fortran programs with the appropriate I8FLAG), but it
> is an unnecessary complication: I have not encountered a piece of software
> other than dirac that requires this.
>
> Cheers,
> Martin
>
> --
> Martin Siegert
> Head, Research Computing
> WestGrid/ComputeCanada Site Lead
> Simon Fraser University
> Burnaby, British Columbia
> Canada
>
> On Wed, Oct 30, 2013 at 06:00:56PM -0500, Jim Parker wrote:
> >
> > Jeff,
> > Here's what I know:
> > 1. Checked FAQs. Done
> > 2. Version 1.6.5
> > 3. config.log file has been removed by the sysadmin...
> > 4. ompi_info -a from the head node is attached as headnode.out
> > 5. N/A
> > 6. compute node info is attached as compute-x-yy.out
> > 7. As discussed, local variables are being overwritten after calls to
> >    MPI_RECV from Fortran code
> > 8. ifconfig output from the head node and the computes is attached as
> >    *-ifconfig.out
> >
> > Cheers,
> > --Jim
> >
> > On Wed, Oct 30, 2013 at 5:29 PM, Jeff Squyres (jsquyres)
> > <jsquy...@cisco.com> wrote:
> >
> > Can you send the information listed here:
> > http://www.open-mpi.org/community/help/
> >
> > On Oct 30, 2013, at 6:22 PM, Jim Parker <jimparker96...@gmail.com> wrote:
> > > Jeff and Ralph,
> > > Ok, I downshifted to a helloWorld example (attached); bottom line:
> > > after I hit the MPI_RECV call, my local variable (rank) gets borked.
> > >
> > > I have compiled with -m64 -fdefault-integer-8 and have even assigned
> > > kind=8 to the integers (which would be the preferred method in my
> > > case).
> > >
> > > Your help is appreciated.
> > >
> > > Cheers,
> > > --Jim
> > >
> > > On Wed, Oct 30, 2013 at 4:49 PM, Jeff Squyres (jsquyres)
> > > <jsquy...@cisco.com> wrote:
> > > On Oct 30, 2013, at 4:35 PM, Jim Parker <jimparker96...@gmail.com>
> > > wrote:
> > >
> > > > I have recently built a cluster that uses the 64-bit indexing
> > > > feature of OpenMPI, following the directions at
> > > > http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers
> > >
> > > That should be correct (i.e., passing -i8 in FFLAGS and FCFLAGS for
> > > OMPI 1.6.x).
> > >
> > > > My question is: what are the new prototypes for the MPI calls?
> > > > Specifically,
> > > > MPI_RECV
> > > > MPI_Allgatherv
> > >
> > > They're the same as they've always been.
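> > > For example, the Fortran binding of MPI_RECV is still just (a sketch
> > > from memory -- every argument except the buffer is a plain default
> > > INTEGER):
> > >
> > >    <type>  BUF(*)
> > >    INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM, IERROR
> > >    INTEGER STATUS(MPI_STATUS_SIZE)
> > >    CALL MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)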
> > > The magic is that the -i8 flag tells the compiler "make all Fortran
> > > INTEGERs 8 bytes, not (the default) 4." So Ralph's answer was correct
> > > in that all the MPI parameters are INTEGERs -- but you can tell the
> > > compiler that all INTEGERs are 8 bytes, not 4, and therefore get
> > > "large" integers.
> > >
> > > Note that this means that you need to compile your application with
> > > -i8, too. That will make *your* INTEGERs also be 8 bytes, and then
> > > you'll match what Open MPI is doing.
> > >
> > > > I'm curious because some of my local variables get killed (set to
> > > > null) upon my first call to MPI_RECV. Typically, this is due (in
> > > > Fortran) to someone not setting the 'status' variable to an
> > > > appropriate array size.
> > >
> > > If you didn't compile your application with -i8, this could well be
> > > because your application is treating INTEGERs as 4 bytes but OMPI is
> > > treating INTEGERs as 8 bytes. Nothing good can come from that.
> > >
> > > If you *did* compile your application with -i8 and you're seeing this
> > > kind of wonkiness, we should dig deeper and see what's going on.
> > >
> > > > My review of mpif.h and mpi.h seems to indicate that the functions
> > > > are defined as C int types; therefore, I assume, the coercion during
> > > > the compile makes the library support 64-bit indexing, i.e.,
> > > > int -> long int.
> > >
> > > FWIW: we actually define a type MPI_Fint; its actual type is
> > > determined by configure (int or long int, IIRC). When your Fortran
> > > code calls C, we use the MPI_Fint type for parameters, so it will be
> > > either a 4- or 8-byte integer type.
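> > > A quick way to sanity-check which size your application is actually
> > > using (a trivial, illustrative sketch):
> > >
> > >    program intsize
> > >    implicit none
> > >    integer :: i
> > >    ! prints 64 when compiled with -i8 / -fdefault-integer-8, else 32
> > >    print *, 'default INTEGER is', bit_size(i), 'bits'
> > >    end program intsize
> > >
> > > If that does not print 64, your application and an -i8-built Open MPI
> > > disagree about the size of an INTEGER.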
> > > --
> > > Jeff Squyres
> > > jsquy...@cisco.com
> > > For corporate legal information go to:
> > > http://www.cisco.com/web/about/doing_business/legal/cri/
> > >
> > > <mpi-test-64bit.tar.bz2>
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
>
> <openmpi-1.6.5.config.tar.gz>

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/