Re: [OMPI users] compilation problem with ifort
-----Original Message-----
From: Elio Physics
Sent: Wednesday, September 3, 2014 6:48 PM
To: Open MPI Users
Reply-To: Open MPI Users
Subject: Re: [OMPI users] compilation problem with ifort

I have already done all of the steps you mentioned. I installed the older version of Quantum ESPRESSO, configured it, and followed all the steps on the EPW website when I got that error in the last step. In fact I do have the latest version of Quantum ESPRESSO, but since I work with electron-phonon interactions and EPW seemed really promising and less time consuming, I decided to give it a try.

The reason I asked my question in this forum is that I once had a similar "compiler" issue (not the same as this one), and when I asked on the Quantum ESPRESSO (QE) website, one of the answers was that it was not the right place since it was a compiler problem, not a QE issue; so I was really trying to avoid such answers. Anyhow, I guess you are absolutely right. I will try to e-mail the EPW people and explain the situation; after all, they should be able to help.
Thanks for your advice and time.

ELIO MOUJAES
University of Rondonia
Brazil

> Date: Wed, 3 Sep 2014 18:19:25 -0400
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] compilation problem with ifort
>
> It is hard to tell why, but the object files (yes, a2f.o, etc.)
> seem not to have been compiled from the corresponding source files
> (a2f.f90 or similar).
>
> In general the executable (your epw.x) is compiled only after all
> the prerequisite object files (the .o) and modules (the .mod)
> have been compiled already.
> In many cases there is not only one Makefile, but a chain/tree of
> them, to compile the code in the source directory tree (under src).
>
> Also, it is a bit awkward that you don't seem to have configured
> anything, i.e., telling where MPI was installed, etc.,
> but that may just not be in your email.
>
> Phonons is not my league, just trying to help, but afraid I may
> not take you in the right direction.
>
> Did you do the installation as per the EPW web site? (I just found it):
> http://epw.org.uk/Main/DownloadAndInstall
> It seems to require Quantum ESPRESSO.
> Did you get it, configure it, etc.?
>
> Do they have a mailing list or bulletin board where you could get
> specific help for their software?
> (Either on EPW or on Quantum ESPRESSO (which seems to be required):
> http://www.quantum-espresso.org/)
> That would probably be the right forum to ask your questions.
>
> My two cents,
> Gus Correa
>
> On 09/03/2014 05:51 PM, Elio Physics wrote:
> > This was the first error, yes. What do you mean other files are
> > missing? Do you mean the atom.o, basic_algebra_routines.o, ...?
> > Well, the f90 files present in the src subdirectory start from
> > a2f.f90, allocate_epwq.o, ... and so on. I am also not sure why
> > there is that backslash "\" just before the "a2f.o". Is there a
> > way to know what is going on? I mean, what are the first steps?
> >
> > Thank you
> >
> > ELIO MOUJAES
> > University of Rondonia
> > Brazil
> >
> > > Date: Wed, 3 Sep 2014 17:43:44 -0400
> > > From: g...@ldeo.columbia.edu
> > > To: us...@open-mpi.org
> > > Subject: Re: [OMPI users] compilation problem with ifort
> > >
> > > Was the error that you listed the *first* error?
> > >
> > > Apparently various object files are missing from the
> > > ../../Modules/ directory, and were not compiled,
> > > suggesting something is amiss even before the
> > > compilation of the executable (epw.x).
> > >
> > > On 09/03/2014 05:20 PM, Elio Physics wrote:
> > > > Dear all,
> > > >
> > > > I am really a beginner in Fortran and Linux. I was trying to
> > > > compile a software package (EPW). Everything was going fine
> > > > (or maybe this is what I think):
> > > >
> > > > mpif90 -o epw.x ../../Modules/atom.o
> > > > ../../Modules/basic_algebra_routines.o ../../Modules/cell_base.o
> > > > ../../Modules/check_stop.o ../../Modules/clocks.o
> > > > ../../Modules/constraints_module.o ../../Modules/control_flags.o
> > > > ../../Modules/descriptors.o ../../Modules/dspev_drv.o
> > > > ../../Modules/electrons_base.o ../../Modules/error_handler.o
[OMPI users] Can't read more than 2^31 bytes with MPI_File_read, regardless of type?
Hi:

One of our users is reporting trouble reading large files with MPI_File_read (or read_all). We've tried a few different type sizes to keep the count lower than 2^31, but the problem persists. A simple C program to test this is attached; we see it in both OpenMPI 1.4.4 and OpenMPI 1.6, with the only difference being the error code returned. We can read the amount of data required by looping over MPI_File_read()s, but in more complicated scenarios this gets awkward. I always thought that the 32-bit signed count limitation wasn't so bad because you could create larger data types to get around it, but that appears not to be the case here. Is this a known problem that we should just work around? Output from ompi_info --all for the 1.4.4 build is also attached.

OpenMPI 1.4.4:

Trying 268435457 of float, 1073741828 bytes: successfully read 268435457
Trying 536870913 of float, 2147483652 bytes: failed: err=35, MPI_ERR_IO: input/output error
Trying 134217729 of double, 1073741832 bytes: successfully read 134217729
Trying 268435457 of double, 2147483656 bytes: failed: err=35, MPI_ERR_IO: input/output error
Trying 67108865 of 2xdouble, 1073741840 bytes: successfully read 67108865
Trying 134217729 of 2xdouble, 2147483664 bytes: failed: err=35, MPI_ERR_IO: input/output error
Trying 524289 of 256xdouble, 1073743872 bytes: successfully read 524289
Trying 1048577 of 256xdouble, 2147485696 bytes: failed: err=35, MPI_ERR_IO: input/output error
Chunk 1/2: Trying 524288 of 256xdouble, chunked, 1073741824 bytes: successfully read 524288
Chunk 2/2: Trying 524289 of 256xdouble, chunked, 1073743872 bytes: successfully read 524289

OpenMPI 1.6:

Trying 268435457 of float, 1073741828 bytes: successfully read 268435457
Trying 536870913 of float, 2147483652 bytes: failed: err=13, MPI_ERR_ARG: invalid argument of some other kind
Trying 134217729 of double, 1073741832 bytes: successfully read 134217729
Trying 268435457 of double, 2147483656 bytes: failed: err=13, MPI_ERR_ARG: invalid argument of some other kind
Trying 67108865 of 2xdouble, 1073741840 bytes: successfully read 67108865
Trying 134217729 of 2xdouble, 2147483664 bytes: failed: err=13, MPI_ERR_ARG: invalid argument of some other kind
Trying 524289 of 256xdouble, 1073743872 bytes: successfully read 524289
Trying 1048577 of 256xdouble, 2147485696 bytes: failed: err=13, MPI_ERR_ARG: invalid argument of some other kind
Chunk 1/2: Trying 524288 of 256xdouble, chunked, 1073741824 bytes: successfully read 524288
Chunk 2/2: Trying 524289 of 256xdouble, chunked, 1073743872 bytes: successfully read 524289

- Jonathan
-- 
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int tryToRead(const MPI_File file, const MPI_Datatype type, const int count,
              const size_t size, const char *typename, void *buf) {
    int ierr;
    MPI_Status status;
    size_t bufsize = (size_t)count * size;

    printf("Trying %d of %s, %lu bytes: ", count, typename, bufsize);
    ierr = MPI_File_read(file, buf, count, type, &status);
    if (!ierr) {
        int gotcount;
        MPI_Get_count(&status, type, &gotcount);
        printf("successfully read %d\n", gotcount);
    } else {
        char err[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(ierr, err, &len);
        printf("failed: err=%d, %s\n", ierr, err);
    }
    return ierr;
}

int tryToReadInChunks(const MPI_File file, const MPI_Datatype type,
                      const int count, const size_t size,
                      const char *typename, void *buf, int nchunks) {
    int ierr;
    int nsofar = 0;
    int chunksize = count / nchunks;
    char *cbuf = buf;

    for (int chunk = 0; chunk < nchunks; chunk++) {
        int thischunk = chunksize;
        if (chunk == nchunks-1) thischunk = count - nsofar;
        printf("Chunk %d/%d: ", chunk+1, nchunks);
        ierr = tryToRead(file, type, thischunk, size, typename,
                         &(cbuf[nsofar*size]));
        if (ierr) break;
        nsofar += thischunk;
    }
    return ierr;
}

int main(int argc, char *argv[]) {
    int count;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_File_open(MPI_COMM_WORLD, "/dev/zero", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    char *buf = malloc(((size_t)1 << 31) + 1024);
    if (buf == NULL) {
        printf("Malloc failed.\n");
        exit(-1);
    }

    /* integers */
    count = (1 << 28) + 1;
    tryToRead(fh, MPI_FLOAT, count, sizeof(float), "float", buf);
    count = (1 << 29) + 1;
    tryToRead(fh, MPI_FLOAT, count, sizeof(float), "float", buf);

    /* doubles */
    count = (1 << 27) + 1;
    tryToRead(fh, MPI_DOUBLE, count, sizeof(double), "double", buf);
    count = (1 << 28) + 1;
    tryToRead(fh, MPI_DOUBLE, count, sizeof(double), "double", buf);

    /* 2 x doubles */
    MPI_Datatype TwoDoubles;
    MPI_Type_contiguous(2, MPI_DOUBLE, &TwoDoubles);
    MPI_Type_
Re: [OMPI users] possible bug exercised by mpi4py
It seems like this might be an issue for gatherv and reduce_scatter as well.

- Jonathan
-- 
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
Re: [OMPI users] possible bug exercised by mpi4py
On 23 May 9:37PM, Jonathan Dursi wrote:
> On the other hand, it works everywhere if I pad the rcounts array with
> an extra valid value (0 or 1, or for that matter 783), or replace the
> allgatherv with an allgather.

... and it fails with 7 even where it worked (but succeeds with 8) if I pad rcounts with an extra invalid value which should never be read.

Should the recvcounts[] parameter check in allgatherv.c loop up to size = ompi_comm_remote_size(comm), as is done in alltoallv.c, rather than ompi_comm_size(comm)? That seems to avoid the problem.

- Jonathan
-- 
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
Re: [OMPI users] possible bug exercised by mpi4py
Fails for me with 1.4.3 with gcc, but works with intel; works with 1.4.4 with gcc or intel; fails with 1.5.5 with either. Succeeds with IntelMPI.

On the other hand, it works everywhere if I pad the rcounts array with an extra valid value (0 or 1, or for that matter 783), or replace the allgatherv with an allgather.

- Jonathan
-- 
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...
For what it's worth, 1.4.4 built with the Intel 12.1.0.233 compilers has been the default MPI at our centre for over a month and we haven't had any problems...

- Jonathan
-- 
Jonathan Dursi; SciNet, Compute/Calcul Canada

-----Original Message-----
From: Richard Walsh <richard.wa...@csi.cuny.edu>
Sender: users-boun...@open-mpi.org
Date: Tue, 20 Dec 2011 21:14:44
To: Open MPI Users <us...@open-mpi.org>
Reply-To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

All,

I have not heard anything back on the inquiry below, so I take it that no one has had any issues with Intel's latest compiler release, or perhaps has not tried it yet.

Thanks,

rbw

Richard Walsh
Parallel Applications and Systems Manager
CUNY HPC Center, Staten Island, NY
W: 718-982-3319
M: 612-382-4620

Right, as the world goes, is only in question between equals in power, while the strong do what they can and the weak suffer what they must. -- Thucydides, 400 BC

From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Richard Walsh [richard.wa...@csi.cuny.edu]
Sent: Friday, December 16, 2011 3:12 PM
To: Open MPI Users
Subject: [OMPI users] Latest Intel Compilers (ICS, version 12.1.0.233 Build 20110811) issues ...

All,

I am working through a stock rebuild of OpenMPI 1.5.4 and 1.4.4 with the most current compiler suites from both PGI and Intel:

1. PGI, Version 11.10
2. Intel, Version 12.1.0.233 Build 20110811

My 1.5.4 'config.log' header looks like this for Intel:

   ./configure CC=icc CXX=icpc F77=ifort FC=ifort --with-openib --prefix=/share/apps/openmpi-intel/1.5.4 --with-tm=/share/apps/pbs/11.1.0.111761

and this for PGI:

   ./configure CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 --with-openib --prefix=/share/apps/openmpi-pgi/1.5.4 --with-tm=/share/apps/pbs/11.1.0.111761

This configure line has been used successfully before. Configuration, build, and install for BOTH compilers seem to work OK; however, only the PGI-built 'mpicc' (for either 1.4.4 or 1.5.4) will compile my basic test program. The Intel 1.4.4 and 1.5.4 'mpicc' wrapper compilers produce an immediate segmentation fault:

[richard.walsh@bob pbs]$ ./compile_it
./compile_it: line 10: 19163 Segmentation fault /share/apps/openmpi-intel/1.5.4/bin/mpicc -o ./hello_mpi.exe hello_mpi.c

[richard.walsh@bob pbs]$ ./compile_it
./compile_it: line 10: 19515 Segmentation fault /share/apps/openmpi-intel/1.4.4/bin/mpicc -o ./hello_mpi.exe hello_mpi.c

This Intel stack is from the most recent release of their ICS, released in late October before SC11:

[richard.walsh@bob pbs]$ icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.0.233 Build 20110811
Copyright (C) 1985-2011 Intel Corporation. All rights reserved.

[richard.walsh@bob pbs]$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.0.233 Build 20110811
Copyright (C) 1985-2011 Intel Corporation. All rights reserved.

Has anyone else encountered this problem? Suggestions?

Thanks,

rbw

Richard Walsh
Parallel Applications and Systems Manager
CUNY HPC Center, Staten Island, NY
W: 718-982-3319
M: 612-382-4620

Right, as the world goes, is only in question between equals in power, while the strong do what they can and the weak suffer what they must. -- Thucydides, 400 BC

Change is in the Air - Smoking in Designated Areas Only in effect. <http://www.csi.cuny.edu/tobaccofree> Tobacco-Free Campus as of July 1, 2012.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] problems with parallel MPI-IO...
On 19 Jul 9:02AM, Alexander Beck-Ratzka wrote:
>       integer ierr, i, myrank, BUFSIZE, thefile, intsize
>       parameter (BUFSIZE=100)
>       integer buf(BUFSIZE)
>
>       do i=0,BUFSIZE
>          buf(i) = myrank*BUFSIZE + i
>          print*, 'i =', i, 'myrank =', myrank, 'buf(i)=', buf(i)
>       end do
> [...]
> When I am reading the data in again and print them out, I always have:
> buf(0)=0

If you compile your code with -check bounds and run it, you'll get an error pointing out that buf(0) is an illegal access; Fortran arrays start at 1, so that loop should run from 1 to BUFSIZE.

- Jonathan
-- 
Jonathan Dursi | SciNet, Compute/Calcul Canada
Re: [OMPI users] difference between single and double precision
On 2010-12-03, at 8:46AM, Jeff Squyres (jsquyres) wrote:
> Another option to try is to install the openmx drivers on your system
> and run Open MPI with mx support. This should be much better perf than
> tcp.

We've tried this on a big GigE cluster (in fact, Brice Goglin was playing with it on our system) -- it's not really an answer. It didn't work past a small number of nodes, and the performance gains were fairly small. IntelMPI's Direct Ethernet Transport did work on larger node counts, but again it was a pretty modest effect (a few percent decrease in pingpong latencies, no discernible bandwidth improvements).

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
SciNet, Compute/Calcul Canada
Re: [OMPI users] mpirun only works when -np <4 (Gus Correa)
Jeff Squyres wrote:
> Why wouldn't shared memory work right on Nehalem? (That is probably
> distressing for Mark, Matthew, and other Nehalem owners.)

To be clear, we don't know that this is a Nehalem-specific problem. I have definitely had this problem on Harpertown cores.

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] Programming Help needed
Hi, Amjad:

> [...] What I do is that I start non-blocking MPI communication at the
> partition boundary faces (faces shared between any two processors), and
> then start computing values on the internal/non-shared faces. When I
> complete this computation, I call WAITALL to ensure MPI communication
> completion. Then I do computation on the partition boundary faces
> (shared ones). This way I try to hide the communication behind
> computation. Is it correct?

As long as your numerical method allows you to do this (that is, you definitely don't need those boundary values to compute the internal values), then yes, this approach can hide some of the communication costs very effectively. The way I'd program this, if I were doing it from scratch, would be to do the usual blocking approach first (no one computes anything until all the faces are exchanged) and get that working, then break up the computation step into internal and boundary computations and make sure it still works, then change the messaging to isends/irecvs/waitalls and make sure it still works, and only then interleave the two.

> IMPORTANT: Secondly, if processor A shares 50 faces (on 50 or fewer
> elements) with another processor B, then it sends/recvs 50 different
> messages. So in general, if a processor has X faces shared with any
> number of other processors, it sends/recvs that many messages. Does
> this have "very much reduced" performance in comparison to the
> possibility that processor A sends/recvs a single bundled message
> (containing all 50 faces' data) to processor B? That would mean that
> in general a processor sends/recvs only as many messages as it has
> neighbouring processors: a single bundle/pack of messages to each
> neighbouring processor. Is there "quite a much difference" between
> these two approaches?

Your individual element faces that are being communicated are likely quite small. It is quite generally the case that bundling many small messages into large messages can significantly improve performance, as you avoid incurring the repeated latency costs of sending many messages. As always, though, the answer is "it depends", and the only way to know is to try it both ways. If you really do hide most of the communication cost with your non-blocking communications, then it may not matter too much.

In addition, if you don't know beforehand how much data you need to send/receive, then you'll need a handshaking step, which introduces more synchronization and may actually hurt performance, or you'll have to use MPI-2 one-sided communications. On the other hand, if this shared boundary doesn't change through the simulation, you could just figure out at start-up time how big the messages will be between neighbours and use that as the basis for the usual two-sided messages.

My experience is that there's an excellent chance you'll improve the performance by packing the little messages into fewer, larger messages.

Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] collective communications broken on more than 4 cores
On 2009-10-29, at 10:21AM, Vincent Loechner wrote:
>>> It seems that the calls to collective communication are not returning
>>> for some MPI processes, when the number of processes is greater than
>>> or equal to 5. It's reproducible, on two different architectures,
>>> with two different versions of OpenMPI (1.3.2 and 1.3.3). It was
>>> working correctly with OpenMPI version 1.2.7.
>>
>> Does it work if you turn off the shared memory transport layer; that
>> is,
>>    mpirun -n 6 -mca btl ^sm ./testmpi
>
> Yes it does, on both my configurations (AMD and Intel processors).
> So it seems that the shared memory synchronization process is broken.

Presumably that is this bug:

   https://svn.open-mpi.org/trac/ompi/ticket/2043

I also found by trial and error that increasing the number of fifos (e.g., -mca btl_sm_num_fifos 5 on a 6-processor job) apparently worked around the problem. But yes, something seems broken in OpenMPI shared memory transport with gcc 4.4.x.

Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] collective communications broken on more than 4 cores
On 2009-10-29, at 9:57AM, Vincent Loechner wrote:
> [...]
> It seems that the calls to collective communication are not returning
> for some MPI processes, when the number of processes is greater than
> or equal to 5. It's reproducible, on two different architectures, with
> two different versions of OpenMPI (1.3.2 and 1.3.3). It was working
> correctly with OpenMPI version 1.2.7.
> [...]
> GCC version:
> $ mpicc --version
> gcc (Ubuntu 4.4.1-4ubuntu7) 4.4.1

Does it work if you turn off the shared memory transport layer; that is,

   mpirun -n 6 -mca btl ^sm ./testmpi

?

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
[OMPI users] Looped Barrier/Sendrecv hangs with btl sml: OMP1.3.3, 1.3.2, gcc44, intel 11
Hi, Eugene:

Thanks for your efforts in reproducing this problem; glad to know it's not just us. I think our solution for now is just to migrate our users to MVAPICH2 and Intel MPI; these MPICH-based systems work for us and our users extremely reliably, and it just looks like OpenMPI isn't ready for real production use on our system.

- Jonathan

On 2009-09-24, at 4:16PM, Eugene Loh wrote:

> Jonathan Dursi wrote:
>> So to summarize:
>>
>> OpenMPI 1.3.2 + gcc 4.4.0, test problem with periodic (left neighbour
>> of proc 0 is proc N-1) Sendrecv()s:
>> - Default: always hangs in Sendrecv after a random number of iterations
>> - Turning off sm (-mca btl self,tcp): not observed to hang
>> - Using -mca btl_sm_num_fifos 5 (for a 6-task job): not observed to hang
>> - Using fewer than 5 fifos: hangs in Sendrecv after a random number of
>>   iterations, or in Finalize
>>
>> OpenMPI 1.3.3 + gcc 4.4.0, same test problem:
>> - Default: sometimes (~20% of the time) hangs in Sendrecv after a
>>   random number of iterations
>> - Turning off sm (-mca btl self,tcp): not observed to hang
>> - Using -mca btl_sm_num_fifos 5 (for a 6-task job): not observed to hang
>> - Using fewer than 5 fifos but more than 2: not observed to hang
>> - Using 2 fifos: sometimes (~20% of the time) hangs in Finalize or
>>   Sendrecv after a random number of iterations, but sometimes completes
>>
>> OpenMPI 1.3.2 + Intel 11.0 compilers:
>> We are seeing a problem which we believe to be related; ~1% of certain
>> single-node jobs hang; turning off sm or setting num_fifos to NP-1
>> eliminates this.
>
> I can reproduce this with just Barriers, which keeps the processes all
> in sync. So this has nothing to do with processes outrunning one
> another (which wasn't likely in the first place, given that you had
> Sendrecv calls). The problem is fickle; e.g., building OMPI with -g
> seems to make the problem go away.
>
> I did observe that the sm FIFO would fill up. That's weird, since there
> aren't ever a lot of in-flight messages. I tried adding a line of code
> that would make a process pause if it ever tried to write to a FIFO
> that seemed full. That pretty much made the problem go away. So I guess
> it's a memory coherency problem: the receive clears the FIFO, but the
> writer thinks it's congested.
>
> I tried all sorts of GCC compilers. The problem seems to set in with
> 4.4.0. I don't know what's significant about that. It requires moving
> to the 2.18 assembler, but I tried the 2.18 assembler with 4.3.3 and
> that worked okay. I'd think this has to do with GCC 4.4.x, but you say
> you see the problem with Intel compilers as well. Hmm. Maybe an OMPI
> problem that's better exposed with GCC 4.4.x?

-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] Random hangs using btl sm with OpenMPI 1.3.2/1.3.3 + gcc4.4?
OpenMPI 1.3.2 + Intel 11.0 compilers: we are seeing a problem which we believe to be related; ~1% of certain single-node jobs hang, and turning off sm or setting num_fifos to NP-1 eliminates this.

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] Random hangs using btl sm with OpenMPI 1.3.2/1.3.3 + gcc4.4?
No hangs. :-\

> The version of the compiler *usually* isn't significant, so gcc 4.x
> should be fine. Yes, the sm flow control issue was a significant fix,
> but the blocking nature of MPI_SENDRECV means that you shouldn't have
> run into the problems that were fixed (the main issues had to do with
> fast senders exhausting the resources of slow receivers -- but
> MPI_SENDRECV is synchronous, so the senders should always be matching
> the speed of the receivers).
>
> Just for giggles, what happens if you change
>
>    if (leftneighbour .eq. -1) then
>       leftneighbour = nprocs-1
>    endif
>    if (rightneighbour .eq. nprocs) then
>       rightneighbour = 0
>    endif
>
> to
>
>    if (leftneighbour .eq. -1) then
>       leftneighbour = MPI_PROC_NULL
>    endif
>    if (rightneighbour .eq. nprocs) then
>       rightneighbour = MPI_PROC_NULL
>    endif
>
> On Sep 21, 2009, at 5:09 PM, Jonathan Dursi wrote:
>> Continuing the conversation with myself: Google pointed me to Trac
>> ticket #1944, which spoke of deadlocks in looped collective
>> operations; there is no collective operation anywhere in this sample
>> code, but trying one of the suggested workarounds/clues -- setting
>> btl_sm_num_fifos to at least (np-1) -- seems to make things work
>> quite reliably, for both OpenMPI 1.3.2 and 1.3.3. That is, while
>>
>>    mpirun -np 6 -mca btl sm,self ./diffusion-mpi
>>
>> invariably hangs (at random-seeming numbers of iterations) with
>> OpenMPI 1.3.2 and sometimes hangs (maybe 10% of the time, again
>> seemingly randomly) with 1.3.3,
>>
>>    mpirun -np 6 -mca btl tcp,self ./diffusion-mpi
>>
>> or
>>
>>    mpirun -np 6 -mca btl_sm_num_fifos 5 -mca btl sm,self ./diffusion-mpi
>>
>> always succeeds, with (as one might guess) the second being much
>> faster...
>>
>> Jonathan
>
> -- 
> Jeff Squyres
> jsquy...@cisco.com

-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] Random hangs using btl sm with OpenMPI 1.3.2/1.3.3 + gcc4.4?
Continuing the conversation with myself: Google pointed me to Trac ticket #1944, which spoke of deadlocks in looped collective operations; there is no collective operation anywhere in this sample code, but trying one of the suggested workarounds/clues -- setting btl_sm_num_fifos to at least (np-1) -- seems to make things work quite reliably, for both OpenMPI 1.3.2 and 1.3.3. That is, while

   mpirun -np 6 -mca btl sm,self ./diffusion-mpi

invariably hangs (at random-seeming numbers of iterations) with OpenMPI 1.3.2 and sometimes hangs (maybe 10% of the time, again seemingly randomly) with 1.3.3,

   mpirun -np 6 -mca btl tcp,self ./diffusion-mpi

or

   mpirun -np 6 -mca btl_sm_num_fifos 5 -mca btl sm,self ./diffusion-mpi

always succeeds, with (as one might guess) the second being much faster...

Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
Re: [OMPI users] Random hangs using btl sm with OpenMPI 1.3.2/1.3.3 + gcc4.4?
I hate to repost, but I'm still stuck with the problem that, on a completely standard install with a standard gcc compiler, we're getting random hangs with a trivial test program when using the sm btl, and we still have no clues as to how to track down the problem.

We are using a completely standard build:

   ./configure --prefix=/scinet/gpc/mpi/openmpi/1.3.2-gcc-v4.4.0-ofed --with-openib
   make
   make install

with a config.log attached, generating a distribution with an ompi_info --all that is also attached.

The very trivial attached program, which just does a series of SENDRECVs rightwards through MPI_COMM_WORLD, hangs extremely reliably when run like so on an 8-core box:

   mpirun -np 6 -mca btl self,sm ./diffusion-mpi

The hanging seems to always occur within the first 500 or so iterations -- sometimes between the 10th and 20th, and sometimes not until the late 400s. The hangs occur both on a new dual-socket quad-core Nehalem box and on an older Harpertown machine. Running without sm, however, seems to work fine:

   mpirun -np 6 -mca btl self,tcp ./diffusion-mpi

never gives any problems.

Running with OpenMPI 1.3.3, built in the same way, gives the hangs significantly less frequently -- it hangs one time out of every ten or so. But obviously this is still far too often to deploy in a production environment.

Where should we be looking to track down this problem?
- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>

config.log.gz
Description: GNU Zip compressed data

                Package: Open MPI root@gpc-f101n001 Distribution
               Open MPI: 1.3.2
  Open MPI SVN revision: r21054
  Open MPI release date: Apr 21, 2009
               Open RTE: 1.3.2
  Open RTE SVN revision: r21054
  Open RTE release date: Apr 21, 2009
                   OPAL: 1.3.2
      OPAL SVN revision: r21054
      OPAL release date: Apr 21, 2009
           Ident string: 1.3.2
          MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.2)
             MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.2)
          MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.2)
              MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.2)
              MCA carto: file (MCA v2.0, API v2.0, Component v1.3.2)
          MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.2)
              MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.2)
        MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.2)
        MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.2)
                MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.2)
             MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.2)
          MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.2)
          MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: self (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.2)
               MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.2)
                 MCA io: romio (MCA v2.0, API v2.0, Component v1.3.2)
              MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.2)
              MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.2)
              MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.2)
                MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.2)
                MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.2)
                MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.2)
                MCA pml: v (MCA v2.0, API v2.0, Component v1.3.2)
                MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.2)
             MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.2)
                MCA btl: ofud (MCA v2.0, API v2.0, Component v1.3.2)
                MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2)
                MCA btl: self (MCA v2.0, API v2.0, Component v1.3.2)
                MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.2)
                MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.2)
               MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.2)
                MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.2)
                MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.2)
                MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.2)
                MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.2)
                MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.2)
                MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.2)
               MCA odls: default (MCA v2.0, API v2.0, Component v1.3.2)
                MCA ras: slurm (MC
Re: [OMPI users] Random-ish hangs using btl sm with OpenMPI 1.3.2 + gcc4.4?
One more piece of information. I wrote:

> Any suggestions? I notice a mention of `improved flow control in sm'
> in the ChangeLog for 1.3.3; is that a significant bugfix?

I've just built 1.3.3 again with gcc 4.4.0, and now the hangs are much less frequent -- if I run the attached sample program 10 times, it will be successful 9 or so times. But the hangs still occur.

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
[OMPI users] Random-ish hangs using btl sm with OpenMPI 1.3.2 + gcc4.4?
We have installed a couple of installations of OpenMPI 1.3.2 here, and we are having real problems with single-node jobs randomly hanging when using the shared memory BTL, particularly (but perhaps not only) when using the version compiled with gcc 4.4.0.

The very trivial attached program, which just does a series of SENDRECVs rightwards through MPI_COMM_WORLD, hangs extremely reliably when run like so on an 8-core box:

   mpirun -np 6 -mca btl self,sm ./diffusion-mpi

(The test example was based on a simple Fortran example of MPIing the 1d diffusion equation.) The hanging seems to always occur within the first 500 or so iterations -- sometimes between the 10th and 20th, and sometimes not until the late 400s. The hangs occur both on a new dual-socket quad-core Nehalem box and on an older Harpertown machine. Running without sm, however, seems to work fine:

   mpirun -np 6 -mca btl self,tcp ./diffusion-mpi

never gives any problems.

Any suggestions? I notice a mention of `improved flow control in sm' in the ChangeLog for 1.3.3; is that a significant bugfix?

- Jonathan
-- 
Jonathan Dursi <ljdu...@scinet.utoronto.ca>

      program diffuse
      implicit none

      include "mpif.h"

      integer nsteps
      parameter (nsteps = 15)
      integer step
      real a, b
      integer ierr
      integer mpistatus(MPI_STATUS_SIZE)
      integer nprocs, rank
      integer leftneighbour, rightneighbour
      integer tag

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

      leftneighbour = rank-1
      if (leftneighbour .eq. -1) then
         leftneighbour = nprocs-1
      endif
      rightneighbour = rank+1
      if (rightneighbour .eq. nprocs) then
         rightneighbour = 0
      endif

      tag = 1
      do step=1, nsteps
         call MPI_SENDRECV(a, 1, MPI_REAL, rightneighbour, &
     &        tag, &
     &        b, 1, MPI_REAL, leftneighbour, &
     &        tag, &
     &        MPI_COMM_WORLD, mpistatus, ierr)
         if ((rank .eq. 0) .and. (mod(step,10) .eq. 1)) then
            print *, 'Step = ', step
         endif
      enddo

      call MPI_FINALIZE(ierr)
      end