Re: [OMPI users] Multi-threading with OpenMPI ?
Thank you Dick for your detailed reply. I am sorry, could you explain more what you meant by "unless you are calling MPI_Comm_spawn on a single task communicator you would need to have a different input communicator for each thread that will make an MPI_Comm_spawn call"? I am confused by the term "single task communicator".

Best Regards,
umanga

Richard Treumann wrote:

It is dangerous to hold a local lock (like a mutex) across a blocking MPI call unless you can be 100% sure that everything that must happen remotely will be completely independent of what is done with local locks & communication dependencies on other tasks.

It is likely that an MPI_Comm_spawn call in which the spawning communicator is MPI_COMM_SELF would be safe to serialize with a mutex. But be careful and do not view this as an approach to making MPI applications thread safe in general. Also, unless you are calling MPI_Comm_spawn on a single task communicator, you would need to have a different input communicator for each thread that will make an MPI_Comm_spawn call. MPI requires that collective calls on a given communicator be made in the same order by all participating tasks.

If there are two or more tasks making the MPI_Comm_spawn call collectively from multiple threads (even with per-thread input communicators), then using a local lock this way is pretty sure to deadlock at some point. Say task 0 serializes spawning threads as A then B, and task 1 serializes them as B then A. The job will deadlock because task 0 cannot free its lock for thread A until task 1 makes the spawn call for thread A as well. That will never happen if task 1 is stuck in a lock that will not release until task 0 makes its call for thread B.

When you look at the code for a particular task and consider thread interactions within the task, the use of the lock looks safe. It is only when you consider the dependencies on what other tasks are doing that the danger becomes clear.
This particular case is pretty easy to see, but sometimes, when there is a temptation to hold a local mutex across a blocking MPI call, the chain of dependencies that can lead to deadlock becomes very hard to predict.

BTW - maybe this is obvious, but you also need to protect the logic which calls MPI_Init_thread to make sure you do not have a race in which two threads each race to test the flag for whether MPI_Init_thread has already been called. If two threads do:

1) if (MPI_Inited_flag == FALSE) {
2)     set MPI_Inited_flag
3)     MPI_Init_thread
4) }

you have a couple of race conditions:

1) Two threads may both try to call MPI_Init_thread if one thread tests "if (MPI_Inited_flag == FALSE)" while the other is between statements 1 & 2.

2) If some thread tests "if (MPI_Inited_flag == FALSE)" while another thread is between statements 2 and 3, that thread could assume MPI_Init_thread is done and make the MPI_Comm_spawn call before the thread that is trying to initialize MPI manages to do it.

Dick

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363

users-boun...@open-mpi.org wrote on 09/17/2009 11:36:48 PM:

> Re: [OMPI users] Multi-threading with OpenMPI ?
> Ralph Castain, to: Open MPI Users, 09/17/2009 11:37 PM
>
> Only thing I can suggest is to place a thread lock around the call to
> comm_spawn so that only one thread at a time can execute that
> function. The call to mpi_init_thread is fine - you just need to
> explicitly protect the call to comm_spawn.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
[OMPI users] (no subject)
Dear sir, I am sending the details as follows:

1. I am using openmpi-1.3.3 and blcr 0.8.2.
2. I installed blcr 0.8.2 first under /root/MS.
3. Then I installed openmpi 1.3.3 under /root/MS.
4. I configured and installed Open MPI as follows:

#./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib
# make
# make install

Then I added the following to the .bash_profile under the home directory (I went to the home directory with cd ~):

/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
PATH=$PATH:/usr/local/bin
MANPATH=$MANPATH:/usr/local/man
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

Then I compiled and ran the file arr_add.c as follows:

[root@localhost examples]# mpicc -o res arr_add.c
[root@localhost examples]# mpirun -np 2 -am ft-enable-cr ./res
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
--
Error: The process with PID 5790 is not checkpointable.
This could be due to one of the following:
 - An application with this PID doesn't currently exist
 - The application with this PID isn't checkpointable
 - The application with this PID isn't an OPAL application.
We were looking for the named files:
 /tmp/opal_cr_prog_write.5790
 /tmp/opal_cr_prog_read.5790
--
[localhost.localdomain:05788] local) Error: Unable to initiate the handshake with peer [[7788,1],1].
-1
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NOTE: the PID of mpirun is 5788.

I gave the following command for taking the checkpoint:

[root@localhost examples]# ompi-checkpoint -s 5788

I got the following output, but it was hanging like this:

[localhost.localdomain:05796] Requested - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Pending - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Running - Global Snapshot Reference: (null)

Kindly rectify it.

With regards,
mallikarjuna shastry
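One detail worth checking in the .bash_profile additions above: PATH, MANPATH, and LD_LIBRARY_PATH are assigned without export, and LD_LIBRARY_PATH in particular may not be exported by default, so child processes such as mpirun and ompi-checkpoint might not inherit it. A sketch of the same additions with exports (the BLCR module paths are the poster's own; whether this affects the checkpoint failure is not certain):

```shell
# Load the BLCR kernel modules (paths as installed by the poster)
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko

# export so that child processes (mpirun, ompi-checkpoint) inherit them
export PATH=$PATH:/usr/local/bin
export MANPATH=$MANPATH:/usr/local/man
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
```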
Re: [OMPI users] Multi-threading with OpenMPI ?
MPI_COMM_SELF is one example. The only task it contains is the local task. The other case I had in mind is where there is a master doing all spawns. The master is launched as an MPI "job" but it has only one task. In that master, even MPI_COMM_WORLD is what I called a "single task communicator". Because the spawn call is "collective" across only one task in this case, it does not have the same sort of dependency on what other tasks do.

I think it is common for a single-task master to have responsibility for all spawns in the kind of model yours sounds like. I did not study the conversation enough to know if you are doing all spawn calls from a "single task communicator", and I was trying to give a broadly useful explanation.

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363

users-boun...@open-mpi.org wrote on 09/25/2009 02:59:04 AM:

> Re: [OMPI users] Multi-threading with OpenMPI ?
> Ashika Umanga Umagiliya, to: Open MPI Users, 09/25/2009 03:00 AM
>
> Thank you Dick for your detailed reply,
>
> I am sorry, could you explain more what you meant by "unless you are
> calling MPI_Comm_spawn on a single task communicator you would need
> to have a different input communicator for each thread that will
> make an MPI_Comm_spawn call", i am confused with the term "single
> task communicator"
>
> Best Regards,
> umanga
[OMPI users] "Failed to find the following executable" problem under Torque
I'm having a problem running OpenMPI under Torque. It complains as if there were a command syntax problem, but the three variations below are all correct, as best I can tell using mpirun -help. The environment in which the command executes, i.e. PATH and LD_LIBRARY_PATH, is correct. Torque is 2.3.x, OpenMPI is 1.2.8, OFED is 1.4.

Somewhere in the FAQ I had read that you must not give -machinefile under Torque with OpenMPI 1.2.8 and that you did not need to give -np. That's why I tried variation 3 below without either of these options, but it still fails. Thanks for any help.

1)
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -np 28 /tmp/43.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/43.fwnaeglingio/restart.0
--
Failed to find the following executable:
Host: n8n26
Executable: -p
Cannot continue.

2)
mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/45.fwnaeglingio -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 --mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT /tmp/45.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/45.fwnaeglingio/restart.0
--
Failed to find or execute the following executable:
Host: n8n27
Executable: --prefix /usr/mpi/intel/openmpi-1.2.8
Cannot continue.

3)
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/47.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/47.fwnaeglingio/restart.0
--
Failed to find the following executable:
Host: n8n27
Executable: -
Cannot continue.
[OMPI users] segfault on finalize
Hi,

I'm using r21970 of the trunk on Linux 2.6.18-3-amd64 with gcc version 4.2.3 (Debian 4.2.3-2). When I compile Open MPI with the default options, it works. But if I use the --with-platform=optimized option, then I get a segfault for every program I run:

==3073== Access not within mapped region at address 0x30
==3073==    at 0x535544D: mca_base_param_finalize (in /home/tropars/open-mpi/install/lib/libopen-pal.so.0.0.0)
==3073==    by 0x5339D55: opal_finalize_util (in /home/tropars/open-mpi/install/lib/libopen-pal.so.0.0.0)
==3073==    by 0x4E5A228: ompi_mpi_finalize (in /home/tropars/open-mpi/install/lib/libmpi.so.0.0.0)
==3073==    by 0x400BF2: main (in /home/tropars/open-mpi/tests/ring)

Regards,
Thomas
[OMPI users] Help tracing casue of readv errors
One of my users recently reported random hangs of his OpenMPI application. I've run some tests using multiple 2-node 16-core runs of the IMB benchmark and can occasionally replicate the problem. Looking through the mail archive, a previous occurrence of this error seems to have been down to suspect code, but as it's IMB failing here, I suspect the problem lies elsewhere. The full set of errors generated by a failed run is:

[lancs2-015][[37376,1],2][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],6][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],8][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],14][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],14][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],4][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],4][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],2][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],6][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],0][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],12][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],4][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],12][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],2][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],10][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],8][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[lancs2-015][[37376,1],6][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)

I'm used to OpenMPI terminating cleanly, but that's not happening in this case. All the OpenMPI processes on one node terminate, while the processes on the other simply spin at 100% CPU utilisation. I've run this 2-node test a number of times and I'm not seeing any pattern (i.e., I can't pin it down to a single node - a subsequent run using the two nodes involved above ran fine). Can anyone provide any pointers in tracking down this problem?

System details as follows:
- OpenMPI 1.3.3, compiled with gcc version 4.1.2 20080704 (Red Hat 4.1.2-44), using only the -prefix and -with-sge options.
- OS is Scientific Linux SL release 5.3
- CPUs are 2.3GHz Opteron 2356

Regards,
Mike.

-
Dr Mike Pacey, Email: m.pa...@lancaster.ac.uk
High Performance Systems Support, Phone: 01524 593543
Information Systems Services, Fax: 01524 594459
Lancaster University, Lancaster LA1 4YW