[OMPI users] MPI_Comm_spawn issues

2023-04-11 Thread Sergio Iserte via users
Hi, I am evaluating OpenMPI 5.0.0 and I am experiencing a race condition when spawning a different number of processes on different nodes. With: $ cat hostfile node00 node01 node02 node03 If I run this code: #include #include #include int main(int argc, char* argv[]){ MPI_Init(&a
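(A minimal sketch of the kind of cross-node spawn test described above, assuming a hostfile like the one shown; the child count and program structure are illustrative, not Sergio's actual code:)

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        MPI_Comm parent, intercomm;
        MPI_Init(&argc, &argv);
        MPI_Comm_get_parent(&parent);
        if (parent == MPI_COMM_NULL) {
            /* parent: ask the runtime to place 3 children across the
               hostfile nodes -- placement is where the race shows up */
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 3, MPI_INFO_NULL,
                           0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
        }
        MPI_Finalize();
        return 0;
    }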

[OMPI users] MPI_COMM_SPAWN() cannot spawn across nodes

2021-12-07 Thread Jarunan Panyasantisuk via users
Hi there, I have an issue in OpenMPI 4.0.2 and 4.1.1 where MPI_COMM_SPAWN() cannot spawn across nodes, although I could successfully use this function in OpenMPI 2.1.1. I am testing on a cluster with CentOS 7.9, the LSF batch system, and GCC 6.3.0. I used this code for testing (called it "spawn_examp

Re: [OMPI users] MPI_Comm_spawn: no allocated resources for the application ...

2020-03-16 Thread Ralph Castain via users
Sorry for the incredibly late reply. Hopefully, you have already managed to find the answer. I'm not sure what your comm_spawn command looks like, but it appears you specified the host in it using the "dash_host" info-key, yes? The problem is that this is interpreted the same way as the "-host

[OMPI users] MPI_Comm_spawn: no allocated resources for the application ...

2019-10-25 Thread Mccall, Kurt E. (MSFC-EV41) via users
I am trying to launch a number of manager processes, one per node, and then have each of those managers spawn, on its own same node, a number of workers. For this example, I have 2 managers and 2 workers per manager. I'm following the instructions at this link https://stackoverflow.com/questi
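(A hedged sketch of the pattern Kurt describes: each manager pins its workers to its own node via the standard "host" info key. The function and file names are illustrative:)

    #include <mpi.h>

    /* spawn nworkers on the calling manager's own node */
    void spawn_local_workers(int nworkers, MPI_Comm *workers) {
        char host[MPI_MAX_PROCESSOR_NAME];
        int len;
        MPI_Info info;
        MPI_Get_processor_name(host, &len);
        MPI_Info_create(&info);
        MPI_Info_set(info, "host", host);  /* request placement on this node */
        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, nworkers, info,
                       0, MPI_COMM_SELF, workers, MPI_ERRCODES_IGNORE);
        MPI_Info_free(&info);
    }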

Re: [OMPI users] MPI_Comm_Spawn failure: All nodes already filled

2019-08-07 Thread Ralph Castain via users
was read by OpenMPI. Is this correct? Thanks, Kurt. From: Ralph Castain <r...@open-mpi.org> Subject: [EXTERNAL] Re: [OMPI users] MPI_Comm_Spawn failure: All nodes already filled. I'm afraid I cannot replicate this problem on OMPI master, so it could be something different ab

[OMPI users] MPI_Comm_Spawn failure: All nodes already filled

2019-08-07 Thread Mccall, Kurt E. (MSFC-EV41) via users
: [OMPI users] MPI_Comm_Spawn failure: All nodes already filled I'm afraid I cannot replicate this problem on OMPI master, so it could be something different about OMPI 4.0.1 or your environment. Can you download and test one of the nightly tarballs from the "master" branch and see i

Re: [OMPI users] MPI_Comm_Spawn failure: All nodes already filled

2019-08-06 Thread Ralph Castain via users
I'm afraid I cannot replicate this problem on OMPI master, so it could be something different about OMPI 4.0.1 or your environment. Can you download and test one of the nightly tarballs from the "master" branch and see if it works for you? https://www.open-mpi.org/nightly/master/ Ralph On Au

[OMPI users] MPI_Comm_Spawn failure: All nodes already filled

2019-08-06 Thread Mccall, Kurt E. (MSFC-EV41) via users
Hi, MPI_Comm_spawn() is failing with the error message "All nodes which are allocated for this job are already filled". I compiled OpenMPI 4.0.1 with the Portland Group C++ compiler, v. 19.5.0, both with and without Torque/Maui support. I thought that not using Torque/Maui support would gi

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-17 Thread Jeff Hammond
Best wishes, > Thomas Pak > > *From: *"Thomas Pak" > *To: *users@lists.open-mpi.org > *Sent: *Friday, 7 December, 2018 17:51:29 > *Subject: *[OMPI users] MPI_Comm_spawn leads to pipe leak and other errors > > Dear all, > > My MPI application spawns a lar

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-17 Thread Gilles Gouaillardet
> To: Open MPI Users > Cc: Open MPI Users > Subject: Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors > > Dear Jeff, > > I did find a way to circumvent this issue for my specific application by > spawning less frequently. However, I wanted to at least b

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-17 Thread Riebs, Andy
Open MPI Users Subject: Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors Dear Jeff, I did find a way to circumvent this issue for my specific application by spawning less frequently. However, I wanted to at least bring attention to this issue for the OpenMPI community, as it can

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-16 Thread Ralph H Castain
that there is a fundamental flaw in how OpenMPI handles dynamic > process creation. > > Best wishes, > Thomas Pak > > From: "Thomas Pak" <thomas@maths.ox.ac.uk> > To: users@lists.open-mpi.org > Sent

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-16 Thread Thomas Pak
OpenMPI handles > > dynamic process creation. > > > > Best wishes, > > Thomas Pak > > > > > > From: "Thomas Pak" (thomas@maths.ox.ac.uk) > > To: users@lists.open-mpi.org > > Sent: F

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-16 Thread Jeff Hammond
dynamic process creation. > > Best wishes, > Thomas Pak > > -- > *From: *"Thomas Pak" > *To: *users@lists.open-mpi.org > *Sent: *Friday, 7 December, 2018 17:51:29 > *Subject: *[OMPI users] MPI_Comm_spawn leads to pipe leak and other errors > &

Re: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2019-03-16 Thread Thomas Pak
: "Thomas Pak" To: users@lists.open-mpi.org Sent: Friday, 7 December, 2018 17:51:29 Subject: [OMPI users] MPI_Comm_spawn leads to pipe leak and other errors Dear all, My MPI application spawns a large number of MPI processes using MPI_Comm_spawn over its total lifetime. Unfortunate

[OMPI users] MPI_Comm_spawn leads to pipe leak and other errors

2018-12-07 Thread Thomas Pak
Dear all, My MPI application spawns a large number of MPI processes using MPI_Comm_spawn over its total lifetime. Unfortunately, I have found that this causes problems in all currently supported OpenMPI versions (2.1, 3.0, 3.1 and 4.0). I have written a short, self-contained program
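(Thomas's actual reproducer is attached to the original mail; a minimal sketch of the pattern it describes, spawning and disconnecting children in a loop, might look like this:)

    #include <mpi.h>

    int main(int argc, char *argv[]) {
        MPI_Comm parent, child;
        MPI_Init(&argc, &argv);
        MPI_Comm_get_parent(&parent);
        if (parent != MPI_COMM_NULL) {
            MPI_Comm_disconnect(&parent);      /* child: detach and exit */
        } else {
            for (int i = 0; i < 100000; i++) { /* parent: spawn repeatedly */
                MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                               0, MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
                MPI_Comm_disconnect(&child);   /* pipes should be reclaimed here */
            }
        }
        MPI_Finalize();
        return 0;
    }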

Re: [OMPI users] MPI_Comm_spawn question

2017-02-04 Thread Gilles Gouaillardet
Andrew, the 2-second timeout is very likely a bug that was fixed, so I strongly suggest you try the latest 2.0.2, which was released earlier this week. Ralph is referring to another timeout which is hard-coded (FWIW, the MPI standard says nothing about timeouts, so we hardcoded one to preve

Re: [OMPI users] MPI_Comm_spawn question

2017-02-03 Thread r...@open-mpi.org
We know v2.0.1 has problems with comm_spawn, and so you may be encountering one of those. Regardless, there is indeed a timeout mechanism in there. It was added because people would execute a comm_spawn, and then would hang and eat up their entire allocation time for nothing. In v2.0.2, I see i

Re: [OMPI users] MPI_Comm_spawn question

2017-02-01 Thread elistratovaa
I am using Open MPI version 2.0.1.

Re: [OMPI users] MPI_Comm_spawn question

2017-01-31 Thread r...@open-mpi.org
What version of OMPI are you using? > On Jan 31, 2017, at 7:33 AM, elistrato...@info.sgu.ru wrote: > > Hi, > > I am trying to write a trivial master-slave program. The master simply creates > slaves, sends them a string; they print it out and exit. Everything works > just fine; however, when I add a d

[OMPI users] MPI_Comm_spawn question

2017-01-31 Thread elistratovaa
Hi, I am trying to write a trivial master-slave program. The master simply creates slaves, sends them a string; they print it out and exit. Everything works just fine; however, when I add a delay (more than 2 sec) before calling MPI_Init on the slave, MPI fails with MPI_ERR_SPAWN. I am pretty sure that MPI_

Re: [OMPI users] MPI_Comm_spawn

2016-09-29 Thread Cabral, Matias A
PSM_DEVICES -> TrueScale From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of r...@open-mpi.org Sent: Thursday, September 29, 2016 7:12 AM To: Open MPI Users Subject: Re: [OMPI users] MPI_Comm_spawn Ah, that may be why it wouldn’t show up in the OMPI code base itself. If that i

[OMPI users] MPI_Comm_spawn

2016-09-29 Thread juraj2...@gmail.com
The solution was to use the "tcp", "sm" and "self" BTLs for the transport of MPI messages, with TCP restricted to the eth0 interface and ob1 as the point-to-point messaging layer (PML): mpirun --mca btl_tcp_if_include eth0 --mca pml ob1 --mca btl tcp,sm,self -np 1 --hostfile my_hosts ./manager

Re: [OMPI users] MPI_Comm_spawn

2016-09-29 Thread r...@open-mpi.org
Ah, that may be why it wouldn’t show up in the OMPI code base itself. If that is the case here, then no - OMPI v2.0.1 does not support comm_spawn for PSM. It is fixed in the upcoming 2.0.2 > On Sep 29, 2016, at 6:58 AM, Gilles Gouaillardet > wrote: > > Ralph, > > My guess is that ptl.c comes

Re: [OMPI users] MPI_Comm_spawn

2016-09-29 Thread Gilles Gouaillardet
Ralph, My guess is that ptl.c comes from PSM lib ... Cheers, Gilles On Thursday, September 29, 2016, r...@open-mpi.org wrote: > Spawn definitely does not work with srun. I don’t recognize the name of > the file that segfaulted - what is “ptl.c”? Is that in your manager program? > > > On Sep 2

Re: [OMPI users] MPI_Comm_spawn

2016-09-29 Thread r...@open-mpi.org
Spawn definitely does not work with srun. I don’t recognize the name of the file that segfaulted - what is “ptl.c”? Is that in your manager program? > On Sep 29, 2016, at 6:06 AM, Gilles Gouaillardet > wrote: > > Hi, > > I do not expect spawn can work with direct launch (e.g. srun) > > Do y

Re: [OMPI users] MPI_Comm_spawn

2016-09-29 Thread Gilles Gouaillardet
Hi, I do not expect spawn can work with direct launch (e.g. srun). Do you have PSM (e.g. InfiniPath) hardware? That could be linked to the failure. Can you please try mpirun --mca pml ob1 --mca btl tcp,sm,self -np 1 --hostfile my_hosts ./manager 1 and see if it helps? Note if you have the poss

[OMPI users] MPI_Comm_spawn

2016-09-29 Thread juraj2...@gmail.com
Hello, I am using MPI_Comm_spawn to dynamically create new processes from a single manager process. Everything works fine when all the processes are running on the same node, but imposing the restriction to run only a single process per node does not work. Below are the errors produced during multinode

Re: [OMPI users] MPI_Comm_spawn and shared memory

2015-05-14 Thread Radoslaw Martyniszyn
Hi Gilles, Thanks for your answer. BR, Radek On Thu, May 14, 2015 at 9:12 AM, Gilles Gouaillardet wrote: > This is a known limitation of the sm btl. > > FWIW, the vader btl (available in Open MPI 1.8) has the same limitation, > though I heard there is some work in progress to get rid of this

Re: [OMPI users] MPI_Comm_spawn and shared memory

2015-05-14 Thread Gilles Gouaillardet
This is a known limitation of the sm btl. FWIW, the vader btl (available in Open MPI 1.8) has the same limitation, though I heard there is some work in progress to get rid of this limitation. Cheers, Gilles On 5/14/2015 3:52 PM, Radoslaw Martyniszyn wrote: Dear developers of Open MPI, I

[OMPI users] MPI_Comm_spawn and shared memory

2015-05-14 Thread Radoslaw Martyniszyn
Dear developers of Open MPI, I've created two applications: parent and child. The parent spawns children using MPI_Comm_spawn. I would like to use shared memory when they communicate. However, the applications do not start when I try using sm. Please comment on that issue. If this feature is not supported

Re: [OMPI users] mpi_comm_spawn question

2014-07-03 Thread Milan Hodoscek
> "George" == George Bosilca writes: George> Why are you using system() the second time ? As you want George> to spawn an MPI application calling MPI_Call_spawn would George> make everything simpler. Yes, this works! Very good trick... The system routine would be more flexible, b

Re: [OMPI users] mpi_comm_spawn question

2014-07-03 Thread George Bosilca
Why are you using system() the second time? As you want to spawn an MPI application, calling MPI_Comm_spawn would make everything simpler. George On Jul 3, 2014 4:34 PM, "Milan Hodoscek" wrote: > > Hi, > > I am trying to run the following setup in fortran without much > success: > > I have an MP

Re: [OMPI users] mpi_comm_spawn question

2014-07-03 Thread Ralph Castain
Unfortunately, that has never been supported. The problem is that the embedded mpirun picks up all those MCA params that were provided to the original application process, and gets hopelessly confused. We have tried in the past to figure out a solution, but it has proved difficult to separate th

[OMPI users] mpi_comm_spawn question

2014-07-03 Thread Milan Hodoscek
Hi, I am trying to run the following setup in Fortran without much success: I have an MPI program that uses mpi_comm_spawn, which spawns some interface program that communicates with the one that spawned it. This spawned program then prepares some data and uses the call system() statement in Fortran.

Re: [OMPI users] MPI_Comm_spawn and exported variables

2013-12-20 Thread Ralph Castain
Funny, but I couldn't find the code path that supported that in the latest 1.6 series release (didn't check earlier ones) - but no matter, it seems logical enough. Fixed in the trunk and cmr'd to 1.7.4 Thanks! Ralph On Dec 19, 2013, at 8:08 PM, Tim Miller wrote: > Hi Ralph, > > That's correc

Re: [OMPI users] MPI_Comm_spawn and exported variables

2013-12-19 Thread Tim Miller
Hi Ralph, That's correct. All of the original processes see the -x values, but spawned ones do not. Regards, Tim On Thu, Dec 19, 2013 at 6:09 PM, Ralph Castain wrote: > > On Dec 19, 2013, at 2:57 PM, Tim Miller wrote: > > > Hi All, > > > > I have a question similar (but not identical to) the

Re: [OMPI users] MPI_Comm_spawn and exported variables

2013-12-19 Thread Ralph Castain
On Dec 19, 2013, at 2:57 PM, Tim Miller wrote: > Hi All, > > I have a question similar (but not identical to) the one asked by Tom Fogel a > week or so back... > > I have a code that uses MPI_Comm_spawn to launch different processes. The > executables for these use libraries in non-standard

[OMPI users] MPI_Comm_spawn and exported variables

2013-12-19 Thread Tim Miller
Hi All, I have a question similar (but not identical to) the one asked by Tom Fogel a week or so back... I have a code that uses MPI_Comm_spawn to launch different processes. The executables for these use libraries in non-standard locations, so what I've done is add the directories containing the
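(The -x flag discussed in this thread is mpirun's option for exporting environment variables to launched processes, e.g.:

    mpirun -x LD_LIBRARY_PATH -np 4 ./parent

The bug under discussion was that processes created later via MPI_Comm_spawn did not see those -x values.)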

Re: [OMPI users] MPI_Comm_spawn and exit of parent process.

2012-06-18 Thread Ralph Castain
One further point that I missed in my earlier note: if you are starting the parent as a singleton, then you are fooling yourself about the "without mpirun" comment. A singleton immediately starts a local daemon to act as mpirun so that comm_spawn will work. Otherwise, there is no way to launch t

Re: [OMPI users] MPI_Comm_spawn and exit of parent process.

2012-06-18 Thread TERRY DONTJE
On 6/16/2012 8:03 AM, Roland Schulz wrote: Hi, I would like to start a single process without mpirun and then use MPI_Comm_spawn to start up as many processes as required. I don't want the parent process to take up any resources, so I tried to disconnect the inter communicator and then finali

Re: [OMPI users] MPI_Comm_spawn and exit of parent process.

2012-06-16 Thread Ralph Castain
I'm afraid there is no option to keep the job alive if the parent exits. I could give you several reasons for that behavior, but the bottom line is that it can't be changed. Why don't you have the parent loop across "sleep", waking up periodically to check for a "we are done" message from a chi
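(A sketch of the sleep-and-poll pattern Ralph suggests; the tag value and message layout are assumptions, not code from the thread:)

    #include <unistd.h>
    #include <mpi.h>

    #define DONE_TAG 99  /* illustrative tag */

    void wait_for_children(MPI_Comm intercomm) {
        int done = 0, flag;
        MPI_Status st;
        while (!done) {
            sleep(1);  /* stay off the CPU between checks */
            MPI_Iprobe(MPI_ANY_SOURCE, DONE_TAG, intercomm, &flag, &st);
            if (flag)
                MPI_Recv(&done, 1, MPI_INT, st.MPI_SOURCE, DONE_TAG,
                         intercomm, MPI_STATUS_IGNORE);
        }
    }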

[OMPI users] MPI_Comm_spawn and exit of parent process.

2012-06-16 Thread Roland Schulz
Hi, I would like to start a single process without mpirun and then use MPI_Comm_spawn to start up as many processes as required. I don't want the parent process to take up any resources, so I tried to disconnect the inter communicator and then finalize mpi and exit the parent. But as soon as I do

[OMPI users] MPI_Comm_spawn problem

2011-12-05 Thread Fernanda Oliveira
Hi, I'm working with MPI_Comm_spawn and I have some error messages. The code is relatively simple (five #include lines whose angle-bracketed header names were lost in the archive rendering):
    int main(int argc, char ** argv){
        int i;
        int rank, size, child_rank;
        char nomehost[20];
        MPI_Comm parent, intercom

Re: [OMPI users] MPI_Comm_Spawn intercommunication

2011-01-22 Thread Jeff Squyres
Try using MPI_COMM_REMOTE_SIZE to get the size of the remote group in an intercommunicator. MPI_COMM_SIZE returns the size of the local group. On Jan 7, 2011, at 6:22 PM, Pierre Chanial wrote: > Hello, > > When I run this code: > > program testcase > > use mpi > implicit none > >
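(A minimal illustration of Jeff's point; on an intercommunicator the two calls answer different questions:)

    #include <stdio.h>
    #include <mpi.h>

    void print_group_sizes(MPI_Comm intercomm) {
        int lsize, rsize;
        MPI_Comm_size(intercomm, &lsize);         /* size of the local group */
        MPI_Comm_remote_size(intercomm, &rsize);  /* size of the remote group */
        printf("local group: %d, remote group: %d\n", lsize, rsize);
    }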

[OMPI users] MPI_Comm_Spawn intercommunication

2011-01-07 Thread Pierre Chanial
Hello, When I run this code:
    program testcase
      use mpi
      implicit none
      integer :: rank, lsize, rsize, code
      integer :: intercomm
      call MPI_INIT(code)
      call MPI_COMM_GET_PARENT(intercomm, code)
      if (intercomm == MPI_COMM_NULL) then
        call MPI_COMM_SPAWN ("./testcase"

Re: [OMPI users] mpi_comm_spawn have problems with group communicators

2010-10-04 Thread Milan Hodoscek
> "Ralph" == Ralph Castain writes: Ralph> On Oct 4, 2010, at 10:36 AM, Milan Hodoscek wrote: >>> "Ralph" == Ralph Castain writes: >> Ralph> I'm not sure why the group communicator would make a Ralph> difference - the code area in question knows nothing about Ral

Re: [OMPI users] mpi_comm_spawn have problems with group communicators

2010-10-04 Thread Ralph Castain
On Oct 4, 2010, at 10:36 AM, Milan Hodoscek wrote: >> "Ralph" == Ralph Castain writes: > >Ralph> I'm not sure why the group communicator would make a >Ralph> difference - the code area in question knows nothing about >Ralph> the mpi aspects of the job. It looks like you are hitt

Re: [OMPI users] mpi_comm_spawn have problems with group communicators

2010-10-04 Thread Milan Hodoscek
> "Ralph" == Ralph Castain writes: Ralph> I'm not sure why the group communicator would make a Ralph> difference - the code area in question knows nothing about Ralph> the mpi aspects of the job. It looks like you are hitting a Ralph> race condition that causes a particular in

Re: [OMPI users] mpi_comm_spawn have problems with group communicators

2010-10-04 Thread Ralph Castain
I'm not sure why the group communicator would make a difference - the code area in question knows nothing about the mpi aspects of the job. It looks like you are hitting a race condition that causes a particular internal recv to not exist when we subsequently try to cancel it, which generates th

[OMPI users] mpi_comm_spawn have problems with group communicators

2010-10-03 Thread Milan Hodoscek
Hi, I am a long-time happy user of the mpi_comm_spawn() routine. But so far I have used it only with the MPI_COMM_WORLD communicator. Now I want to execute more mpi_comm_spawn() routines, by creating and using group communicators. However, this seems to have some problems. I can get it to run about 50% of the time

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-18 Thread Nicolas Bock
Hi Ralph, I have confirmed that openmpi-1.4a1r22335 works with my master, slave example. The temporary directories are cleaned up properly. Thanks for the help! nick On Thu, Dec 17, 2009 at 13:38, Nicolas Bock wrote: > Ok, I'll give it a try. > > Thanks, nick > > > > On Thu, Dec 17, 2009 at

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-17 Thread Nicolas Bock
Ok, I'll give it a try. Thanks, nick On Thu, Dec 17, 2009 at 12:44, Ralph Castain wrote: > In case you missed it, this patch should be in the 1.4 nightly tarballs - > feel free to test and let me know what you find. > > Thanks > Ralph > > On Dec 2, 2009, at 10:06 PM, Nicolas Bock wrote: > > Th

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-17 Thread Ralph Castain
In case you missed it, this patch should be in the 1.4 nightly tarballs - feel free to test and let me know what you find. Thanks Ralph On Dec 2, 2009, at 10:06 PM, Nicolas Bock wrote: > That was quick. I will try the patch as soon as you release it. > > nick > > > On Wed, Dec 2, 2009 at 21:

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
On Fri, Dec 4, 2009 at 12:10, Eugene Loh wrote: > Nicolas Bock wrote: > > On Fri, Dec 4, 2009 at 10:29, Eugene Loh wrote: > >> I think you might observe a world of difference if the master issued some >> non-blocking call and then intermixed MPI_Test calls with sleep calls. You >> should see *

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Eugene Loh
Nicolas Bock wrote: On Fri, Dec 4, 2009 at 10:29, Eugene Loh wrote: I think you might observe a world of difference if the master issued some non-blocking call and then intermixed MPI_Test calls with sleep calls.  You should see *much* more subservient behavior. 
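(A sketch of Eugene's suggestion, replacing a blocking wait with MPI_Test interleaved with sleeps so the master yields the CPU; the 100 ms interval is an arbitrary choice:)

    #include <unistd.h>
    #include <mpi.h>

    void gentle_wait(MPI_Request *req) {
        int flag = 0;
        while (!flag) {
            MPI_Test(req, &flag, MPI_STATUS_IGNORE);  /* poll without blocking */
            if (!flag)
                usleep(100000);                       /* sleep 100 ms */
        }
    }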

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
On Fri, Dec 4, 2009 at 10:29, Eugene Loh wrote: > Nicolas Bock wrote: > > On Fri, Dec 4, 2009 at 10:10, Eugene Loh wrote: > >> Yield helped, but not as effectively as one might have imagined. >> > > Yes, that's the impression I get as well, the master process might be > yielding, but it doesn't

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Eugene Loh
Nicolas Bock wrote: On Fri, Dec 4, 2009 at 10:10, Eugene Loh wrote: Yield helped, but not as effectively as one might have imagined. Yes, that's the impression I get as well, the master process might be yielding, but it doesn't appear to be a lot. Maybe

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
On Fri, Dec 4, 2009 at 10:10, Eugene Loh wrote: > Nicolas Bock wrote: > > On Fri, Dec 4, 2009 at 08:21, Ralph Castain wrote: > >> You used it correctly. Remember, all that cpu number is telling you is the >> percentage of use by that process. So bottom line is: we are releasing it as >> much as

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Eugene Loh
Nicolas Bock wrote: On Fri, Dec 4, 2009 at 08:21, Ralph Castain wrote: You used it correctly. Remember, all that cpu number is telling you is the percentage of use by that process. So bottom line is: we are releasing it as much as we possibly can, but no other proc

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
On Fri, Dec 4, 2009 at 08:21, Ralph Castain wrote: > You used it correctly. Remember, all that cpu number is telling you is the > percentage of use by that process. So bottom line is: we are releasing it as > much as we possibly can, but no other process wants to use the cpu, so we go > ahead and

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Ralph Castain
You used it correctly. Remember, all that cpu number is telling you is the percentage of use by that process. So bottom line is: we are releasing it as much as we possibly can, but no other process wants to use the cpu, so we go ahead and use it. If any other process wanted it, then the percent

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
On Fri, Dec 4, 2009 at 08:03, Ralph Castain wrote: > > > It is polling at the barrier. This is done aggressively by default for > performance. You can tell it to be less aggressive if you want via the > yield_when_idle mca param. > > How do I use this parameter correctly? I tried /usr/local/open
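(For Open MPI of this era the full parameter name was mpi_yield_when_idle, set either on the command line, e.g.

    mpirun --mca mpi_yield_when_idle 1 -np 3 ./master

or in an MCA parameter file.)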

Re: [OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Ralph Castain
On Dec 4, 2009, at 7:46 AM, Nicolas Bock wrote: > Hello list, > > when I run the attached example, which spawns a "slave" process with > MPI_Comm_spawn(), I see the following: > > nbock 19911 0.0 0.0 53980 2288 pts/0 S+ 07:42 0:00 > /usr/local/openmpi-1.3.4-gcc-4.4.2/bin/mpirun

[OMPI users] MPI_Comm_spawn, caller uses CPU while waiting for spawned processes

2009-12-04 Thread Nicolas Bock
Hello list, when I run the attached example, which spawns a "slave" process with MPI_Comm_spawn(), I see the following: nbock 19911 0.0 0.0 53980 2288 pts/0 S+ 07:42 0:00 /usr/local/openmpi-1.3.4-gcc-4.4.2/bin/mpirun -np 3 ./master nbock 19912 92.1 0.0 158964 3868 pts/0 R+

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-03 Thread Nicolas Bock
That was quick. I will try the patch as soon as you release it. nick On Wed, Dec 2, 2009 at 21:06, Ralph Castain wrote: > Patch is built and under review... > > Thanks again > Ralph > > On Dec 2, 2009, at 5:37 PM, Nicolas Bock wrote: > > Thanks > > On Wed, Dec 2, 2009 at 17:04, Ralph Castain

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Ralph Castain
Patch is built and under review... Thanks again Ralph On Dec 2, 2009, at 5:37 PM, Nicolas Bock wrote: > Thanks > > On Wed, Dec 2, 2009 at 17:04, Ralph Castain wrote: > Yeah, that's the one all right! Definitely missing from 1.3.x. > > Thanks - I'll build a patch for the next bug-fix release >

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Nicolas Bock
Thanks On Wed, Dec 2, 2009 at 17:04, Ralph Castain wrote: > Yeah, that's the one all right! Definitely missing from 1.3.x. > > Thanks - I'll build a patch for the next bug-fix release > > > On Dec 2, 2009, at 4:37 PM, Abhishek Kulkarni wrote: > > > On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Ralph Castain
Yeah, that's the one all right! Definitely missing from 1.3.x. Thanks - I'll build a patch for the next bug-fix release On Dec 2, 2009, at 4:37 PM, Abhishek Kulkarni wrote: > On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain wrote: >> Indeed - that is very helpful! Thanks! >> Looks like we aren't

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Abhishek Kulkarni
On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain wrote: > Indeed - that is very helpful! Thanks! > Looks like we aren't cleaning up high enough - missing the directory level. > I seem to recall seeing that error go by and that someone fixed it on our > devel trunk, so this is likely a repair that did

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Ralph Castain
Indeed - that is very helpful! Thanks! Looks like we aren't cleaning up high enough - missing the directory level. I seem to recall seeing that error go by and that someone fixed it on our devel trunk, so this is likely a repair that didn't get moved over to the release branch as it should have

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Nicolas Bock
On Wed, Dec 2, 2009 at 14:23, Ralph Castain wrote: > Hmm... if you are willing to keep trying, could you perhaps let it run for > a brief time, ctrl-z it, and then do an ls on a directory from a process > that has already terminated? The pids will be in order, so just look for an > early number (

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Ralph Castain
Hmm... if you are willing to keep trying, could you perhaps let it run for a brief time, ctrl-z it, and then do an ls on a directory from a process that has already terminated? The pids will be in order, so just look for an early number (not mpirun or the parent, of course). It would help if yo

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Nicolas Bock
On Wed, Dec 2, 2009 at 12:12, Ralph Castain wrote: > > On Dec 2, 2009, at 10:24 AM, Nicolas Bock wrote: > > > > On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote: > >> >> >> On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote: >> >>> You may want to check your limits as defined by the shell/system

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Ralph Castain
On Dec 2, 2009, at 10:24 AM, Nicolas Bock wrote: > > > On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote: > > > On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote: > You may want to check your limits as defined by the shell/system. I can also > run this for as long as I'm willing to let it r

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-02 Thread Nicolas Bock
On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote: > > > On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote: > >> You may want to check your limits as defined by the shell/system. I can >> also run this for as long as I'm willing to let it run, so something else >> appears to be going on. >> >> >>

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote: > You may want to check your limits as defined by the shell/system. I can > also run this for as long as I'm willing to let it run, so something else > appears to be going on. > > > Is that with 1.3.3? I found that with 1.3.4 I can run the exampl

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Ralph Castain
You may want to check your limits as defined by the shell/system. I can also run this for as long as I'm willing to let it run, so something else appears to be going on. On Dec 1, 2009, at 4:38 PM, Nicolas Bock wrote: > > > On Tue, Dec 1, 2009 at 16:28, Abhishek Kulkarni wrote: > On Tue, De

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
On Tue, Dec 1, 2009 at 16:28, Abhishek Kulkarni wrote: > On Tue, Dec 1, 2009 at 6:15 PM, Nicolas Bock > wrote: > > After reading Anthony's question again, I am not sure now that we are > having > > the same problem, but we might. In any case, the attached example > programs > > trigger the issue

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Abhishek Kulkarni
On Tue, Dec 1, 2009 at 6:15 PM, Nicolas Bock wrote: > After reading Anthony's question again, I am not sure now that we are having > the same problem, but we might. In any case, the attached example programs > trigger the issue of running out of pipes. I don't see how orted could, even > if it was

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
Linux mujo 2.6.30-gentoo-r5 #1 SMP PREEMPT Thu Sep 17 07:47:12 MDT 2009 x86_64 Intel(R) Core(TM)2 Quad CPU Q8200 @ 2.33GHz GenuineIntel GNU/Linux On Tue, Dec 1, 2009 at 16:24, Ralph Castain wrote: > It really does help if we have some idea what OMPI version you are talking > about, and on what k

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
Sorry, openmpi-1.3.3 compiled with gcc-4.4.2 nick On Tue, Dec 1, 2009 at 16:24, Ralph Castain wrote: > It really does help if we have some idea what OMPI version you are talking > about, and on what kind of platform. > > This issue was fixed to the best of my knowledge (not all the pipes were

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Ralph Castain
It really does help if we have some idea what OMPI version you are talking about, and on what kind of platform. This issue was fixed to the best of my knowledge (not all the pipes were getting closed), but I would have to look and see what release might contain the fix...would be nice to know w

Re: [OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
After reading Anthony's question again, I am not sure now that we are having the same problem, but we might. In any case, the attached example programs trigger the issue of running out of pipes. I don't see how orted could, even if it was reused. There is only a very limited number of processes run

[OMPI users] MPI_Comm_spawn lots of times

2009-12-01 Thread Nicolas Bock
Hello list, a while back in January of this year, a user (Anthony Thevenin) had the problem of running out of open pipes when he tried to use MPI_Comm_spawn a few times. As I found the thread he started in the mailing list archives and have only just joined the mailing list myself, I unfortunately can't rep

Re: [OMPI users] MPI_Comm_spawn query

2009-09-26 Thread Jeff Squyres
On Sep 22, 2009, at 8:20 AM, Blesson Varghese wrote: I am fairly new to MPI. I have a few queries regarding spawning processes that I am listing below: a. How can processes send data to a spawned process? See the description of MPI_COMM_SPAWN; you get an inter-communicator back from
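(A hedged sketch answering queries (a)-(c): communication with spawned processes uses ordinary MPI_Send/MPI_Recv on the intercommunicator; the names and counts are illustrative:)

    #include <mpi.h>

    void spawn_and_send(void) {
        MPI_Comm children;
        int data = 42;
        MPI_Comm_spawn("./child", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
        /* rank arguments address the *remote* group of the intercommunicator */
        MPI_Send(&data, 1, MPI_INT, 0, 0, children);
        /* the children retrieve the same intercommunicator with
           MPI_Comm_get_parent() and post a matching MPI_Recv */
    }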

[OMPI users] MPI_Comm_spawn query

2009-09-22 Thread Blesson Varghese
Hi, I am fairly new to MPI. I have a few queries regarding spawning processes that I am listing below: a. How can processes send data to a spawned process? b. Can any process (that is not a parent process) send data to a spawned process? c. Can MPI_Send or MPI_Recv be used to

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Jerome BENOIT
Thanks for the info. Meanwhile I have set mpi_param_check = 0 in my system-wide configuration file on the workers and mpi_param_check = 1 on the master. Jerome Ralph Castain wrote: Thanks! That does indeed help clarify. You should also then configure OMPI with --disable-per-user-config-f

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Ralph Castain
Thanks! That does indeed help clarify. You should also then configure OMPI with --disable-per-user-config-files. MPI procs will automatically look at the default MCA parameter file, which is probably on your master node (wherever mpirun was executed). However, they also look at the user's h

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Jerome BENOIT
Hi, thanks for the reply. Ralph Castain wrote: The orteds don't pass anything from MPI_Info to srun during a comm_spawn. What the orteds do is to chdir to the specified wdir before spawning the child process to ensure that the child has the correct working directory, then the orted changes ba

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Ralph Castain
The orteds don't pass anything from MPI_Info to srun during a comm_spawn. What the orteds do is to chdir to the specified wdir before spawning the child process to ensure that the child has the correct working directory, then the orted changes back to its default working directory. The or

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Jerome BENOIT
Hi! Finally I got it: passing the MCA key/value `"plm_slurm_args"/"--chdir /local/folder"' does the trick. As a matter of fact, my code passes the MPI_Info key/value `"wdir"/"/local/folder"' to MPI_Comm_spawn as well: the working directories on the nodes of the spawned programs are `nodes:/loc
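(A sketch of the MPI_Info side of Jerome's recipe; the path is his example, the rest is illustrative. The daemon side was handled separately by passing --mca plm_slurm_args "--chdir /local/folder" to mpirun:)

    #include <mpi.h>

    void spawn_in_local_folder(MPI_Comm *intercomm) {
        MPI_Info info;
        MPI_Info_create(&info);
        /* standard info key: working directory of the spawned processes */
        MPI_Info_set(info, "wdir", "/local/folder");
        MPI_Comm_spawn("./spawned", MPI_ARGV_NULL, 4, info,
                       0, MPI_COMM_SELF, intercomm, MPI_ERRCODES_IGNORE);
        MPI_Info_free(&info);
    }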

Re: [OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Jerome BENOIT
Hello Again, Jerome BENOIT wrote: Hello List, I have just noticed that, when MPI_Comm_spawn is used to launch programs around, the orted working directory on the nodes is the working directory of the spawning program: can we ask orted to use another directory? Changing the working th

[OMPI users] MPI_Comm_spawn and oreted

2009-04-16 Thread Jerome BENOIT
Hello List, I have just noticed that, when MPI_Comm_spawn is used to launch programs around, the orted working directory on the nodes is the working directory of the spawning program: can we ask orted to use another directory? Thanks in advance, Jerome

Re: [OMPI users] MPI_Comm_spawn errors

2008-02-19 Thread Tim Prins
Hi Joao, Unfortunately, spawn is broken on the development trunk right now. We are working on a major revamp of the runtime system, which should fix these problems, but it is not ready yet. Sorry about that :( Tim Joao Vicente Lima wrote: Hi all, I'm getting errors with spawn in the situat

[OMPI users] MPI_Comm_spawn errors

2008-02-18 Thread Joao Vicente Lima
Hi all, I'm getting errors with spawn in these situations: 1) spawn1.c - spawning 2 processes on localhost, one by one, the error is: spawning ... [localhost:31390] *** Process received signal *** [localhost:31390] Signal: Segmentation fault (11) [localhost:31390] Signal code: Address not mapped (1)

Re: [OMPI users] MPI_Comm_Spawn

2007-04-04 Thread Ralph H Castain
0 > #12 0x4027a748 in mca_oob_tcp_msg_data () from > /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/openmpi/mca_oob_tcp.so > #13 0x4027bb12 in mca_oob_tcp_peer_recv_handler () from > /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/openmpi/mca_oob_tcp.so > #14 0x400703f9 in o

Re: [OMPI users] MPI_Comm_Spawn

2007-03-13 Thread Ralph H Castain
nmpi-1.1.4-noBproc-noThread/lib/openmpi/mca_rmgr_urm.so >> #9 0x4004f277 in orte_rmgr_base_cmd_dispatch () from >> /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/liborte.so.0 >> #10 0x402b10ae in orte_rmgr_urm_recv () from >> /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/

Re: [OMPI users] MPI_Comm_Spawn

2007-03-06 Thread Ralph Castain
b/openmpi/mca_oob_tcp.so > #13 0x4027bb12 in mca_oob_tcp_peer_recv_handler () from > /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/openmpi/mca_oob_tcp.so > #14 0x400703f9 in opal_event_loop () from > /usr/local/Mpi/openmpi-1.1.4-noBproc-noThread/lib/libopal.so.0 > #15 0x4006adfa in o

Re: [OMPI users] MPI_Comm_Spawn

2007-03-06 Thread Rozzen . VINCONT
) at main.c:13 (gdb) -----Original Message----- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Tim Prins Sent: Monday, 5 March 2007 22:34 To: Open MPI Users Subject: Re: [OMPI users] MPI_Comm_Spawn Never mind, I was just able to replicate it. I'll lo
