Re: [OMPI users] MPI_Init

2012-08-28 Thread Ralph Castain
On Aug 28, 2012, at 6:47 PM, Tony Raymond wrote: > Hi Ralph, > > Thanks for taking care of this so quickly! > > Does this mean that MPI_Init will leave the SIGCHLD handler alone? Yes > Should it be fine to set the handler as I did in the current version of MPI? Yes - no

Re: [OMPI users] MPI_Init

2012-08-28 Thread Tony Raymond
Hi Ralph, Thanks for taking care of this so quickly! Does this mean that MPI_Init will leave the SIGCHLD handler alone? Should it be fine to set the handler as I did in the current version of MPI? Thanks, Tony From: users-boun...@open-mpi.org

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Brian Budge
Thanks! On Tue, Aug 28, 2012 at 4:57 PM, Ralph Castain wrote: > Yeah, I'm seeing the hang as well when running across multiple machines. Let > me dig a little and get this fixed. > > Thanks > Ralph > > On Aug 28, 2012, at 4:51 PM, Brian Budge wrote: >

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Ralph Castain
Yeah, I'm seeing the hang as well when running across multiple machines. Let me dig a little and get this fixed. Thanks Ralph On Aug 28, 2012, at 4:51 PM, Brian Budge wrote: > Hmmm, I went to the build directories of openmpi for my two machines, > went into the

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Brian Budge
Hmmm, I went to the build directories of openmpi for my two machines, went into the orte/test/mpi directory and made the executables on both machines. I set the hostsfile in the env variable on the "master" machine. Here's the output:

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Ralph Castain
Looks to me like it didn't find your executable - could be a question of where it exists relative to where you are running. If you look in your OMPI source tree at the orte/test/mpi directory, you'll see an example program "simple_spawn.c" there. Just "make simple_spawn" and execute that with
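
For anyone reproducing this, a minimal spawn test in the spirit of simple_spawn.c might look like the sketch below (using the MPI-2 C++ bindings this thread is about; "./worker_exe" and the process count of 2 are placeholders, not taken from the original posts):

  #include <mpi.h>

  int main(int argc, char* argv[]) {
      MPI::Init(argc, argv);
      // Spawn two copies of a worker executable. ARGV_NULL passes no extra
      // arguments; rank 0 of this intracommunicator acts as the root.
      MPI::Intercomm children = MPI::COMM_WORLD.Spawn(
          "./worker_exe", MPI::ARGV_NULL, 2, MPI::INFO_NULL, 0);
      children.Barrier();  // simple handshake with the spawned workers
      MPI::Finalize();
      return 0;
  }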

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Brian Budge
I see. Okay. So, I just tried removing the check for universe size, and set the universe size to 2. Here's my output: LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe [budgeb-interlagos:29965] [[4156,0],0]

Re: [OMPI users] Fwd: lwkmpi

2012-08-28 Thread Reuti
There is only one file where "return { ... };" is used. --disable-vt seems to fix it. -- Reuti On 28.08.2012 at 14:56, Tim Prince wrote: > On 8/28/2012 5:11 AM, 清风 wrote: >> >> >> >> -- Original Message -- >> *From:* "295187383"<295187...@qq.com>; >> *Sent:*
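
For anyone hitting the same VampirTrace compile error, a hedged sketch of the workaround Reuti describes (the compiler variables assume Intel 11.1; only --disable-vt comes from this thread):

  $ ./configure CC=icc CXX=icpc F77=ifort FC=ifort --disable-vt
  $ make all install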

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Ralph Castain
I see the issue - it's here: > MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag); > > if(!flag) { > std::cerr << "no universe size" << std::endl; > return -1; > } > universeSize = *puniverseSize; > if(universeSize == 1) { > std::cerr << "cannot start slaves... not
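
A self-contained version of that universe-size pattern, as a hedged sketch (not Brian's actual program; "./worker_exe" is a placeholder):

  #include <mpi.h>
  #include <iostream>

  int main(int argc, char* argv[]) {
      MPI_Init(&argc, &argv);
      int *puniverseSize, flag;
      MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
      if (!flag) {
          std::cerr << "no universe size" << std::endl;
          MPI_Finalize();
          return -1;
      }
      // Fill the remaining universe slots with spawned workers.
      int nworkers = *puniverseSize - 1;
      if (nworkers > 0) {
          MPI_Comm children;
          MPI_Comm_spawn("./worker_exe", MPI_ARGV_NULL, nworkers,
                         MPI_INFO_NULL, 0, MPI_COMM_SELF, &children,
                         MPI_ERRCODES_IGNORE);
      }
      MPI_Finalize();
      return 0;
  }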

Re: [OMPI users] MPI_Init

2012-08-28 Thread Ralph Castain
Okay, I fixed this on our trunk - I'll post it for transfer to the 1.7 and 1.6 series in their next releases. Thanks! On Aug 28, 2012, at 2:27 PM, Ralph Castain wrote: > Oh crud - yes we do. Checking on it... > > On Aug 28, 2012, at 2:23 PM, Ralph Castain

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Brian Budge
>echo hostsfile localhost budgeb-sandybridge Thanks, Brian On Tue, Aug 28, 2012 at 2:36 PM, Ralph Castain wrote: > Hmmm...what is in your "hostsfile"? > > On Aug 28, 2012, at 2:33 PM, Brian Budge wrote: > >> Hi Ralph - >> >> Thanks for confirming

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Ralph Castain
Hmmm...what is in your "hostsfile"? On Aug 28, 2012, at 2:33 PM, Brian Budge wrote: > Hi Ralph - > > Thanks for confirming this is possible. I'm trying this and currently > failing. Perhaps there's something I'm missing in the code to make > this work. Here are the

Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2012-08-28 Thread Brian Budge
Hi Ralph - Thanks for confirming this is possible. I'm trying this and currently failing. Perhaps there's something I'm missing in the code to make this work. Here are the two instantiations and their outputs: > LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib >

Re: [OMPI users] MPI_Init

2012-08-28 Thread Ralph Castain
Oh crud - yes we do. Checking on it... On Aug 28, 2012, at 2:23 PM, Ralph Castain wrote: > Glancing at the code, I don't see anywhere that we trap SIGCHLD outside of > mpirun and the orte daemons - certainly not inside an MPI app. What version > of OMPI are you using? > >

Re: [OMPI users] MPI_Init

2012-08-28 Thread Ralph Castain
Glancing at the code, I don't see anywhere that we trap SIGCHLD outside of mpirun and the orte daemons - certainly not inside an MPI app. What version of OMPI are you using? On Aug 28, 2012, at 2:06 PM, Tony Raymond wrote: > Hi, > > I have an application that uses openMPI

[OMPI users] MPI_Init

2012-08-28 Thread Tony Raymond
Hi, I have an application that uses openMPI and creates some child processes using fork(). I've been trying to catch SIGCHLD in order to check the exit status of these processes so that the program will exit if a child errors out. I've found out that if I set the SIGCHLD handler before
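
The pattern under discussion, as a hedged sketch (not Tony's actual code): install the handler after MPI_Init, so that library startup cannot overwrite it.

  #include <mpi.h>
  #include <csignal>
  #include <unistd.h>
  #include <sys/wait.h>

  static void on_sigchld(int) {
      int status;
      // Reap every child that has exited; bail out if one failed.
      while (waitpid(-1, &status, WNOHANG) > 0) {
          if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
              _exit(1);  // a child errored out, so terminate this rank
      }
  }

  int main(int argc, char* argv[]) {
      MPI_Init(&argc, &argv);
      std::signal(SIGCHLD, on_sigchld);  // installed *after* MPI_Init
      // ... fork() worker children and do MPI work here ...
      MPI_Finalize();
      return 0;
  }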

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Gabriele Fatigati
Hi, thanks for the reply. How can the cuda branch help me? The lstopo output of that branch is the same as the trunk's. Another question: the GPU IDs are the same (10de:06d2). How is that possible? Thanks. 2012/8/28 Samuel Thibault > Brice Goglin, on Tue 28 Aug 2012 14:43:53

Re: [OMPI users] deprecated MCA parameter

2012-08-28 Thread Jeff Squyres
Ralph and I talked about this -- it seems like we should extend the help message. If there is no replacement for the param, it should say that. If there is a replacement, it should be listed. We'll take this as a feature enhancement. On Aug 28, 2012, at 9:23 AM, jody wrote: > Thanks Ralph >

Re: [OMPI users] deprecated MCA parameter

2012-08-28 Thread jody
Thanks Ralph I renamed the parameter in my script, and now there are no more ugly messages :) Jody On Tue, Aug 28, 2012 at 3:17 PM, Ralph Castain wrote: > Ah, I see - yeah, the parameter technically is being renamed to > "orte_rsh_agent" to avoid having users need to know

Re: [OMPI users] deprecated MCA parameter

2012-08-28 Thread Ralph Castain
Ah, I see - yeah, the parameter technically is being renamed to "orte_rsh_agent" to avoid having users need to know the internal topology of the code base (i.e., that it is in the plm framework and the rsh component). It will always be there, though - only the name is changing to protect the

Re: [OMPI users] Fwd: lwkmpi

2012-08-28 Thread Tim Prince
On 8/28/2012 5:11 AM, 清风 wrote: -- Original Message -- *From:* "295187383"<295187...@qq.com>; *Sent:* Tuesday, August 28, 2012, 4:13 PM *To:* "users"; *Subject:* lwkmpi Hi everybody, I'm trying to compile openmpi with intel compiler 11.1.07 on ubuntu. I compiled

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Samuel Thibault
Brice Goglin, on Tue 28 Aug 2012 14:43:53 +0200, wrote: > > $ lstopo > > Socket #0 > > Socket #1 > > PCI... > > (connected to socket #1) > > > > vs > > > > $ lstopo > > Socket #0 > > Socket #1 > > PCI... > > (connected to both sockets) > > Fortunately, this won't occur in most

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Brice Goglin
On 28/08/2012 14:23, Samuel Thibault wrote: > Gabriele Fatigati, on Tue 28 Aug 2012 14:19:44 +0200, wrote: >> I'm using hwloc 1.5. I would like to see how the GPUs are connected to the processor >> socket using the lstopo command. > About the connection to the socket, there is indeed no real graphical >

Re: [OMPI users] deprecated MCA parameter

2012-08-28 Thread Ralph Castain
Guess I'm confused - what is the issue here? The param still exists: MCA plm: parameter "plm_rsh_agent" (current value: , data source: default value, synonyms: pls_rsh_agent, orte_rsh_agent) The command used to launch
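
To check what a deprecated parameter maps to on your own installation, a hedged example of the usual query (syntax as in the 1.6 series; output format varies between releases):

  $ ompi_info --param plm rsh | grep agent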

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Samuel Thibault
Gabriele Fatigati, on Tue 28 Aug 2012 14:19:44 +0200, wrote: > I'm using hwloc 1.5. I would like to see how the GPUs are connected to the processor > socket using the lstopo command. About the connection to the socket, there is indeed no real graphical difference between "connected to socket #1" and

[hwloc-users] lstopo and GPus

2012-08-28 Thread Gabriele Fatigati
Dear hwloc user, I'm using hwloc 1.5. I would like to see how the GPUs are connected to the processor sockets using the lstopo command. I attach the figure. The system has two GPUs, but I don't understand how to find that information from the PCI boxes. Thanks in advance. -- Ing. Gabriele Fatigati HPC
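
One way to cross-check the PCI boxes against the actual GPUs, as a hedged example (--whole-io assumes hwloc was built with I/O discovery enabled; lspci assumes pciutils is installed):

  $ lstopo --whole-io           # show all I/O objects, including PCI devices
  $ lspci -nn | grep -i nvidia  # match vendor:device IDs such as 10de:06d2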

Re: [OMPI users] problem with installing open mpi with intel compiler 11.1.07 on ubuntu

2012-08-28 Thread Jeff Squyres (jsquyres)
Try using the 1.6.2 nightly snapshot tarball and see if that fixes your problem. I'm not near a computer to give you the specific link - go to openmpi.org, find the nightly snapshot downloads, and pick the v1.6 series. Sent from my phone. No type good. On Aug 28, 2012, at 6:59 AM, "清风"

[OMPI users] problem with installing open mpi with intel compiler 11.1.07 on ubuntu

2012-08-28 Thread 清风
Hi everybody, I'm trying to compile openmpi with intel compiler 11.1.07 on ubuntu. I have compiled openmpi many times and always run into a problem. But the error that I'm getting now gives me no clues where to even search for the problem. It seems I have succeeded to

[OMPI users] deprecated MCA parameter

2012-08-28 Thread jody
Hi In order to open an xterm for each of my processes I use the MCA parameter 'plm_rsh_agent' like this: mpirun -np 5 -hostfile allhosts -mca plm_base_verbose 1 -mca plm_rsh_agent "ssh -Y" --leave-session-attached xterm -hold -e ./MPIProg Without the option '-mca plm_rsh_agent "ssh -Y"' I

[OMPI users] Fwd: lwkmpi

2012-08-28 Thread 清风
-- Original Message -- From: "295187383"<295187...@qq.com>; Sent: Tuesday, August 28, 2012, 4:13 PM To: "users"; Subject: lwkmpi Hi everybody, I'm trying to compile openmpi with intel compiler 11.1.07 on ubuntu. I compiled openmpi many

Re: [OMPI users] error compiling openmpi-1.6.1 on Windows 7

2012-08-28 Thread Shiqing Fan
Hi Siegmar, It seems that the runtime environment is messed up with the different versions of Open MPI. I suggest you completely remove all the installations and install 1.6.1 again (just build the installation project again). It should work without any problem under Cygwin too. Shiqing On

Re: [OMPI users] Application with mxm hangs on startup

2012-08-28 Thread 清风
Dear Prof. Aleksey: My system is a 32-bit Ubuntu system. Which systems is the MXM version you gave me built for? Best regards, Liang Wenke -- Original Message -- From: "Aleksey Senin"; Date: Tue,

Re: [OMPI users] Application with mxm hangs on startup

2012-08-28 Thread 清风
Dear Prof. Aleksey: Thank you very much. Some failure output files, such as "config.log" and "make.out", are in the attachment 'lwkmpi.zip'. -- Original Message -- From: "Aleksey Senin"; Date: Tue, Aug 28, 2012 04:19 PM To:

[OMPI users] Application with mxm hangs on startup

2012-08-28 Thread Aleksey Senin
Please download the MXM version at http://mellanox.com/downloads/hpc/mxm/v1.1/mxm_1.1.1328.tar This version was checked against OMPI-1.6.2 (http://svn.open-mpi.org/svn/ompi/branches/v1.6). In the case of any failure, could you enclose the output? Regards, Aleksey.

[OMPI users] lwkmpi

2012-08-28 Thread 清风
Hi everybody, I'm trying to compile openmpi with intel compiler 11.1.07 on ubuntu. I have compiled openmpi many times and always run into a problem. But the error that I'm getting now gives me no clues where to even search for the problem. It seems I have succeeded to

Re: [OMPI users] Infiniband performance Problem and stalling

2012-08-28 Thread Paul Kapinos
Randolph, after reading this: On 08/28/12 04:26, Randolph Pullen wrote: - On occasions it seems to stall indefinitely, waiting on a single receive. ... I would make a blind guess: are you aware of the IB card parameters for registered memory?
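
For reference, the registered-memory limits Paul is hinting at are controlled by module parameters on Mellanox HCAs; a hedged way to inspect them (paths assume the mlx4_core driver; which parameters exist varies by OFED/kernel version):

  $ cat /sys/module/mlx4_core/parameters/log_num_mtt
  $ cat /sys/module/mlx4_core/parameters/log_mtts_per_seg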