Re: [OMPI users] SYSTEM CPU with OpenMPI 1.4.3

2010-11-15 Thread tmishima
Hi, I did the test with the simple program shown below. (I use MUMPS, which is a parallel linear solver.) This test program does nothing but call the initialize and finalize routines of MUMPS and MPI. INCLUDE 'mpif.h' INCLUDE 'dmumps_struc.h' TYPE (DMUMPS_STRUC) MUMPS_PAR

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
Correction: On 15.11.2010 at 20:23, Terry Dontje wrote: > On 11/15/2010 02:11 PM, Reuti wrote: >> Just to give my understanding of the problem: >> >> On 15.11.2010 at 19:57, Terry Dontje wrote: >> >> >>> On 11/15/2010 11:08 AM, Chris Jewell wrote: >>> > Sorry, I am still trying to grok

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
On 15.11.2010 at 20:23, Terry Dontje wrote: > >>> Is your complaint really the fact that exec6 has been allocated two slots >>> but there seems to only be one slot worth of resources allocated >>> >> All are wrong except exec6. They should only get one core assigned. >> >> > Huh? I would

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Terry Dontje
On 11/15/2010 02:11 PM, Reuti wrote: Just to give my understanding of the problem: On 15.11.2010 at 19:57, Terry Dontje wrote: On 11/15/2010 11:08 AM, Chris Jewell wrote: Sorry, I am still trying to grok all your email as to what problem you are trying to solve. So is the issue trying

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
Just to give my understanding of the problem: On 15.11.2010 at 19:57, Terry Dontje wrote: > On 11/15/2010 11:08 AM, Chris Jewell wrote: >>> Sorry, I am still trying to grok all your email as to what problem you >>> are trying to solve. So is the issue trying to have two jobs having >>>

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Terry Dontje
On 11/15/2010 11:08 AM, Chris Jewell wrote: Sorry, I am still trying to grok all your email as to what problem you are trying to solve. So is the issue trying to have two jobs having processes on the same node be able to bind their processes on different resources. Like core 1 for the first

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
Hi, On 15.11.2010 at 17:06, Chris Jewell wrote: > Hi Ralph, > > Thanks for the tip. With the command > > $ qsub -pe mpi 8 -binding linear:1 myScript.com > > I get the output > > [exec6:29172] System has detected external process binding to cores 0008 > [exec6:29172] ras:gridengine: JOB_ID:

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Chris Jewell
> Sorry, I am still trying to grok all your email as to what problem you > are trying to solve. So is the issue trying to have two jobs having > processes on the same node be able to bind their processes on different > resources. Like core 1 for the first job and core 2 and 3 for the 2nd

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Chris Jewell
Hi Ralph, Thanks for the tip. With the command $ qsub -pe mpi 8 -binding linear:1 myScript.com I get the output [exec6:29172] System has detected external process binding to cores 0008 [exec6:29172] ras:gridengine: JOB_ID: 59282 [exec6:29172] ras:gridengine: PE_HOSTFILE:
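
A note on reading the "cores 0008" value in the output above: assuming it is a hexadecimal bitmask in which bit n stands for logical core n (an assumption, not something confirmed in this thread), it could be decoded with a small sketch like the one below. The function name cores_from_mask is hypothetical and is not part of Open MPI or Grid Engine.

```python
def cores_from_mask(mask_hex):
    """Return the indices of the set bits in a hexadecimal CPU mask string.

    Assumption: bit n of the mask corresponds to logical core n.
    """
    mask = int(mask_hex, 16)
    # Walk over every bit position up to the highest set bit.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Mask string taken from the log line quoted in the message above.
    print(cores_from_mask("0008"))
```

Under that assumption, a mask with a single set bit would mean the process was bound to exactly one core, which matches the "-binding linear:1" request.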

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Ralph Castain
The external binding code should be in that version. If you add --report-bindings --leave-session-attached to the mpirun command line, you should see output from each daemon telling you what external binding it detected, and how it is binding each app it launches. Thanks! On Mon, Nov 15, 2010

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Chris Jewell
> I confess I am now confused. What version of OMPI are you using? > > FWIW: OMPI was updated at some point to detect the actual cores of an > external binding, and abide by them. If we aren't doing that, then we have a > bug that needs to be resolved. Or it could be you are using a version

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Terry Dontje
Sorry, I am still trying to grok all your email as to what problem you are trying to solve. So is the issue trying to have two jobs having processes on the same node be able to bind their processes on different resources? Like core 1 for the first job and core 2 and 3 for the 2nd job?

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
On 15.11.2010 at 15:29, Chris Jewell wrote: > Hi, > >>> If, indeed, it is not possible currently to implement this type of >>> core-binding in tightly integrated OpenMPI/GE, then a solution might lie in >>> a custom script run in the parallel environment's 'start proc args'. This >>> script

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Chris Jewell
Hi, > > If, indeed, it is not possible currently to implement this type of > > core-binding in tightly integrated OpenMPI/GE, then a solution might lie in > > a custom script run in the parallel environment's 'start proc args'. This > > script would have to find out which slots are allocated

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Ralph Castain
I confess I am now confused. What version of OMPI are you using? FWIW: OMPI was updated at some point to detect the actual cores of an external binding, and abide by them. If we aren't doing that, then we have a bug that needs to be resolved. Or it could be you are using a version that predates

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Reuti
Hi, On 15.11.2010 at 13:13, Chris Jewell wrote: > Okay so I tried what you suggested. You essentially get the requested number > of bound cores on each execution node, so if I use > > $ qsub -pe openmpi 8 -binding linear:2 > > then I get 2 bound cores per node, irrespective of the number

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-15 Thread Chris Jewell
Hi Reuti, Okay so I tried what you suggested. You essentially get the requested number of bound cores on each execution node, so if I use $ qsub -pe openmpi 8 -binding linear:2 then I get 2 bound cores per node, irrespective of the number of slots (and hence parallel processes) allocated by