Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Ralph Castain

On Mar 21, 2011, at 11:12 AM, Dave Love wrote:

> Ralph Castain  writes:
> 
>> Just looking at this for another question. Yes, SGE integration is broken in 
>> 1.5. Looking at how to fix now.
>> 
>> Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd 
>> line.
> 
> Thanks.  I'd forgotten about plm when checking, though I guess that
> wouldn't have helped me.
> 
> Should rshd be mentioned in the release notes?

Just starting the discussion on the best solution going forward. I'd rather not 
have to tell SGE users to add this to their cmd line. :-(

> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Dave Love
Ralph Castain  writes:

> Just looking at this for another question. Yes, SGE integration is broken in 
> 1.5. Looking at how to fix now.
>
> Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd 
> line.

Thanks.  I'd forgotten about plm when checking, though I guess that
wouldn't have helped me.

Should rshd be mentioned in the release notes?



Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Ralph Castain
Just looking at this for another question. Yes, SGE integration is broken in 
1.5. Looking at how to fix now.

Meantime, you can get it to work by adding "-mca plm ^rshd" to your mpirun cmd 
line.
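
For anyone wrapping mpirun in a launcher script, the workaround could be applied along these lines. This is only a sketch: the helper name, the "./my_app" executable and the process count are hypothetical, and the only option taken from this thread is "-mca plm ^rshd".

```python
import shlex


def mpirun_with_workaround(app_args, nprocs):
    """Build an mpirun argument list that excludes the rshd launcher.

    "-mca plm ^rshd" is the workaround quoted in this thread; app_args
    and nprocs are placeholders for your own job.
    """
    return ["mpirun", "-mca", "plm", "^rshd", "-np", str(nprocs), *app_args]


if __name__ == "__main__":
    # "./my_app" is a hypothetical executable name.
    cmd = mpirun_with_workaround(["./my_app"], 4)
    # Print a shell-safe form of the command; pass cmd to subprocess.run()
    # inside the SGE job script to actually launch it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids any shell quoting problems with the "^" character.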


On Mar 21, 2011, at 9:47 AM, Dave Love wrote:

> Terry Dontje  writes:
> 
>> Dave, what version of Grid Engine are you using?
> 
> 6.2u5, plus irrelevant patches.  It's fine with ompi 1.4.  (All I did to
> switch was to load the 1.5.3 modules environment.)
> 
>> The plm checks for the following environment variables to determine
>> whether you are running Grid Engine.
>> SGE_ROOT
>> ARC
>> PE_HOSTFILE
>> JOB_ID
>> 
>> If these are not set in the session in which mpirun is executed, then
>> it will resort to ssh.
> 
> Sure.  What ras_gridengine_debug reported looked correct.  I'll try to
> debug it.  At least I stand a reasonable chance with grid engine issues.




Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Dave Love
Terry Dontje  writes:

> Dave, what version of Grid Engine are you using?

6.2u5, plus irrelevant patches.  It's fine with ompi 1.4.  (All I did to
switch was to load the 1.5.3 modules environment.)

> The plm checks for the following environment variables to determine
> whether you are running Grid Engine.
> SGE_ROOT
> ARC
> PE_HOSTFILE
> JOB_ID
>
> If these are not set in the session in which mpirun is executed, then
> it will resort to ssh.

Sure.  What ras_gridengine_debug reported looked correct.  I'll try to
debug it.  At least I stand a reasonable chance with grid engine issues.



Re: [OMPI users] 1.5.3 and SGE integration?

2011-03-21 Thread Terry Dontje

Dave, what version of Grid Engine are you using?
The plm checks for the following environment variables to determine 
whether you are running Grid Engine.

SGE_ROOT
ARC
PE_HOSTFILE
JOB_ID

If these are not set in the session in which mpirun is executed, then 
it will resort to ssh.
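
A quick way to confirm that the four variables are visible to the job is a small check run from the job script just before mpirun. This is a sketch, not the plm's actual logic: it only tests whether each name is present, and the function name is made up for illustration.

```python
import os

# The four variables listed above as the ones the plm looks for.
SGE_ENV_VARS = ("SGE_ROOT", "ARC", "PE_HOSTFILE", "JOB_ID")


def missing_sge_vars(environ=None):
    """Return the names from SGE_ENV_VARS that are absent from the environment."""
    env = os.environ if environ is None else environ
    return [name for name in SGE_ENV_VARS if name not in env]


if __name__ == "__main__":
    missing = missing_sge_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All four Grid Engine variables are set.")
```

If anything is reported as not set, that would be consistent with mpirun falling back to ssh as described above.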


--td


On 03/21/2011 08:24 AM, Dave Love wrote:

I've just tried 1.5.3 under SGE with tight integration, which seems to
be broken.  I built and ran in the same way as for 1.4.{1,3}, which
works, and ompi_info reports the same gridengine parameters for 1.5 as
for 1.4.

The symptoms are that it reports a failure to communicate using ssh,
whereas it should be using the SGE builtin method via qrsh.

There doesn't seem to be a relevant bug report, but before I
investigate, has anyone else succeeded/failed with it, or have any
hints?




--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com