Re: [OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?

2011-07-20 Thread Paul Kapinos

Hi Terry, Reuti,

good news: we've solved (or rather worked around) the problem with CT/8.2.1c :o)

The "fix" was easy: we now use the 64bit version of 'mpiexec' instead 
of the 32bit version [previously used by default]. The 64bit version 
works with both the NIS and the LDAP authentication modes. The 32bit version 
works only on the NIS-authenticated part of our cluster.
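
The 32bit/64bit distinction described above can be checked on the binary itself. What follows is a minimal sketch, assuming the mpiexec in use is an ELF executable (a symlink is followed; a wrapper script is rejected as "not an ELF file"). The function names are illustrative and are not part of Cluster Tools or Open MPI.

```python
import sys

ELF_MAGIC = b"\x7fELF"
# EI_CLASS byte (offset 4 of the ELF identification): 1 = ELFCLASS32, 2 = ELFCLASS64
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header: bytes) -> str:
    """Return '32-bit' or '64-bit' given the first bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    value = header[4]
    if value not in ELF_CLASSES:
        raise ValueError(f"unknown EI_CLASS value {value}")
    return ELF_CLASSES[value]


def elf_class_of_file(path: str) -> str:
    """Read the ELF identification bytes of the file at 'path' and classify it."""
    with open(path, "rb") as fh:
        return elf_class(fh.read(5))


if __name__ == "__main__":
    # Hypothetical usage: python elfclass.py /path/to/mpiexec
    for path in sys.argv[1:]:
        print(path, elf_class_of_file(path))
```

Running it on both mpiexec installations shows which one a given PATH setting actually picks up.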


Thanks for your help!

Best wishes
Paul Kapinos



Reuti wrote:

Hi,

Am 15.07.2011 um 21:14 schrieb Terry Dontje:


On 7/15/2011 1:46 PM, Paul Kapinos wrote:
Hi OpenMPI folks (and Oracle/Sun experts), 

we have a problem with Sun's MPI (Cluster Tools 8.2.x) on part of our cluster. In the part of the cluster where LDAP is activated, mpiexec does not try to spawn tasks on remote nodes at all, but exits with an error message like the one below. Running mpiexec under 'strace -f' shows no exec of "ssh" at all. Strangely, mpiexec tries to look into /etc/passwd (where the user has no entry, because we use LDAP!). 

Note this is an area that should be no different from stock Open MPI. 


"should not" but it is :o)
However, I am comparing CT/8.2.1c with a self-compiled OpenMPI/1.4.3, which are 
quite different releases. And they definitely behave differently: in the 
self-compiled OpenMPI both the 32bit and the 64bit mpiexec work with NIS and 
with LDAP, whereas the CT/8.2.1c 32bit mpiexec works with NIS only.





I would suspect that the message might be coming from ssh.  I wouldn't suspect 
mpiexec would be looking into /etc/passwd at all; why would it need to?


the output you listed is titled "[unknown-user]". Maybe referring to the 
password file is an oversimplification. The check is also done on the master node of the 
parallel job by a usual `getpwuid`. Is the /etc/nsswitch.conf fine on the `mpiexec` 
machine?

Is the user known on this node too? Can they log in because they have no 
passphrase, because they have an agent running, or did you set up host-based 
authentication?


my user is known on each node and is allowed to log in (without 
password) from any node to any node. In /etc/passwd there is no password for 
my user; all auth things are done by NIS or LDAP. (Sorry, I cannot tell 
more because this is admin stuff, but as said: "ssh" works from any node to 
any node without password.)
/etc/nsswitch.conf seems to be fine (it works now with the 64bit version 
of mpiexec :o)
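
Whether /etc/nsswitch.conf is "fine" can be made concrete by listing the sources configured for the passwd database. The sketch below assumes the usual "database: source [action] source ..." line format; the function name is illustrative only.

```python
import re
from typing import List


def passwd_sources(nsswitch_text: str, database: str = "passwd") -> List[str]:
    """Return the lookup sources configured for 'database', in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if name.strip() == database:
            # Remove action items such as [NOTFOUND=return] before splitting
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


if __name__ == "__main__":
    with open("/etc/nsswitch.conf") as fh:
        print(passwd_sources(fh.read()))
```

On the LDAP part of the cluster one would expect "ldap" (or a comparable source) to appear in the result next to "files"; the file only names the sources and says nothing about whether the matching lookup module can be loaded by a given binary.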







 It should just be using ssh.  Can you manually ssh to the same node?
On the old part of the cluster, where NIS is used as the authentication method, Sun MPI runs fine. 

So, is Sun's MPI compatible with the LDAP authentication method at all? 


In as far as whatever launcher you use is compatible with LDAP.
Best wishes, 

Paul 



P.S. In both parts of the cluster, I (login marked as x here) can log in to any node by ssh without needing to type the password. 



From the headnode of the cluster to a node or also between nodes?


-- Reuti





-- 
The user (x) is unknown to the system (i.e. there is no corresponding 
entry in the password file). Please contact your system administrator 
for a fix. 
-- 
[cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: Fatal in file plm_rsh_module.c at line 1058 
-- 


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915




Re: [OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?

2011-07-15 Thread Reuti
Hi,

Am 15.07.2011 um 21:14 schrieb Terry Dontje:

> On 7/15/2011 1:46 PM, Paul Kapinos wrote:
>> Hi OpenMPI folks (and Oracle/Sun experts), 
>> 
>> we have a problem with Sun's MPI (Cluster Tools 8.2.x) on part of our 
>> cluster. In the part of the cluster where LDAP is activated, mpiexec 
>> does not try to spawn tasks on remote nodes at all, but exits with an error 
>> message like the one below. Running mpiexec under 'strace -f' shows no exec 
>> of "ssh" at all. Strangely, mpiexec tries to look into /etc/passwd (where 
>> the user has no entry, because we use LDAP!). 
>> 
> Note this is an area that should be no different from stock Open MPI. 
> I would suspect that the message might be coming from ssh.  I wouldn't 
> suspect mpiexec would be looking into /etc/passwd at all; why would it need 
> to?

the output you listed is titled "[unknown-user]". Maybe referring to the 
password file is an oversimplification. The check is also done on the master node of 
the parallel job by a usual `getpwuid`. Is the /etc/nsswitch.conf fine on the 
`mpiexec` machine?

Is the user known on this node too? Can they log in because they have no 
passphrase, because they have an agent running, or did you set up host-based 
authentication?
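
The `getpwuid` check mentioned above can be reproduced outside of mpiexec. This is a minimal sketch using Python's standard `pwd` module, which goes through the same name service switch as the C call; the helper name is illustrative. One assumption to state loudly: the result reflects only the interpreter's own word size, so a 64bit Python says nothing about what a 32bit binary would see. A missing 32bit lookup module for LDAP would be one possible explanation for the behaviour reported here, but nothing in this thread confirms it.

```python
import os
import pwd
from typing import Optional


def lookup_user(uid: int) -> Optional[str]:
    """Return the login name for 'uid', or None if no configured source knows it."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # pwd.getpwuid raises KeyError when the uid cannot be resolved
        return None


if __name__ == "__main__":
    uid = os.getuid()
    name = lookup_user(uid)
    if name is None:
        print(f"uid {uid} is unknown to the system")
    else:
        print(f"uid {uid} resolves to {name}")
```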


>  It should just be using ssh.  Can you manually ssh to the same node?
>> On the old part of the cluster, where NIS is used as the authentication 
>> method, Sun MPI runs fine. 
>> 
>> So, is Sun's MPI compatible with the LDAP authentication method at all? 
>> 
> In as far as whatever launcher you use is compatible with LDAP.
>> Best wishes, 
>> 
>> Paul 
>> 
>> 
>> P.S. In both parts of the cluster, I (login marked as x here) can log in 
>> to any node by ssh without needing to type the password. 

From the headnode of the cluster to a node, or also between nodes?

-- Reuti


>> 
>> 
>> 
>> -- 
>> The user (x) is unknown to the system (i.e. there is no corresponding 
>> entry in the password file). Please contact your system administrator 
>> for a fix. 
>> -- 
>> [cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: Fatal 
>> in file plm_rsh_module.c at line 1058 
>> -- 
>> 
>> 
>> 
>> ___
>> users mailing list
>> 
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> -- 
> 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
> 
> 
> 




Re: [OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?

2011-07-15 Thread Terry Dontje



On 7/15/2011 1:46 PM, Paul Kapinos wrote:

Hi OpenMPI folks (and Oracle/Sun experts),

we have a problem with Sun's MPI (Cluster Tools 8.2.x) on part of 
our cluster. In the part of the cluster where LDAP is activated, 
mpiexec does not try to spawn tasks on remote nodes at all, but exits 
with an error message like the one below. Running mpiexec under 
'strace -f' shows no exec of "ssh" at all. Strangely, mpiexec tries to 
look into /etc/passwd (where the user has no entry, because we use LDAP!).



Note this is an area that should be no different from stock Open MPI.
I would suspect that the message might be coming from ssh.  I wouldn't 
suspect mpiexec would be looking into /etc/passwd at all; why would it 
need to?  It should just be using ssh.  Can you manually ssh to the same 
node?
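
The manual ssh test suggested above can be scripted so that it fails instead of prompting for a password. The sketch below assumes an OpenSSH client is installed; `BatchMode` and `ConnectTimeout` are standard OpenSSH client options, while the function names and the 5-second timeout are illustrative choices.

```python
import subprocess
import sys
from typing import List


def ssh_probe_command(host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that runs 'true' remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # fail rather than ask for a password
        "-o", f"ConnectTimeout={timeout_s}",
        host,
        "true",
    ]


def can_ssh_without_password(host: str) -> bool:
    """Return True if the probe command exits with status 0."""
    result = subprocess.run(
        ssh_probe_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical usage: python sshprobe.py node1 node2
    for host in sys.argv[1:]:
        print(host, "ok" if can_ssh_without_password(host) else "FAILED")
```

Running it from the head node and again from a compute node covers both directions asked about elsewhere in this thread.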
On the old part of the cluster, where NIS is used as the 
authentication method, Sun MPI runs fine.


So, is Sun's MPI compatible with the LDAP authentication method at all?


In as far as whatever launcher you use is compatible with LDAP.

Best wishes,

Paul


P.S. In both parts of the cluster, I (login marked as x here) can 
log in to any node by ssh without needing to type the password.




-- 


The user (x) is unknown to the system (i.e. there is no corresponding
entry in the password file). Please contact your system administrator
for a fix.
-- 

[cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: 
Fatal in file plm_rsh_module.c at line 1058
-- 







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 





[OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?

2011-07-15 Thread Paul Kapinos

Hi OpenMPI folks (and Oracle/Sun experts),

we have a problem with Sun's MPI (Cluster Tools 8.2.x) on part of our 
cluster. In the part of the cluster where LDAP is activated, mpiexec 
does not try to spawn tasks on remote nodes at all, but exits with an 
error message like the one below. Running mpiexec under 'strace -f' 
shows no exec of "ssh" at all. Strangely, mpiexec tries to look into 
/etc/passwd (where the user has no entry, because we use LDAP!).


On the old part of the cluster, where NIS is used as the authentication 
method, Sun MPI runs fine.


So, is Sun's MPI compatible with the LDAP authentication method at all?

Best wishes,

Paul


P.S. In both parts of the cluster, I (login marked as x here) can 
log in to any node by ssh without needing to type the password.




--
The user (x) is unknown to the system (i.e. there is no corresponding
entry in the password file). Please contact your system administrator
for a fix.
--
[cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: 
Fatal in file plm_rsh_module.c at line 1058

--


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915

