Re: [O-MPI devel] rsh and fork pls components

2005-12-13 Thread Jeff Squyres

On Dec 13, 2005, at 4:48 PM, Jeff Squyres wrote:


On the orted check for the fork pls, you will find that there is a
flag in the process info structure that indicates "I am a daemon".
You may just need to check that flag - gets set very early and so
should be available in time for this purpose.


I should have read that more carefully -- you're saying that that  
flag *is* in liborte, and is not a symbol that only exists in the orted.


Hence, that is perfect.

Thanks!  :-)

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/





Re: [O-MPI devel] rsh and fork pls components

2005-12-13 Thread Jeff Squyres

On Dec 13, 2005, at 4:45 PM, Ralph H. Castain wrote:

No problem with me - seems straightforward and resolves some  
confusion.


Cool.


On the orted check for the fork pls, you will find that there is a
flag in the process info structure that indicates "I am a daemon".
You may just need to check that flag - gets set very early and so
should be available in time for this purpose.


I can't do that because that means that that symbol will need to be  
linked into the rsh pls component -- that symbol only exists in the  
orted, so this would be problematic.


Is there a liborte function that I can call that indicates whether  
the current process is an orted or not?  Or an environment variable  
that only exists for orted's?


--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/





Re: [O-MPI devel] rsh and fork pls components

2005-12-13 Thread Ralph H. Castain

No problem with me - seems straightforward and resolves some confusion.

On the orted check for the fork pls, you will find that there is a 
flag in the process info structure that indicates "I am a daemon". 
You may just need to check that flag - gets set very early and so 
should be available in time for this purpose.



At 02:06 PM 12/13/2005, you wrote:

I'd like to suggest a change for the rsh and fork pls components
based on some real-world feedback.

The rsh pls, despite its name, defaults to using "ssh -x" instead of
"rsh".

For users who do not have ssh in their PATH (e.g., for clusters that
are walled off from the rest of the net and only have rsh), the rsh
pls component will disqualify itself during selection and therefore
the "fork" pls will get selected, which will fail when it tries to
launch for a variety of reasons (it's only designed to work within
the orted).

1. I'd like to change the meaning of the pls_rsh_agent MCA parameter
to be a colon-delimited list of agents to search for.  This is still
backwards-compatible -- if someone does the following:

mpirun --mca pls_rsh_agent rsh -np 4 a.out

That also still works.  But we might want to extend the default value
from:

ssh -x

to:

ssh -x : rsh

So that if "ssh" is not found in the $PATH, we'll then try to find
"rsh" and use that if it's found.

If the rsh pls cannot find any of the agents in the list, then it
should disqualify itself from selection.

2. I'd like to add some kind of check to the fork pls so that it
never allows itself to be selected outside of the orted.  I'm not
sure what that check would entail (haven't looked at the code yet),
but we should prevent this situation because we know it will fail
(and currently produce a cryptic error message for the user).
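
As a rough illustration of item 1, the agent search could look something like the following C sketch. The helper names (found_in_path, pick_agent) are invented for this example and are not the rsh pls code. It assumes each colon-delimited entry is a program name optionally followed by arguments, and that only the program name is looked up in the PATH.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if `prog` is an executable regular file
 * in one of the directories of the colon-separated `path`, else 0.
 * Empty PATH entries are skipped for simplicity. */
static int found_in_path(const char *prog, const char *path)
{
    char *copy, *dir, *save = NULL;
    int found = 0;

    if (path == NULL) return 0;
    copy = strdup(path);
    if (copy == NULL) return 0;
    for (dir = strtok_r(copy, ":", &save); dir != NULL;
         dir = strtok_r(NULL, ":", &save)) {
        char full[4096];
        struct stat st;

        snprintf(full, sizeof(full), "%s/%s", dir, prog);
        if (stat(full, &st) == 0 && S_ISREG(st.st_mode) &&
            access(full, X_OK) == 0) {
            found = 1;
            break;
        }
    }
    free(copy);
    return found;
}

/* Hypothetical helper: walk a colon-delimited agent list such as
 * "ssh -x : rsh" and copy the first entry whose program is found in
 * `path` (including its arguments, with surrounding blanks trimmed)
 * into `out`. Returns 0 on success, -1 if no agent is found. */
static int pick_agent(const char *agents, const char *path,
                      char *out, size_t outlen)
{
    char *copy, *entry, *save = NULL;
    int rc = -1;

    assert(agents != NULL && out != NULL && outlen > 0);
    copy = strdup(agents);
    if (copy == NULL) return -1;
    for (entry = strtok_r(copy, ":", &save); entry != NULL;
         entry = strtok_r(NULL, ":", &save)) {
        char prog[256];
        size_t len, plen;

        entry += strspn(entry, " \t");            /* trim leading blanks */
        len = strlen(entry);
        while (len > 0 && (entry[len - 1] == ' ' || entry[len - 1] == '\t')) {
            entry[--len] = '\0';                  /* trim trailing blanks */
        }
        if (len == 0) continue;                   /* skip empty entries */

        plen = strcspn(entry, " \t");             /* first word = program */
        if (plen >= sizeof(prog)) continue;
        memcpy(prog, entry, plen);
        prog[plen] = '\0';

        if (found_in_path(prog, path)) {
            snprintf(out, outlen, "%s", entry);
            rc = 0;
            break;
        }
    }
    free(copy);
    return rc;
}
```

With the proposed default, a call such as pick_agent("ssh -x : rsh", getenv("PATH"), buf, sizeof(buf)) would leave "ssh -x" in buf when ssh is installed and "rsh" when only rsh is. It returns -1 when neither is found, which is the case where the component would disqualify itself.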

I'd like to get both of these in for v1.0.2.

Comments?

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/


