Re: [OMPI devel] accessors to context id and message id's

2007-11-05 Thread George Bosilca
If I understand your question correctly, we don't need any
extension. Each request has a unique ID (from the PERUSE perspective).
However, if I remember correctly, this is only half implemented in our
PERUSE layer (i.e., it works only for expected requests). This should
be quite easy to fix if someone invests a few hours into it.


For the context ID, a user can always use the c2f function to get the
Fortran ID (which for Open MPI is the communicator ID).
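
As a minimal illustration of that approach (standard MPI calls only; the
mapping of the Fortran handle to the communicator ID is as described above):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      /* MPI_Comm_c2f returns the Fortran handle of a communicator;
       * for Open MPI this is the communicator (context) ID. */
      MPI_Fint cid = MPI_Comm_c2f(MPI_COMM_WORLD);
      printf("context id of MPI_COMM_WORLD: %d\n", (int) cid);

      MPI_Finalize();
      return 0;
  }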


  Thanks,
george.

On Nov 5, 2007, at 8:01 AM, Terry Dontje wrote:

Currently, in order to do message tracing, one either has to rely on some
error-prone postprocessing of data or replicate some MPI internals up
in the PMPI layer.  It would help Sun's tools group (and I believe U
Dresden also) if Open MPI would create a couple of APIs that exposed the
following:

1. PML Message ids used for a request
2. Context id for a specific communicator

I could see a couple of ways of providing this information: either by
extending the PERUSE probes, or by creating actual functions to which one
would pass a request handle or communicator handle to get the appropriate
data back.

This is just a thought right now, which is why this email is not in an RFC
format.  I wanted to get a feel from the community as to the interest in
such APIs and whether anyone has specific issues with us providing such
interfaces.  If the responses seem positive, I will follow this message
up with an RFC.

thanks,

--td






Re: [OMPI devel] Environment forwarding

2007-11-05 Thread Tim Prins
Thanks for the clarification everyone.

Tim

On Monday 05 November 2007 05:41:00 pm Torsten Hoefler wrote:
> On Mon, Nov 05, 2007 at 05:32:04PM -0500, Brian W. Barrett wrote:
> > On Mon, 5 Nov 2007, Torsten Hoefler wrote:
> > > On Mon, Nov 05, 2007 at 04:57:19PM -0500, Brian W. Barrett wrote:
> > >> This is extremely tricky to do.  How do you know which environment
> > >> variables to forward (foo in this case) and which not to (hostname)?
> > >> SLURM has a better chance, since it's Linux-only and generally only
> > >> run on tightly controlled clusters.  But there's a whole variety of
> > >> things that shouldn't be forwarded, and that list differs from OS to
> > >> OS.
> > >>
> > >> I believe we toyed around with the "right thing" in LAM and early on
> > >> with Open MPI and decided that it was too hard to meet expected
> > >> behavior.
> > >
> > > Some applications rely on this (I know at least two right away, Gamess
> > > and Abinit) and they work without problems with LAM/MPICH{1,2} but not
> > > with Open MPI. I am *not* arguing that those applications are correct
> > > (I agree that this way of passing arguments is ugly, but it's done).
> > >
> > > I know it's not defined in the standard, but I think it's a nice,
> > > convenient piece of functionality. E.g., setting LD_LIBRARY_PATH in
> > > .bashrc to find libmpi.so is also a pain if you have multiple (Open)
> > > MPIs installed.
> >
> > LAM does not automatically propagate environment variables -- its
> > behavior is almost *exactly* like Open MPI's.  There might be a situation
> > where the environment is not quite so scrubbed if a process is started on
> > the same node mpirun is executed on, but it's only appearances -- in
> > reality, that's the environment that was alive when lamboot was executed.
>
> ok, I might have executed it on the same node (was a while ago).
>
> > With both LAM and Open MPI, there is the -x option to propagate a list of
> > environment variables, but that's about it.  Neither will push
> > LD_LIBRARY_PATH by default (and there are many good reasons for that,
> > particularly in heterogeneous situations).
>
> Ah, heterogeneous! Yes, I agree.
>
> Torsten
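
(For reference, the -x usage Brian mentions above looks roughly like
"mpirun -np 1 -x foo printenv"; each variable to be propagated is named
explicitly on the mpirun command line.)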




Re: [OMPI devel] Environment forwarding

2007-11-05 Thread Torsten Hoefler
On Mon, Nov 05, 2007 at 04:57:19PM -0500, Brian W. Barrett wrote:
> This is extremely tricky to do.  How do you know which environment
> variables to forward (foo in this case) and which not to (hostname)?
> SLURM has a better chance, since it's Linux-only and generally only run on
> tightly controlled clusters.  But there's a whole variety of things that
> shouldn't be forwarded, and that list differs from OS to OS.
> 
> I believe we toyed around with the "right thing" in LAM and early on with
> Open MPI and decided that it was too hard to meet expected behavior.
Some applications rely on this (I know at least two right away, Gamess
and Abinit) and they work without problems with LAM/MPICH{1,2} but not
with Open MPI. I am *not* arguing that those applications are correct (I
agree that this way of passing arguments is ugly, but it's done).

I know it's not defined in the standard, but I think it's a nice,
convenient piece of functionality. E.g., setting LD_LIBRARY_PATH in
.bashrc to find libmpi.so is also a pain if you have multiple (Open) MPIs
installed.


Just my two cents,
  Torsten

-- 
 bash$ :(){ :|:&};: - http://www.unixer.de/ -
"You should never bet against anything in science at odds of more than
about 10^12 to 1." Ernest Rutherford


Re: [OMPI devel] Environment forwarding

2007-11-05 Thread Brian W. Barrett
This is extremely tricky to do.  How do you know which environment
variables to forward (foo in this case) and which not to (hostname)?
SLURM has a better chance, since it's Linux-only and generally only run on
tightly controlled clusters.  But there's a whole variety of things that
shouldn't be forwarded, and that list differs from OS to OS.


I believe we toyed around with the "right thing" in LAM and early on with
Open MPI and decided that it was too hard to meet expected behavior.


Brian

On Mon, 5 Nov 2007, Tim Prins wrote:


Hi,

After talking with Torsten today, I found something weird. When using the SLURM
pls we seem to forward a user's environment, but when using the rsh pls we do
not.

I.e.:
[tprins@odin ~]$ mpirun -np 1 printenv |grep foo
[tprins@odin ~]$ export foo=bar
[tprins@odin ~]$ mpirun -np 1 printenv |grep foo
foo=bar
[tprins@odin ~]$ mpirun -np 1 -mca pls rsh printenv |grep foo

So my question is: which is the expected behavior?

I don't think we can do anything about SLURM automatically forwarding the
environment, but I think there should be a way to make rsh forward the
environment. Perhaps add a flag to mpirun to do this?

Thanks,

Tim



[OMPI devel] Environment forwarding

2007-11-05 Thread Tim Prins
Hi,

After talking with Torsten today, I found something weird. When using the SLURM
pls we seem to forward a user's environment, but when using the rsh pls we do
not.

I.e.:
[tprins@odin ~]$ mpirun -np 1 printenv |grep foo
[tprins@odin ~]$ export foo=bar
[tprins@odin ~]$ mpirun -np 1 printenv |grep foo
foo=bar
[tprins@odin ~]$ mpirun -np 1 -mca pls rsh printenv |grep foo

So my question is: which is the expected behavior?

I don't think we can do anything about SLURM automatically forwarding the 
environment, but I think there should be a way to make rsh forward the 
environment. Perhaps add a flag to mpirun to do this?

Thanks,

Tim


[OMPI devel] Multiword MCA parameter values broken

2007-11-05 Thread Tim Prins

Hi,

Commit 16364 broke things when using multiword MCA parameter values. For
instance:


mpirun --debug-daemons -mca orte_debug 1 -mca pls rsh -mca pls_rsh_agent 
"ssh -Y" xterm


This will crash and burn, because the value "ssh -Y" is stored into the
argv array orted_cmd_line in orterun.c:1506, and is then added to the
launch command for the orted:


/usr/bin/ssh -Y odin004  PATH=/san/homedirs/tprins/usr/rsl/bin:$PATH ; 
export PATH ; 
LD_LIBRARY_PATH=/san/homedirs/tprins/usr/rsl/lib:$LD_LIBRARY_PATH ; 
export LD_LIBRARY_PATH ; /san/homedirs/tprins/usr/rsl/bin/orted --debug 
--debug-daemons --name 0.1 --num_procs 2 --vpid_start 0 --nodename 
odin004 --universe tpr...@odin.cs.indiana.edu:default-universe-27872 
--nsreplica 
"0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908" 
--gprreplica 
"0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908" 
-mca orte_debug 1 -mca pls_rsh_agent ssh -Y -mca 
mca_base_param_file_path 
/u/tprins/usr/rsl/share/openmpi/amca-param-sets:/san/homedirs/tprins/rsl/examples 
-mca mca_base_param_file_path_force /san/homedirs/tprins/rsl/examples


Notice that in this command we now have "-mca pls_rsh_agent ssh -Y". The
quotes have been lost, and we die a horrible death.


So we need to add the quotes back in somehow, or pass these options
differently. I'm not sure what the best way to fix this is.
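
One possible direction, sketched generically below: re-quote any argv entry
that contains whitespace before the orted command line is flattened into the
single string handed to ssh. This is only an illustration of the idea, not
the actual orterun/pls code, and the helper name is invented here.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Return a newly allocated copy of arg, wrapped in double quotes if it
   * contains a space.  (A real fix would also need to handle tabs and
   * embedded quotes.) */
  static char *quote_if_needed(const char *arg)
  {
      char *out;
      if (NULL == strchr(arg, ' ')) {
          return strdup(arg);
      }
      out = malloc(strlen(arg) + 3);
      sprintf(out, "\"%s\"", arg);
      return out;
  }

  int main(void)
  {
      const char *args[] = { "-mca", "pls_rsh_agent", "ssh -Y", NULL };
      for (int i = 0; NULL != args[i]; ++i) {
          char *q = quote_if_needed(args[i]);
          printf("%s ", q);   /* prints: -mca pls_rsh_agent "ssh -Y" */
          free(q);
      }
      printf("\n");
      return 0;
  }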


Thanks,

Tim


[OMPI devel] accessors to context id and message id's

2007-11-05 Thread Terry Dontje
Currently, in order to do message tracing, one either has to rely on some
error-prone postprocessing of data or replicate some MPI internals up
in the PMPI layer.  It would help Sun's tools group (and I believe U
Dresden also) if Open MPI would create a couple of APIs that exposed the
following:


1. PML Message ids used for a request
2. Context id for a specific communicator

I could see a couple of ways of providing this information: either by
extending the PERUSE probes, or by creating actual functions to which one
would pass a request handle or communicator handle to get the appropriate
data back.
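
Purely to make the idea concrete, such accessors might look something like
the sketch below; the names and signatures are invented here for discussion
and do not exist in Open MPI today:

  #include <mpi.h>
  #include <stdint.h>

  /* Hypothetical: return the PML-level message id associated with a request. */
  int OMPI_Request_get_msg_id(MPI_Request request, uint64_t *msg_id);

  /* Hypothetical: return the context id backing a communicator. */
  int OMPI_Comm_get_context_id(MPI_Comm comm, uint32_t *context_id);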


This is just a thought right now, which is why this email is not in an RFC
format.  I wanted to get a feel from the community as to the interest in
such APIs and whether anyone has specific issues with us providing such
interfaces.  If the responses seem positive, I will follow this message
up with an RFC.


thanks,

--td