Re: [OMPI devel] accessors to context id and message id's
If I understand your question correctly, then we don't need any extension. Each request already has a unique ID (from the PERUSE perspective). However, if I remember correctly, this is only half implemented in our PERUSE layer (i.e., it works only for expected requests). This should be quite easy to fix if someone invests a few hours in it.

For the context id, a user can always use the c2f function to get the Fortran handle (which for Open MPI is the communicator ID).

  Thanks,
    george.

On Nov 5, 2007, at 8:01 AM, Terry Dontje wrote:

> Currently, in order to do message tracing, one either has to rely on some
> error-prone postprocessing of data or replicate some MPI internals up in
> the PMPI layer. It would help Sun's tools group (and I believe U Dresden
> also) if Open MPI would create a couple of APIs that exposed the following:
>
> 1. PML message ids used for a request
> 2. Context id for a specific communicator
>
> I could see a couple of ways of providing this information: either by
> extending the PERUSE probes or by creating actual functions that one would
> pass a request handle or communicator handle to get the appropriate data
> back.
>
> This is just a thought right now, which is why this email is not in RFC
> format. I wanted to get a feel from the community as to the interest in
> such APIs and whether anyone may have specific issues with us providing
> such interfaces. If the responses seem positive, I will follow this
> message up with an RFC.
>
> thanks,
>
> --td

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Environment forwarding
Thanks for the clarification, everyone.

Tim

On Monday 05 November 2007 05:41:00 pm Torsten Hoefler wrote:
> On Mon, Nov 05, 2007 at 05:32:04PM -0500, Brian W. Barrett wrote:
> > On Mon, 5 Nov 2007, Torsten Hoefler wrote:
> > > On Mon, Nov 05, 2007 at 04:57:19PM -0500, Brian W. Barrett wrote:
> > >> This is extremely tricky to do. How do you know which environment
> > >> variables to forward (foo in this case) and which not to (hostname)?
> > >> SLURM has a better chance, since it's Linux-only and generally only
> > >> run on tightly controlled clusters. But there's a whole variety of
> > >> things that shouldn't be forwarded, and that list differs from OS to
> > >> OS.
> > >>
> > >> I believe we toyed around with the "right thing" in LAM and early on
> > >> with Open MPI and decided that it was too hard to meet expected
> > >> behavior.
> > >
> > > Some applications rely on this (I know at least two right away, Gamess
> > > and Abinit), and they work without problems with LAM/MPICH{1,2} but not
> > > with Open MPI. I am *not* arguing that those applications are correct
> > > (I agree that this way of passing arguments is ugly, but it's done).
> > >
> > > I know it's not defined in the standard, but I think it's a nice,
> > > convenient functionality. E.g., setting the LD_LIBRARY_PATH to find
> > > libmpi.so in the .bashrc is also a pain if you have multiple (Open)
> > > MPIs installed.
> >
> > LAM does not automatically propagate environment variables -- its
> > behavior is almost *exactly* like Open MPI's. There might be a situation
> > where the environment is not quite so scrubbed if a process is started on
> > the same node mpirun is executed on, but it's only appearances -- in
> > reality, that's the environment that was alive when lamboot was executed.
>
> ok, I might have executed it on the same node (was a while ago).
>
> > With both LAM and Open MPI, there is the -x option to propagate a list of
> > environment variables, but that's about it. Neither will push
> > LD_LIBRARY_PATH by default (and there are many good reasons for that,
> > particularly in heterogeneous situations).
>
> Ah, heterogeneous! Yes, I agree.
>
> Torsten
Re: [OMPI devel] Environment forwarding
On Mon, Nov 05, 2007 at 04:57:19PM -0500, Brian W. Barrett wrote:
> This is extremely tricky to do. How do you know which environment
> variables to forward (foo in this case) and which not to (hostname)?
> SLURM has a better chance, since it's Linux-only and generally only run on
> tightly controlled clusters. But there's a whole variety of things that
> shouldn't be forwarded, and that list differs from OS to OS.
>
> I believe we toyed around with the "right thing" in LAM and early on with
> Open MPI and decided that it was too hard to meet expected behavior.

Some applications rely on this (I know at least two right away, Gamess and Abinit), and they work without problems with LAM/MPICH{1,2} but not with Open MPI. I am *not* arguing that those applications are correct (I agree that this way of passing arguments is ugly, but it's done).

I know it's not defined in the standard, but I think it's a nice, convenient functionality. E.g., setting the LD_LIBRARY_PATH to find libmpi.so in the .bashrc is also a pain if you have multiple (Open) MPIs installed.

Just my two cents,
  Torsten

-- 
bash$ :(){ :|:&};: - http://www.unixer.de/ -
"You should never bet against anything in science at odds of more than
about 10^12 to 1." Ernest Rutherford
Re: [OMPI devel] Environment forwarding
This is extremely tricky to do. How do you know which environment variables to forward (foo in this case) and which not to (hostname)? SLURM has a better chance, since it's Linux-only and generally only run on tightly controlled clusters. But there's a whole variety of things that shouldn't be forwarded, and that list differs from OS to OS.

I believe we toyed around with the "right thing" in LAM and early on with Open MPI and decided that it was too hard to meet expected behavior.

Brian

On Mon, 5 Nov 2007, Tim Prins wrote:

> Hi,
>
> After talking with Torsten today I found something weird. When using the
> SLURM pls we seem to forward a user's environment, but when using the rsh
> pls we do not. I.e.:
>
> [tprins@odin ~]$ mpirun -np 1 printenv | grep foo
> [tprins@odin ~]$ export foo=bar
> [tprins@odin ~]$ mpirun -np 1 printenv | grep foo
> foo=bar
> [tprins@odin ~]$ mpirun -np 1 -mca pls rsh printenv | grep foo
>
> So my question is: which is the expected behavior? I don't think we can do
> anything about SLURM automatically forwarding the environment, but I think
> there should be a way to make rsh forward the environment. Perhaps add a
> flag to mpirun to do this?
>
> Thanks,
>
> Tim
[OMPI devel] Environment forwarding
Hi,

After talking with Torsten today I found something weird. When using the SLURM pls we seem to forward a user's environment, but when using the rsh pls we do not. I.e.:

[tprins@odin ~]$ mpirun -np 1 printenv | grep foo
[tprins@odin ~]$ export foo=bar
[tprins@odin ~]$ mpirun -np 1 printenv | grep foo
foo=bar
[tprins@odin ~]$ mpirun -np 1 -mca pls rsh printenv | grep foo

So my question is: which is the expected behavior? I don't think we can do anything about SLURM automatically forwarding the environment, but I think there should be a way to make rsh forward the environment. Perhaps add a flag to mpirun to do this?

Thanks,

Tim
[OMPI devel] Multiworld MCA parameter values broken
Hi,

Commit 16364 broke things when using multiword MCA param values. For instance:

mpirun --debug-daemons -mca orte_debug 1 -mca pls rsh -mca pls_rsh_agent "ssh -Y" xterm

will crash and burn, because the value "ssh -Y" is stored into the argv orted_cmd_line in orterun.c:1506. This is then added to the launch command for the orted:

/usr/bin/ssh -Y odin004 PATH=/san/homedirs/tprins/usr/rsl/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/san/homedirs/tprins/usr/rsl/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; /san/homedirs/tprins/usr/rsl/bin/orted --debug --debug-daemons --name 0.1 --num_procs 2 --vpid_start 0 --nodename odin004 --universe tpr...@odin.cs.indiana.edu:default-universe-27872 --nsreplica "0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908" --gprreplica "0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908" -mca orte_debug 1 -mca pls_rsh_agent ssh -Y -mca mca_base_param_file_path /u/tprins/usr/rsl/share/openmpi/amca-param-sets:/san/homedirs/tprins/rsl/examples -mca mca_base_param_file_path_force /san/homedirs/tprins/rsl/examples

Notice that this command now contains "-mca pls_rsh_agent ssh -Y". The quotes have been lost, and we die a horrible death. So we need to add the quotes back in somehow, or pass these options differently. I'm not sure what the best way to fix this is.

Thanks,

Tim
[OMPI devel] accessors to context id and message id's
Currently, in order to do message tracing, one either has to rely on some error-prone postprocessing of data or replicate some MPI internals up in the PMPI layer. It would help Sun's tools group (and I believe U Dresden also) if Open MPI would create a couple of APIs that exposed the following:

1. PML message ids used for a request
2. Context id for a specific communicator

I could see a couple of ways of providing this information: either by extending the PERUSE probes or by creating actual functions that one would pass a request handle or communicator handle to get the appropriate data back.

This is just a thought right now, which is why this email is not in RFC format. I wanted to get a feel from the community as to the interest in such APIs and whether anyone may have specific issues with us providing such interfaces. If the responses seem positive, I will follow this message up with an RFC.

thanks,

--td