On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1 and
> look at the cmd line being executed (send it here). It will look like:
>
> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>
> If the cmd line has --daemonize on it, then the ssh will close and xterm
> won't work.

[vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh vic12 orted --daemonize -mca ess env -mca orte_ess_jobid 2646867968 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri "2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.30.10:39057" --nodename vic12 -mca btl openib,self --mca btl_openib_receive_queues P,65536,256,128,128 -mca plm_base_verbose 1 -mca mca_base_param_file_path /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root -mca mca_base_param_file_path_force /root]

It looks like what you say is happening. Is this configured somewhere, so
that I can remove it?

Thanks,
Jon

> Ralph
>
> On 4/2/08 3:14 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:
> > Can you diagnose a little further:
> >
> > 1. in the case where it works, can you verify that the ssh to launch
> > the orteds is still running?
> >
> > 2. in the case where it doesn't work, can you verify that the ssh to
> > launch the orteds has actually died?
> >
> > On Apr 2, 2008, at 4:58 PM, Jon Mason wrote:
> >> On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
> >>> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
> >>>> I remember that someone had found a bug that caused orte_debug_flag
> >>>> to not get properly set (local var covering over a global one) -
> >>>> could be that your tmp-public branch doesn't have that patch in it.
> >>>>
> >>>> You might try updating to the latest trunk
> >>>
> >>> I updated my ompi-trunk tree, did a clean build, and I still see the
> >>> same problem. I regressed trunk to rev 17589 and everything works as
> >>> I expect. So I think the problem is still there in the top of trunk.
> >>
> >> I stepped through the revs of trunk and found the first failing rev
> >> to be 17632. It's a big patch, so I'll defer to those more in the
> >> know to determine what is breaking in there.
> >>
> >>> I don't discount user error, but I don't think I am doing anything
> >>> different. Did some setting change that perhaps I did not modify?
> >>>
> >>> Thanks,
> >>> Jon
> >>>
> >>>> On 4/2/08 10:41 AM, "George Bosilca" <bosi...@eecs.utk.edu> wrote:
> >>>>> I'm using this feature on the trunk with the version from
> >>>>> yesterday. It works without problems ...
> >>>>>
> >>>>> george.
> >>>>>
> >>>>> On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
> >>>>>> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
> >>>>>>> Are these r numbers relevant on the /tmp-public branch, or the
> >>>>>>> trunk?
> >>>>>>
> >>>>>> I pulled it out of the command used to update the branch, which
> >>>>>> was:
> >>>>>> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
> >>>>>>
> >>>>>> In the cpc tmp branch, it happened at r17920.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Jon
> >>>>>>
> >>>>>>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
> >>>>>>>> I regressed my tree and it looks like it happened between
> >>>>>>>> 17590:17917
> >>>>>>>>
> >>>>>>>> On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> >>>>>>>>> I am noticing that ssh seems to be broken on trunk (and my cpc
> >>>>>>>>> branch, as it is based on trunk). When I try to use xterm and
> >>>>>>>>> gdb to debug, I only successfully get 1 xterm. I have tried
> >>>>>>>>> this on 2 different setups. I can successfully get the xterms
> >>>>>>>>> on the 1.2 svn branch.
> >>>>>>>>>
> >>>>>>>>> I am running the following command:
> >>>>>>>>> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> >>>>>>>>> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
> >>>>>>>>>
> >>>>>>>>> Is anyone else seeing this problem?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Jon
> >>>>>>>>> _______________________________________________
> >>>>>>>>> devel mailing list
> >>>>>>>>> de...@open-mpi.org
> >>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
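
The diagnostic suggested at the top of this thread (run with -mca plm_base_verbose 1 and check whether the launch command carries --daemonize) could be scripted roughly as below. This is only a sketch: the function name launch_uses_daemonize is made up for illustration, and the line format is assumed from the single "plm:rsh: executing:" sample pasted in this thread, so other Open MPI versions may print something different.

```python
def launch_uses_daemonize(log_line: str) -> bool:
    """Report whether a 'plm:rsh: executing:' line passes --daemonize.

    Assumption: the launch command follows the marker on the same line,
    as in the sample output quoted in this thread.
    """
    marker = "plm:rsh: executing:"
    _, found, command = log_line.partition(marker)
    if not found:
        raise ValueError("not a plm:rsh launch line")
    # The command is printed wrapped in (...) and [...], so strip those
    # characters from each token before comparing.
    tokens = (tok.strip("[]()") for tok in command.split())
    return "--daemonize" in tokens


# Shortened version of the line pasted above.
sample = (
    "[vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) "
    "[/usr/bin/ssh vic12 orted --daemonize -mca ess env]"
)
print(launch_uses_daemonize(sample))  # True
```

If this prints True for the orted launch line, the ssh session is expected to close after the daemon starts, which matches the missing xterm described above.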