Hmmm...something isn't making sense. Can I see the command line you used to generate this?
I'll tell you why I'm puzzled. If orte_debug_flag is set, then the "--daemonize" should NOT be there, and you should see "--debug" on that command line. What I see is the reverse, which implies to me that orte_debug_flag is NOT being set to "true".

When I tested here and on odin, though, I found that the -d option correctly set the flag and everything works just fine.

So there is something in your environment or setup that is messing up that orte_debug_flag. I have no idea what it could be - the command line should override anything in your environment, but you could check. Otherwise, if this diagnostic output came from a command line that included -d or --debug-devel, or had OMPI_MCA_orte_debug=1 in the environment, then I am at a loss - everywhere I've tried it, it works fine.

Ralph

On 4/2/08 5:41 PM, "Jon Mason" <[email protected]> wrote:

> On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
>> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1 and
>> look at the cmd line being executed (send it here). It will look like:
>>
>> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>>
>> If the cmd line has --daemonize on it, then the ssh will close and xterm
>> won't work.
>
> [vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh
> vic12 orted --daemonize -mca ess env -mca orte_ess_jobid 2646867968 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs
> 2 --hnp-uri
> "2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.3
> 0.10:39057" --nodename
> vic12 -mca btl openib,self --mca btl_openib_receive_queues
> P,65536,256,128,128 -mca plm_base_verbose 1 -mca
> mca_base_param_file_path
> /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root -mca
> mca_base_param_file_path_force /root]
>
>
> It looks like what you say is happening. Is this configured somewhere, so
> that I can remove it?
>
> Thanks,
> Jon
>
>> Ralph
>>
>> On 4/2/08 3:14 PM, "Jeff Squyres" <[email protected]> wrote:
>>> Can you diagnose a little further:
>>>
>>> 1. in the case where it works, can you verify that the ssh to launch
>>> the orteds is still running?
>>>
>>> 2. in the case where it doesn't work, can you verify that the ssh to
>>> launch the orteds has actually died?
>>>
>>> On Apr 2, 2008, at 4:58 PM, Jon Mason wrote:
>>>> On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
>>>>> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
>>>>>> I remember that someone had found a bug that caused
>>>>>> orte_debug_flag to not get properly set (local var covering over
>>>>>> a global one) - could be that your tmp-public branch doesn't have
>>>>>> that patch in it.
>>>>>>
>>>>>> You might try updating to the latest trunk
>>>>>
>>>>> I updated my ompi-trunk tree, did a clean build, and I still see
>>>>> the same problem. I regressed trunk to rev 17589 and everything
>>>>> works as I expect. So I think the problem is still there in the
>>>>> top of trunk.
>>>>
>>>> I stepped through the revs of trunk and found the first failing rev
>>>> to be 17632. It's a big patch, so I'll defer to those more in the
>>>> know to determine what is breaking in there.
>>>>
>>>>> I don't discount user error, but I don't think I am doing anything
>>>>> different. Did some setting change that perhaps I did not modify?
>>>>>
>>>>> Thanks,
>>>>> Jon
>>>>>
>>>>>> On 4/2/08 10:41 AM, "George Bosilca" <[email protected]> wrote:
>>>>>>> I'm using this feature on the trunk with the version from
>>>>>>> yesterday. It works without problems ...
>>>>>>>
>>>>>>> george.
>>>>>>>
>>>>>>> On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
>>>>>>>> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
>>>>>>>>> Are these r numbers relevant on the /tmp-public branch, or the
>>>>>>>>> trunk?
>>>>>>>>
>>>>>>>> I pulled it out of the command used to update the branch, which
>>>>>>>> was:
>>>>>>>> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
>>>>>>>>
>>>>>>>> In the cpc tmp branch, it happened at r17920.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jon
>>>>>>>>
>>>>>>>>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
>>>>>>>>>> I regressed my tree and it looks like it happened between
>>>>>>>>>> 17590:17917
>>>>>>>>>>
>>>>>>>>>> On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
>>>>>>>>>>> I am noticing that ssh seems to be broken on trunk (and my cpc
>>>>>>>>>>> branch, as it is based on trunk). When I try to use xterm and
>>>>>>>>>>> gdb to debug, I only successfully get 1 xterm. I have tried
>>>>>>>>>>> this on 2 different setups. I can successfully get the xterms
>>>>>>>>>>> on the 1.2 svn branch.
>>>>>>>>>>>
>>>>>>>>>>> I am running the following command:
>>>>>>>>>>> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
>>>>>>>>>>> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
>>>>>>>>>>>
>>>>>>>>>>> Is anyone else seeing this problem?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jon
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> devel mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
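
For anyone repeating the diagnostic discussed in this thread, the check on the "plm:rsh: executing:" line can be automated. The following is only a rough sketch, not part of Open MPI: the function name classify_orted_launch and its return labels are hypothetical, and it assumes the log line has been pasted as a single unwrapped line. It encodes nothing beyond the rule stated above, namely that "--daemonize" on the orted command line means the ssh will close, while "--debug" is what should appear when orte_debug_flag is set.

```python
def classify_orted_launch(log_line: str) -> str:
    """Classify a 'plm:rsh: executing:' line from plm_base_verbose output.

    Returns one of three hypothetical labels:
      'daemonized' : --daemonize present, --debug absent (ssh will close)
      'debug'      : --debug present, --daemonize absent
      'unclear'    : both or neither flag present
    """
    marker = "plm:rsh: executing:"
    _, found, command = log_line.partition(marker)
    if not found:
        raise ValueError("not a 'plm:rsh: executing:' line")

    # Split on whitespace and drop the brackets/parentheses that the
    # verbose output wraps around the command.
    tokens = {tok.strip("[]()") for tok in command.split()}

    has_daemonize = "--daemonize" in tokens
    has_debug = "--debug" in tokens

    if has_daemonize and not has_debug:
        return "daemonized"
    if has_debug and not has_daemonize:
        return "debug"
    return "unclear"


if __name__ == "__main__":
    # Shortened form of the line reported in this thread.
    sample = (
        "[vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) "
        "[/usr/bin/ssh vic12 orted --daemonize -mca ess env]"
    )
    print(classify_orted_launch(sample))
```

Matching on whole tokens rather than substrings keeps options such as --debug-devel from being mistaken for --debug.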
