Hmmm...something isn't making sense. Can I see the command line you used to
generate this?

I'll tell you why I'm puzzled. If orte_debug_flag is set, then the
"--daemonize" should NOT be there, and you should see "--debug" on that
command line. What I see is the reverse, which implies to me that
orte_debug_flag is NOT being set to "true".
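
For anyone who wants to run that check mechanically, here is a minimal Python sketch (not part of Open MPI; the function name check_orted_flags is made up for illustration). It assumes the diagnostic line has the "plm:rsh: executing:" form shown in this thread and simply reports whether "--daemonize" or "--debug" appears among the tokens:

```python
import shlex

def check_orted_flags(log_line):
    """Report whether --daemonize / --debug appear on a plm:rsh command line."""
    # Everything after "executing:" is the launch command.
    _, _, command = log_line.partition("executing:")
    # Split on whitespace (honouring quotes), then drop the brackets and
    # parentheses that the diagnostic output wraps around the command.
    tokens = [tok.strip("[]()") for tok in shlex.split(command)]
    return {
        "daemonize": "--daemonize" in tokens,
        "debug": "--debug" in tokens,
    }

# Shortened, illustrative line modelled on the output quoted in this thread.
sample = ('[vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) '
          '[/usr/bin/ssh vic12 orted --daemonize -mca ess env]')
print(check_orted_flags(sample))
```

If "daemonize" comes back true and "debug" false for a run that was started with -d, the flag is not reaching the launcher.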

When I tested here and on odin, though, I found that the -d option correctly
set the flag and everything works just fine.

So there is something in your environment or setup that is messing up that
orte_debug_flag. I have no idea what it could be - the command line should
override anything in your environment, but you could check. Otherwise, if
this diagnostic output came from a command line that included -d or
--debug-devel, or had OMPI_MCA_orte_debug=1 in the environment, then I am at
a loss - everywhere I've tried it, it works fine.
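
As a quick way to check the environment, a small sketch along these lines lists every MCA parameter set through an OMPI_MCA_ variable. The helper name mca_env_settings is hypothetical, and this only covers the environment, not parameter files:

```python
import os

def mca_env_settings(environ=None, prefix="OMPI_MCA_"):
    """Return the environment variables that set MCA parameters."""
    # Default to the real process environment; a dict can be passed for testing.
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items()
            if name.startswith(prefix)}

# Print whatever is set in the shell that will run mpirun.
for name, value in sorted(mca_env_settings().items()):
    print(f"{name}={value}")
```

Run it on the node where mpirun is started; anything mentioning orte_debug would be worth a second look.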

Ralph



On 4/2/08 5:41 PM, "Jon Mason" <[email protected]> wrote:

> On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
>> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1 and
>> look at the cmd line being executed (send it here). It will look like:
>> 
>> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>> 
>> If the cmd line has --daemonize on it, then the ssh will close and xterm
>> won't work.
> 
> [vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh
> vic12 orted --daemonize -mca ess env -mca orte_ess_jobid 2646867968
> -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> "2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.30.10:39057"
> --nodename vic12 -mca btl openib,self --mca btl_openib_receive_queues
> P,65536,256,128,128 -mca plm_base_verbose 1 -mca mca_base_param_file_path
> /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root
> -mca mca_base_param_file_path_force /root]
> 
> 
> It looks like what you say is happening.  Is this configured somewhere, so
> that I can remove it?
> 
> Thanks,
> Jon
> 
>> Ralph
>> 
>> On 4/2/08 3:14 PM, "Jeff Squyres" <[email protected]> wrote:
>>> Can you diagnose a little further:
>>> 
>>> 1. in the case where it works, can you verify that the ssh to launch
>>> the orteds is still running?
>>> 
>>> 2. in the case where it doesn't work, can you verify that the ssh to
>>> launch the orteds has actually died?
>>> 
>>> On Apr 2, 2008, at 4:58 PM, Jon Mason wrote:
>>>> On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
>>>>> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
>>>>>> I remember that someone had found a bug that caused
>>>>>> orte_debug_flag to not
>>>>>> get properly set (local var covering over a global one) - could be
>>>>>> that
>>>>>> your tmp-public branch doesn't have that patch in it.
>>>>>> 
>>>>>> You might try updating to the latest trunk
>>>>> 
>>>>> I updated my ompi-trunk tree, did a clean build, and I still see
>>>>> the same
>>>>> problem.  I regressed trunk to rev 17589 and everything works as I
>>>>> expect.
>>>>> So I think the problem is still there in the top of trunk.
>>>> 
>>>> I stepped through the revs of trunk and found the first failing rev
>>>> to be
>>>> 17632.  It's a big patch, so I'll defer to those more in the know to
>>>> determine
>>>> what is breaking in there.
>>>> 
>>>>> I don't discount user error, but I don't think I am doing anything
>>>>> different.
>>>>> Did some setting change that perhaps I did not modify?
>>>>> 
>>>>> Thanks,
>>>>> Jon
>>>>> 
>>>>>> On 4/2/08 10:41 AM, "George Bosilca" <[email protected]> wrote:
>>>>>>> I'm using this feature on the trunk with the version from
>>>>>>> yesterday.
>>>>>>> It works without problems ...
>>>>>>> 
>>>>>>>   george.
>>>>>>> 
>>>>>>> On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
>>>>>>>> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
>>>>>>>>> Are these r numbers relevant on the /tmp-public branch, or the
>>>>>>>>> trunk?
>>>>>>>> 
>>>>>>>> I pulled it out of the command used to update the branch, which
>>>>>>>> was:
>>>>>>>> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
>>>>>>>> 
>>>>>>>> In the cpc tmp branch, it happened at r17920.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Jon
>>>>>>>> 
>>>>>>>>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
>>>>>>>>>> I regressed my tree and it looks like it happened between
>>>>>>>>>> 17590:17917
>>>>>>>>>> 
>>>>>>>>>> On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
>>>>>>>>>>> I am noticing that ssh seems to be broken on trunk (and my cpc
>>>>>>>>>>> branch, as it is based on trunk).  When I try to use xterm and
>>>>>>>>>>> gdb to debug, I only successfully get 1 xterm.  I have tried
>>>>>>>>>>> this on 2 different setups.  I can successfully get the xterms
>>>>>>>>>>> on the 1.2 svn branch.
>>>>>>>>>>> 
>>>>>>>>>>> I am running the following command:
>>>>>>>>>>> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
>>>>>>>>>>> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
>>>>>>>>>>> 
>>>>>>>>>>> Is anyone else seeing this problem?
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jon
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> devel mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

