Yes, this fixes the issue.

Ashley.

On 15 Dec 2011, at 23:42, Nathan Hjelm wrote:

> I have an idea. How about we mark the MPIR variables as weak? I just
> tested it with STAT.
> 
> Can you replace orte/tools/orterun/orterun.c with the attached version and 
> see if it fixes the issue?
> 
> -Nathan
> 
> On Thu, 15 Dec 2011, Ashley Pittman wrote:
> 
>> 
>> padb just calls gdb; you can see the error with gdb alone, using just the
>> trace I sent when I started this thread.
>>
>> Perhaps the difference is in the gdb version. I could give you a login to my
>> test machine if you need one.
>> 
>> Ashley.
>> 
>> On 15 Dec 2011, at 22:49, Nathan Hjelm wrote:
>> 
>>> What's odd is that TotalView, STAT, and GDB see the correct values despite
>>> them being in the B section. What does padb do differently?
>>> 
>>> This is a dynamic, optimized build of 1.5.5rc1.
>>> 
>>> -Nathan Hjelm
>>> HPC-3, LANL
>>> 
>>> On Thu, 15 Dec 2011, Ashley Pittman wrote:
>>> 
>>>> 
>>>> If I add a new symbol to orte/mca/debugger/base/debugger_base_open.c and
>>>> declare it in orte/mca/debugger/base/base.h, the same way
>>>> MPIR_proctable_size is defined, then it appears in the .so but not in the
>>>> binary. If I then reference this variable in orte/tools/orterun/orterun.c,
>>>> the symbol appears in orterun. It's definitely coming from that
>>>> declaration; what isn't so clear is how it's getting into the binary. I
>>>> can only assume that orte/mca/debugger/base/debugger_base_fns.c is linked
>>>> into the binary directly and the symbol is optimised away in the case
>>>> where it's defined but not used.
>>>> 
>>>> Ashley.
>>>> 
>>>> On 15 Dec 2011, at 22:09, Nathan Hjelm wrote:
>>>> 
>>>>> orte/tools/orterun/debuggers.c does not exist anymore (it's not in the
>>>>> 1.5.5rc1 tarball). I don't know why the symbols are showing up in section
>>>>> B of orterun. Investigating now.
>>>>> 
>>>>> -Nathan Hjelm
>>>>> HPC-3, LANL
>>>>> 
>>>>> On Thu, 15 Dec 2011, George Bosilca wrote:
>>>>> 
>>>>>> 
>>>>>> On Dec 15, 2011, at 16:55 , Ashley Pittman wrote:
>>>>>> 
>>>>>>> There is a problem with 1.5.5rc1 that prevents padb from loading the
>>>>>>> process table from the orterun process. What appears to be
>>>>>>> happening is that MPIR_proctable and MPIR_proctable_size are present in
>>>>>>> both orterun itself and libopen-rte.so. The code is correctly
>>>>>>> setting them in libopen-rte.so; however, gdb is picking the variables
>>>>>>> from orterun in preference, and hence padb is reading NULL values.
>>>>>> 
>>>>>> Indeed, there are two definitions, but a single declaration. This is 
>>>>>> true for both the trunk and the 1.5.
>>>>>> 
>>>>>> ./orte/mca/debugger/base/base.h:61:ORTE_DECLSPEC extern struct MPIR_PROCDESC *MPIR_proctable;
>>>>>> ./orte/mca/debugger/base/base.h:62:ORTE_DECLSPEC extern int MPIR_proctable_size;
>>>>>>
>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:42:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:43:int MPIR_proctable_size = 0;
>>>>>>
>>>>>> ./orte/tools/orterun/debuggers.c:142:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>> ./orte/tools/orterun/debuggers.c:143:int MPIR_proctable_size = 0;
>>>>>> 
>>>>>> george.
>>>>>> 
>>>>>> 
>>>>>>> Attached is a log showing the problem. The only change I made to the
>>>>>>> source was to add a call to orte_debugger_base_dump() before the return
>>>>>>> from orte_debugger_base_init_after_spawn(). It looks like this could
>>>>>>> also have been achieved via a debug setting, but I couldn't see how.
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>> 
> <orterun.c.gz>
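
For readers following the thread, the "weak" idea suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of the attached orterun.c, which the thread does not show. It assumes GCC or Clang (the `weak` attribute is a compiler extension), uses a stand-in struct in place of Open MPI's own header, and `demo_publish` is a hypothetical helper that exists only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the MPIR process descriptor, with the fields described by
 * the MPIR process acquisition interface. Open MPI's real header is not
 * used here. */
struct MPIR_PROCDESC {
    char *host_name;
    char *executable_name;
    int pid;
};

/* Weak definitions (GCC/Clang extension). If another object file in the
 * same static link supplies a strong definition of the same name, the
 * linker keeps the strong one and does not report a duplicate symbol. */
__attribute__((weak)) struct MPIR_PROCDESC *MPIR_proctable = NULL;
__attribute__((weak)) int MPIR_proctable_size = 0;

/* Hypothetical helper, for illustration only: publish a process table the
 * way a launcher would before a debugger reads the two variables. */
int demo_publish(struct MPIR_PROCDESC *table, int count)
{
    assert(table != NULL && count > 0);
    MPIR_proctable = table;
    MPIR_proctable_size = count;
    return MPIR_proctable_size;
}
```

How a weak definition in the executable interacts with the strong one in libopen-rte.so at run time depends on the toolchain and dynamic loader, so a sketch like this would need checking against a real build, for example by inspecting both files with nm.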

