Why do the symbols need to be weak?  Remember that not all platforms support 
weak symbols.

The symbols don't need to be in the executable itself, right?  It should be 
fine for them to be in a library (e.g., libopen-rte.so/a).
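
For reference, here is a minimal sketch of what marking the variables weak 
might look like with GCC or Clang.  The EXAMPLE_WEAK macro and 
example_publish_proctable() are hypothetical names used only for 
illustration, the fallback to an empty macro is an assumption about how a 
platform without weak symbols could be handled, and this is not necessarily 
what the attached orterun.c does.

```c
#include <stddef.h>

/* Process descriptor as laid out by the MPIR debugger interface. */
struct MPIR_PROCDESC {
    char *host_name;
    char *executable_name;
    int   pid;
};

/*
 * Hypothetical helper macro: request weak linkage where the compiler
 * supports the attribute, and fall back to an ordinary definition
 * elsewhere (an assumption, since not all platforms have weak symbols).
 */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_WEAK __attribute__((weak))
#else
#define EXAMPLE_WEAK
#endif

/* Weak definitions: a strong definition elsewhere would take precedence. */
EXAMPLE_WEAK struct MPIR_PROCDESC *MPIR_proctable = NULL;
EXAMPLE_WEAK int MPIR_proctable_size = 0;

/* Hypothetical helper: publish a process table for a debugger to read. */
void example_publish_proctable(struct MPIR_PROCDESC *table, int count)
{
    MPIR_proctable = table;
    MPIR_proctable_size = count;
}
```

With a single definition living in the library, a debugger that resolves 
MPIR_proctable would have only one copy to find.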


On Dec 16, 2011, at 4:51 AM, Ashley Pittman wrote:

> 
> Yes, this fixes the issue.
> 
> Ashley.
> 
> On 15 Dec 2011, at 23:42, Nathan Hjelm wrote:
> 
>> I have an idea. How about we mark the MPIR variables as weak? I just 
>> tested it with STAT.
>> 
>> Can you replace orte/tools/orterun/orterun.c with the attached version and 
>> see if it fixes the issue?
>> 
>> -Nathan
>> 
>> On Thu, 15 Dec 2011, Ashley Pittman wrote:
>> 
>>> 
>>> padb just calls gdb; you can see the error with gdb alone, using just the 
>>> trace I sent when I started this thread.
>>> 
>>> Perhaps the difference is in the versions of gdb. I could give you a login 
>>> to my test machine if you need one.
>>> 
>>> Ashley.
>>> 
>>> On 15 Dec 2011, at 22:49, Nathan Hjelm wrote:
>>> 
>>>> What's odd is that TotalView, STAT, and GDB see the correct values despite 
>>>> them being in the B section. What does padb do differently?
>>>> 
>>>> This is a dynamic, optimized build of 1.5.5rc1.
>>>> 
>>>> -Nathan Hjelm
>>>> HPC-3, LANL
>>>> 
>>>> On Thu, 15 Dec 2011, Ashley Pittman wrote:
>>>> 
>>>>> 
>>>>> If I add a new symbol to orte/mca/debugger/base/debugger_base_open.c and 
>>>>> declare it in orte/mca/debugger/base/base.h, the same way 
>>>>> MPIR_proctable_size is defined, then it appears in the .so but not in the 
>>>>> binary. If I then reference this variable in 
>>>>> orte/tools/orterun/orterun.c, the symbol appears in orterun. It's 
>>>>> definitely coming from that declaration; what isn't so clear is how it's 
>>>>> getting into the binary. I can only assume that 
>>>>> orte/mca/debugger/base/debugger_base_fns.c is linked into the binary 
>>>>> directly and the symbol is optimised away in the case where it's defined 
>>>>> but not used.
>>>>> 
>>>>> Ashley.
>>>>> 
>>>>> On 15 Dec 2011, at 22:09, Nathan Hjelm wrote:
>>>>> 
>>>>>> orte/tools/orterun/debuggers.c does not exist anymore (it's not in the 
>>>>>> 1.5.5rc1 tarball). I don't know why the symbols are showing up in 
>>>>>> section B of orterun. Investigating now.
>>>>>> 
>>>>>> -Nathan Hjelm
>>>>>> HPC-3, LANL
>>>>>> 
>>>>>> On Thu, 15 Dec 2011, George Bosilca wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Dec 15, 2011, at 16:55 , Ashley Pittman wrote:
>>>>>>> 
>>>>>>>> There is a problem with 1.5.5rc1 that prevents padb from loading the 
>>>>>>>> process table from the orterun process. What appears to be happening 
>>>>>>>> is that MPIR_proctable and MPIR_proctable_size are present in both 
>>>>>>>> orterun itself and libopen-rte.so. The code correctly sets them in 
>>>>>>>> libopen-rte.so, but gdb picks the variables from orterun in 
>>>>>>>> preference, and hence padb reads NULL values.
>>>>>>> 
>>>>>>> Indeed, there are two definitions, but a single declaration. This is 
>>>>>>> true for both the trunk and the 1.5.
>>>>>>> 
>>>>>>> ./orte/mca/debugger/base/base.h:61:ORTE_DECLSPEC extern struct MPIR_PROCDESC *MPIR_proctable;
>>>>>>> ./orte/mca/debugger/base/base.h:62:ORTE_DECLSPEC extern int MPIR_proctable_size;
>>>>>>> 
>>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:42:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:43:int MPIR_proctable_size = 0;
>>>>>>> 
>>>>>>> ./orte/tools/orterun/debuggers.c:142:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>>> ./orte/tools/orterun/debuggers.c:143:int MPIR_proctable_size = 0;
>>>>>>> 
>>>>>>> george.
>>>>>>> 
>>>>>>> 
>>>>>>>> Attached is a log showing the problem. The only change I made to the 
>>>>>>>> source is to add a call to orte_debugger_base_dump() before the return 
>>>>>>>> from orte_debugger_base_init_after_spawn(). It looks like this could 
>>>>>>>> also have been achieved via a debug setting, but I couldn't see how.
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> de...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>> 
>> <orterun.c.gz>
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

