Yes, this fixes the issue.

Ashley.

On 15 Dec 2011, at 23:42, Nathan Hjelm wrote:

> I have an idea. How about we set the MPIR variables as weak? Just
> tested it with STAT.
>
> Can you replace orte/tools/orterun/orterun.c with the attached version and
> see if it fixes the issue?
>
> -Nathan
>
> On Thu, 15 Dec 2011, Ashley Pittman wrote:
>
>>
>> padb just calls gdb; you can see the error using gdb alone, using just the
>> trace I sent when I started this thread.
>>
>> Perhaps the difference is in versions of gdb. I could give you a login to my
>> test machine if you need?
>>
>> Ashley.
>>
>> On 15 Dec 2011, at 22:49, Nathan Hjelm wrote:
>>
>>> What's odd is that totalview, STAT, and GDB see the correct values despite
>>> them being in the B section. What does padb do differently?
>>>
>>> This is a dynamic, optimized build of 1.5.5rc1.
>>>
>>> -Nathan Hjelm
>>> HPC-3, LANL
>>>
>>> On Thu, 15 Dec 2011, Ashley Pittman wrote:
>>>
>>>>
>>>> If I add a new symbol to orte/mca/debugger/base/debugger_base_open.c and
>>>> declare it in orte/mca/debugger/base/base.h, the same way
>>>> MPIR_proctable_size is defined, then it appears in the .so but not in the
>>>> binary. If I then reference this variable in orte/tools/orterun/orterun.c,
>>>> the symbol appears in orterun. It's definitely coming from that
>>>> declaration; what isn't so clear is how it's getting into the binary. I
>>>> can only assume that orte/mca/debugger/base/debugger_base_fns.c is linked
>>>> into the binary directly and the symbol is optimised away in the case
>>>> where it's defined but not used.
>>>>
>>>> Ashley.
>>>>
>>>> On 15 Dec 2011, at 22:09, Nathan Hjelm wrote:
>>>>
>>>>> orte/tools/orterun/debuggers.c does not exist anymore (it's not in the
>>>>> 1.5.5rc1 tarball). I don't know why the symbols are showing up in section
>>>>> B of orterun. Investigating now.
>>>>>
>>>>> -Nathan Hjelm
>>>>> HPC-3, LANL
>>>>>
>>>>> On Thu, 15 Dec 2011, George Bosilca wrote:
>>>>>
>>>>>>
>>>>>> On Dec 15, 2011, at 16:55 , Ashley Pittman wrote:
>>>>>>
>>>>>>> There is a problem with 1.5.5rc1 that prevents padb from loading the
>>>>>>> process table start from the orterun process. What appears to be
>>>>>>> happening is that MPIR_proctable and MPIR_proctable_size are present in
>>>>>>> both orterun itself and also in libopen-rte.so. The code is correctly
>>>>>>> setting them in libopen-rte.so; however, gdb picks the variables
>>>>>>> from orterun in preference, and hence padb reads NULL values.
>>>>>>
>>>>>> Indeed, there are two definitions, but a single declaration. This is
>>>>>> true for both the trunk and the 1.5.
>>>>>>
>>>>>> ./orte/mca/debugger/base/base.h:61:ORTE_DECLSPEC extern struct MPIR_PROCDESC *MPIR_proctable;
>>>>>> ./orte/mca/debugger/base/base.h:62:ORTE_DECLSPEC extern int MPIR_proctable_size;
>>>>>>
>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:42:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>> ./orte/mca/debugger/base/debugger_base_open.c:43:int MPIR_proctable_size = 0;
>>>>>>
>>>>>> ./orte/tools/orterun/debuggers.c:142:struct MPIR_PROCDESC *MPIR_proctable = NULL;
>>>>>> ./orte/tools/orterun/debuggers.c:143:int MPIR_proctable_size = 0;
>>>>>>
>>>>>> george.
>>>>>>
>>>>>>
>>>>>>> Attached is a log showing the problem. The only change I made to the
>>>>>>> source is to add a call to orte_debugger_base_dump() before the return
>>>>>>> from orte_debugger_base_init_after_spawn(). It looks like this could
>>>>>>> also have been achieved via a debug setting, but I couldn't see how.
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> <orterun.c.gz>
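
For readers following the thread, the "weak" idea Nathan mentions can be sketched roughly as below. This is only an illustration under stated assumptions, not the contents of the attached orterun.c, which the thread does not show. It uses the GCC/Clang `__attribute__((weak))` extension; the struct layout and the helper functions `publish_demo_proctable()` and `proctable_is_published()` are hypothetical stand-ins invented for the example. When object files are linked together, a weak definition gives way to a strong definition of the same name if one exists.

```c
#include <stddef.h>

/* Hypothetical stand-in for the process descriptor; the real layout
 * lives in the Open MPI sources and is not shown in this thread. */
struct MPIR_PROCDESC {
    const char *host_name;
    const char *executable_name;
    int pid;
};

/* Weak definitions (GCC/Clang extension). If another object file in the
 * same link provides a strong definition of these names, the linker uses
 * that one and these are ignored. */
__attribute__((weak)) struct MPIR_PROCDESC *MPIR_proctable = NULL;
__attribute__((weak)) int MPIR_proctable_size = 0;

/* Hypothetical demo data, standing in for the table filled at launch. */
static struct MPIR_PROCDESC demo_table[2] = {
    { "node0", "a.out", 1001 },
    { "node1", "a.out", 1002 },
};

/* Hypothetical helper: publish the table through the MPIR variables. */
void publish_demo_proctable(void)
{
    MPIR_proctable = demo_table;
    MPIR_proctable_size = 2;
}

/* Hypothetical helper: what a tool would effectively check after reading
 * the two variables; returns 1 if a non-empty table is visible. */
int proctable_is_published(void)
{
    return MPIR_proctable != NULL && MPIR_proctable_size > 0;
}
```

Running `nm` on the resulting object should list the two variables with a weak symbol type rather than as ordinary data or BSS symbols, which is one way to check which copy a debugger is likely to resolve.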