Siegmar,
How did you configure Open MPI? Which Java version did you use?
I just found a regression, so for now you have to explicitly add
CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
to your configure command line.
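
For reference, a full configure line with those flags could look like the
sketch below; the install prefix and compilers are only examples based on
the paths in your earlier mails, so adjust them to your environment (I have
also folded in the --enable-debug flag mentioned below):

  ./configure --prefix=/usr/local/openmpi-1.9.0_64_gcc \
              --enable-mpi-java --enable-debug \
              CC=gcc CXX=g++ \
              CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
  make && make install
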
If you want to debug this issue (I cannot reproduce it on a Solaris 11
x86 virtual machine), you can apply the attached patch, make sure you
configure with --enable-debug, and run

OMPI_ATTACH=1 mpiexec -n 1 java InitFinalizeMain

Then you will need to attach gdb to the *java* process, set the _dbg
local variable to zero, and continue.

You should get a clean stack trace, and hopefully we will be able to
help.
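
In case it is useful, the attach step could look roughly like this (the
pgrep pattern and the frame number are only placeholders, and you may
first have to select the thread and frame where JNI_OnLoad is spinning
before _dbg is visible):

  $ OMPI_ATTACH=1 mpiexec -n 1 java InitFinalizeMain &
  $ pgrep -f InitFinalizeMain     # note the pid of the *java* process
  $ gdb -p <java-pid>
  (gdb) frame 1                   # move from poll() up into JNI_OnLoad
  (gdb) set var _dbg = 0          # leave the wait loop added by the patch
  (gdb) continue
  ... once the SIGSEGV is reported ...
  (gdb) bt                        # this should be the clean stack trace
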
Cheers,
Gilles
On 2014/10/24 0:03, Siegmar Gross wrote:
> Hello Oscar,
>
> do you have time to look into my problem? Takahiro probably has a
> point and gdb behaves differently on Solaris and Linux, so that
> the differing outputs have no meaning. I tried to debug my Java
> program, but without success so far, because I wasn't able to get
> into the Java program to set a breakpoint or to see the code. Have
> you succeeded in debugging an mpiJava program? If so, how must I call
> gdb (I normally use "gdb mpiexec" and then "run -np 1 java ...")?
> What can I do to get helpful information to track the error down?
> I have attached the error log file. Perhaps you can see if something
> is going wrong with the Java interface. Thank you very much in advance
> for your help and any hints on using gdb with mpiJava.
> Please let me know if I can provide anything else.
>
>
> Kind regards
>
> Siegmar
>
>
>>> I think that it must have to do with MPI, because everything
>>> works fine on Linux and my Java program works fine with an older
>>> MPI version (openmpi-1.8.2a1r31804) as well.
>> Yes, I also think it must have to do with MPI, but on the java
>> process side, not the mpiexec process side.
>>
>> When you run a Java MPI program via mpiexec, the mpiexec process
>> launches a java process. When the java process (your Java program)
>> calls an MPI method, the native part (written in C/C++) of the MPI
>> library is called. It runs in the java process, not in the mpiexec
>> process. I suspect that part.
>>
>>> On Solaris things are different.
>> Are you referring to the following difference?
>> After this line,
>>> 881 ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_INIT);
>> Linux shows
>>> orte_job_state_to_str (state=1)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/error_strings.c:217
>>> 217 switch(state) {
>> but Solaris shows
>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:122
>>> 122 if (NULL == name) {
>> Each macro is defined as:
>>
>> #define ORTE_ACTIVATE_JOB_STATE(j, s) \
>> do { \
>> orte_job_t *shadow=(j); \
>> opal_output_verbose(1, orte_state_base_framework.framework_output, \
>> "%s ACTIVATE JOB %s STATE %s AT %s:%d", \
>> ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), \
>> (NULL == shadow) ? "NULL" : \
>> ORTE_JOBID_PRINT(shadow->jobid), \
>> orte_job_state_to_str((s)), \
>> __FILE__, __LINE__); \
>> orte_state.activate_job_state(shadow, (s)); \
>> } while(0);
>>
>> #define ORTE_NAME_PRINT(n) \
>> orte_util_print_name_args(n)
>>
>> #define ORTE_JOBID_PRINT(n) \
>> orte_util_print_jobids(n)
>>
>> I'm not sure, but I think gdb on Solaris steps into
>> orte_util_print_name_args, whereas gdb on Linux doesn't step into
>> orte_util_print_name_args and orte_util_print_jobids for some
>> reason, or orte_job_state_to_str is evaluated before them.
>>
>> So I think it's not an important difference.
>>
>> You showed the following lines.
>>>>> orterun (argc=5, argv=0xffffffff7fffe0d8)
>>>>> at
> ../../../../openmpi-dev-124-g91e9686/orte/tools/orterun/orterun.c:1084
>>>>> 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1085 opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb)
>> I'm not familiar with this code, but I think this part (in the mpiexec
>> process) is only waiting for the java process to terminate (normally
>> or abnormally). So I think the problem is not in the mpiexec process
>> but in the java process.
>>
>> Regards,
>> Takahiro
>>
>>> Hi Takahiro,
>>>
>>>> mpiexec and java run as distinct processes. Your JRE message
>>>> says the java process raised SIGSEGV, so you should trace the java
>>>> process, not the mpiexec process. Moreover, your JRE message
>>>> says the crash happened outside the Java Virtual Machine in
>>>> native code, so a normal Java debugger is useless.
>>>> You should trace the native code part of the java process.
>>>> Unfortunately I don't know how to debug that part.
>>> I think that it must have to do with MPI, because everything
>>> works fine on Linux and my Java program works fine with an older
>>> MPI version (openmpi-1.8.2a1r31804) as well.
>>>
>>> linpc1 x 112 mpiexec -np 1 java InitFinalizeMain
>>> Hello!
>>> linpc1 x 113
>>>
>>> Therefore I single-stepped through the program on Linux as well
>>> and found a difference when launching the process. On Linux I get
>>> the following sequence.
>>>
>>> Breakpoint 1, rsh_launch (jdata=0x614aa0)
>>> at
> ../../../../../openmpi-dev-124-g91e9686/orte/mca/plm/rsh/plm_rsh_module.c:876
>>> 876 if (ORTE_FLAG_TEST(jdata, ORTE_JOB_FLAG_RESTART)) {
>>> (gdb) s
>>> 881 ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_INIT);
>>> (gdb) s
>>> orte_job_state_to_str (state=1)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/error_strings.c:217
>>> 217 switch(state) {
>>> (gdb)
>>> 221 return "PENDING INIT";
>>> (gdb)
>>> 317 }
>>> (gdb)
>>> orte_util_print_jobids (job=4294967295)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>> 170 ptr = get_print_name_buffer();
>>> (gdb)
>>>
>>>
>>>
>>> On Solaris things are different.
>>>
>>> Breakpoint 1, rsh_launch (jdata=0x100125250)
>>> at
> ../../../../../openmpi-dev-124-g91e9686/orte/mca/plm/rsh/plm_rsh_module.c:876
>>> 876 if (ORTE_FLAG_TEST(jdata, ORTE_JOB_FLAG_RESTART)) {
>>> (gdb) s
>>> 881 ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_INIT);
>>> (gdb) s
>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:122
>>> 122 if (NULL == name) {
>>> (gdb)
>>> 142 job = orte_util_print_jobids(name->jobid);
>>> (gdb)
>>> orte_util_print_jobids (job=2673410048)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>> 170 ptr = get_print_name_buffer();
>>> (gdb)
>>>
>>>
>>>
>>> Is this normal or is it the reason for the crash on Solaris?
>>>
>>>
>>> Kind regards
>>>
>>> Siegmar
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>> The log file output by JRE may help you.
>>>>> # An error report file with more information is saved as:
>>>>> #
> /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid13080.log
>>>> Regards,
>>>> Takahiro
>>>>
>>>>> Hi,
>>>>>
>>>>> I installed openmpi-dev-124-g91e9686 on Solaris 10 SPARC with
>>>>> gcc-4.9.1 to track down the error with my small Java program.
>>>>> I started single-stepping in orterun.c at line 1081 and
>>>>> continued until I got the segmentation fault. I get
>>>>> "jdata = 0x0" in version openmpi-1.8.2a1r31804, which is the
>>>>> last one that works with Java in my environment, while I get
>>>>> "jdata = 0x100125250" in this version. Unfortunately I don't
>>>>> know which files or variables are important to look at. Perhaps
>>>>> somebody can look at the following lines of code and tell me
>>>>> which information I should provide to solve the problem. I know
>>>>> that Solaris is no longer on your list of supported systems,
>>>>> but perhaps we can get it working again if you tell me what
>>>>> you need and I do the debugging.
>>>>>
>>>>> /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>>>>> GNU gdb (GDB) 7.6.1
>>>>> ...
>>>>> (gdb) run -np 1 java InitFinalizeMain
>>>>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec \
>>>>> -np 1 java InitFinalizeMain
>>>>> [Thread debugging using libthread_db enabled]
>>>>> [New Thread 1 (LWP 1)]
>>>>> [New LWP 2 ]
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> # SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=13064, tid=2
>>>>> ...
>>>>> [LWP 2 exited]
>>>>> [New Thread 2 ]
>>>>> [Switching to Thread 1 (LWP 1)]
>>>>> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be
>>>>> found to satisfy query
>>>>> (gdb) thread 1
>>>>> [Switching to thread 1 (LWP 1 )]
>>>>> #0 0xffffffff7f6173d0 in rtld_db_dlactivity () from
> /usr/lib/sparcv9/ld.so.1
>>>>> (gdb) b orterun.c:1081
>>>>> Breakpoint 1 at 0x1000070dc: file
>>>>> ../../../../openmpi-dev-124-g91e9686/orte/tools/orterun/orterun.c, line
> 1081.
>>>>> (gdb) r
>>>>> The program being debugged has been started already.
>>>>> Start it from the beginning? (y or n) y
>>>>>
>>>>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 1 java
>>>>> InitFinalizeMain
>>>>> [Thread debugging using libthread_db enabled]
>>>>> [New Thread 1 (LWP 1)]
>>>>> [New LWP 2 ]
>>>>> [Switching to Thread 1 (LWP 1)]
>>>>>
>>>>> Breakpoint 1, orterun (argc=5, argv=0xffffffff7fffe0d8)
>>>>> at
> ../../../../openmpi-dev-124-g91e9686/orte/tools/orterun/orterun.c:1081
>>>>> 1081 rc = orte_plm.spawn(jdata);
>>>>> (gdb) print jdata
>>>>> $1 = (orte_job_t *) 0x100125250
>>>>> (gdb) s
>>>>> rsh_launch (jdata=0x100125250)
>>>>> at
>>>>>
> ../../../../../openmpi-dev-124-g91e9686/orte/mca/plm/rsh/plm_rsh_module.c:876
>>>>> 876 if (ORTE_FLAG_TEST(jdata, ORTE_JOB_FLAG_RESTART)) {
>>>>> (gdb) s
>>>>> 881 ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_INIT);
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:122
>>>>> 122 if (NULL == name) {
>>>>> (gdb)
>>>>> 142 job = orte_util_print_jobids(name->jobid);
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=2502885376) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>>>> 170 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd990)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=2502885376) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:172
>>>>> 172 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 178 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 182 if (ORTE_JOBID_INVALID == job) {
>>>>> (gdb)
>>>>> 184 } else if (ORTE_JOBID_WILDCARD == job) {
>>>>> (gdb)
>>>>> 187 tmp1 = ORTE_JOB_FAMILY((unsigned long)job);
>>>>> (gdb)
>>>>> 188 tmp2 = ORTE_LOCAL_JOBID((unsigned long)job);
>>>>> (gdb)
>>>>> 189 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 193 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 194 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:143
>>>>> 143 vpid = orte_util_print_vpids(name->vpid);
>>>>> (gdb)
>>>>> orte_util_print_vpids (vpid=0) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:260
>>>>> 260 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd9a0)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_vpids (vpid=0) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:262
>>>>> 262 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 268 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 272 if (ORTE_VPID_INVALID == vpid) {
>>>>> (gdb)
>>>>> 274 } else if (ORTE_VPID_WILDCARD == vpid) {
>>>>> (gdb)
>>>>> 277 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 281 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 282 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:146
>>>>> 146 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffda60)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:148
>>>>> 148 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 154 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 158 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 162 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 163 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=4294967295) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>>>> 170 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffda60)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=4294967295) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:172
>>>>> 172 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 178 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 182 if (ORTE_JOBID_INVALID == job) {
>>>>> (gdb)
>>>>> 183 snprintf(ptr->buffers[ptr->cntr++],
>>>>> ORTE_PRINT_NAME_ARGS_MAX_SIZE, "[INVALID]");
>>>>> (gdb)
>>>>> 193 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 194 }
>>>>> (gdb)
>>>>> orte_job_state_to_str (state=1) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/error_strings.c:217
>>>>> 217 switch(state) {
>>>>> (gdb)
>>>>> 221 return "PENDING INIT";
>>>>> (gdb)
>>>>> 317 }
>>>>> (gdb)
>>>>> opal_output_verbose (level=1, output_id=0,
>>>>> format=0xffffffff7f14dd98 <orte_job_states>
>>>>> "\336\257\276\355\336\257\276\355")
>>>>> at ../../../openmpi-dev-124-g91e9686/opal/util/output.c:373
>>>>> 373 va_start(arglist, format);
>>>>> (gdb)
>>>>> 369 {
>>>>> (gdb)
>>>>> 370 if (output_id >= 0 && output_id < OPAL_OUTPUT_MAX_STREAMS &&
>>>>> (gdb)
>>>>> 377 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:33
>>>>> 33 opal_list_item_t *itm, *any=NULL, *error=NULL;
>>>>> (gdb)
>>>>> 37 for (itm = opal_list_get_first(&orte_job_states);
>>>>> (gdb)
>>>>> opal_list_get_first (list=0xffffffff7f14dd98 <orte_job_states>)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_list.h:320
>>>>> 320 opal_list_item_t* item =
>>>>> (opal_list_item_t*)list->opal_list_sentinel.opal_list_next;
>>>>> (gdb)
>>>>> 324 assert(1 == item->opal_list_item_refcount);
>>>>> (gdb)
>>>>> 325 assert( list == item->opal_list_item_belong_to );
>>>>> (gdb)
>>>>> 328 return item;
>>>>> (gdb)
>>>>> 329 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:38
>>>>> 38 itm != opal_list_get_end(&orte_job_states);
>>>>> (gdb)
>>>>> opal_list_get_end (list=0xffffffff7f14dd98 <orte_job_states>)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_list.h:399
>>>>> 399 return &(list->opal_list_sentinel);
>>>>> (gdb)
>>>>> 400 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:37
>>>>> 37 for (itm = opal_list_get_first(&orte_job_states);
>>>>> (gdb)
>>>>> 40 s = (orte_state_t*)itm;
>>>>> (gdb)
>>>>> 41 if (s->job_state == ORTE_JOB_STATE_ANY) {
>>>>> (gdb)
>>>>> 45 if (s->job_state == ORTE_JOB_STATE_ERROR) {
>>>>> (gdb)
>>>>> 48 if (s->job_state == state) {
>>>>> (gdb)
>>>>> 49 OPAL_OUTPUT_VERBOSE((1,
>>>>> orte_state_base_framework.framework_output,
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:122
>>>>> 122 if (NULL == name) {
>>>>> (gdb)
>>>>> 142 job = orte_util_print_jobids(name->jobid);
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=2502885376) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>>>> 170 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd880)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=2502885376) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:172
>>>>> 172 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 178 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 182 if (ORTE_JOBID_INVALID == job) {
>>>>> (gdb)
>>>>> 184 } else if (ORTE_JOBID_WILDCARD == job) {
>>>>> (gdb)
>>>>> 187 tmp1 = ORTE_JOB_FAMILY((unsigned long)job);
>>>>> (gdb)
>>>>> 188 tmp2 = ORTE_LOCAL_JOBID((unsigned long)job);
>>>>> (gdb)
>>>>> 189 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 193 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 194 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:143
>>>>> 143 vpid = orte_util_print_vpids(name->vpid);
>>>>> (gdb)
>>>>> orte_util_print_vpids (vpid=0) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:260
>>>>> 260 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd890)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_vpids (vpid=0) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:262
>>>>> 262 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 268 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 272 if (ORTE_VPID_INVALID == vpid) {
>>>>> (gdb)
>>>>> 274 } else if (ORTE_VPID_WILDCARD == vpid) {
>>>>> (gdb)
>>>>> 277 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 281 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 282 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:146
>>>>> 146 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd950)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_name_args (name=0x100118380 <orte_process_info+104>)
>>>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:148
>>>>> 148 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 154 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 158 snprintf(ptr->buffers[ptr->cntr++],
>>>>> (gdb)
>>>>> 162 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 163 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=4294967295) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:170
>>>>> 170 ptr = get_print_name_buffer();
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:92
>>>>> 92 if (!fns_init) {
>>>>> (gdb)
>>>>> 101 ret = opal_tsd_getspecific(print_args_tsd_key,
> (void**)&ptr);
>>>>> (gdb)
>>>>> opal_tsd_getspecific (key=1, valuep=0xffffffff7fffd950)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/threads/tsd.h:163
>>>>> 163 *valuep = pthread_getspecific(key);
>>>>> (gdb)
>>>>> 164 return OPAL_SUCCESS;
>>>>> (gdb)
>>>>> 165 }
>>>>> (gdb)
>>>>> get_print_name_buffer () at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:102
>>>>> 102 if (OPAL_SUCCESS != ret) return NULL;
>>>>> (gdb)
>>>>> 104 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 113 return (orte_print_args_buffers_t*) ptr;
>>>>> (gdb)
>>>>> 114 }
>>>>> (gdb)
>>>>> orte_util_print_jobids (job=4294967295) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:172
>>>>> 172 if (NULL == ptr) {
>>>>> (gdb)
>>>>> 178 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
>>>>> (gdb)
>>>>> 182 if (ORTE_JOBID_INVALID == job) {
>>>>> (gdb)
>>>>> 183 snprintf(ptr->buffers[ptr->cntr++],
>>>>> ORTE_PRINT_NAME_ARGS_MAX_SIZE, "[INVALID]");
>>>>> (gdb)
>>>>> 193 return ptr->buffers[ptr->cntr-1];
>>>>> (gdb)
>>>>> 194 }
>>>>> (gdb)
>>>>> orte_job_state_to_str (state=1) at
>>>>> ../../openmpi-dev-124-g91e9686/orte/util/error_strings.c:217
>>>>> 217 switch(state) {
>>>>> (gdb)
>>>>> 221 return "PENDING INIT";
>>>>> (gdb)
>>>>> 317 }
>>>>> (gdb)
>>>>> opal_output_verbose (level=1, output_id=-1, format=0x1 <Address 0x1 out
> of
>>>>> bounds>)
>>>>> at ../../../openmpi-dev-124-g91e9686/opal/util/output.c:373
>>>>> 373 va_start(arglist, format);
>>>>> (gdb)
>>>>> 369 {
>>>>> (gdb)
>>>>> 370 if (output_id >= 0 && output_id < OPAL_OUTPUT_MAX_STREAMS &&
>>>>> (gdb)
>>>>> 377 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:54
>>>>> 54 if (NULL == s->cbfunc) {
>>>>> (gdb)
>>>>> 62 caddy = OBJ_NEW(orte_state_caddy_t);
>>>>> (gdb)
>>>>> opal_obj_new_debug (type=0xffffffff7f14c7d8 <orte_state_caddy_t_class>,
>>>>> file=0xffffffff7f034c08
>>>>>
> "../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c",
>>>>> line=62) at
> ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:249
>>>>> 249 opal_object_t* object = opal_obj_new(type);
>>>>> (gdb)
>>>>> opal_obj_new (cls=0xffffffff7f14c7d8 <orte_state_caddy_t_class>)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:465
>>>>> 465 assert(cls->cls_sizeof >= sizeof(opal_object_t));
>>>>> (gdb)
>>>>> 470 object = (opal_object_t *) malloc(cls->cls_sizeof);
>>>>> (gdb)
>>>>> 472 if (0 == cls->cls_initialized) {
>>>>> (gdb)
>>>>> 473 opal_class_initialize(cls);
>>>>> (gdb)
>>>>> opal_class_initialize (cls=0xffffffff7f14c7d8
> <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:79
>>>>> 79 assert(cls);
>>>>> (gdb)
>>>>> 84 if (1 == cls->cls_initialized) {
>>>>> (gdb)
>>>>> 87 opal_atomic_lock(&class_lock);
>>>>> (gdb)
>>>>> opal_atomic_lock (lock=0xffffffff7ee89bf0 <class_lock>)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:397
>>>>> 397 while( !opal_atomic_cmpset_acq_32( &(lock->u.lock),
>>>>> (gdb)
>>>>> opal_atomic_cmpset_acq_32 (addr=0xffffffff7ee89bf0 <class_lock>,
> oldval=0,
>>>>> newval=1)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:107
>>>>> 107 rc = opal_atomic_cmpset_32(addr, oldval, newval);
>>>>> (gdb)
>>>>> opal_atomic_cmpset_32 (addr=0xffffffff7ee89bf0 <class_lock>, oldval=0,
> newval=1)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:93
>>>>> 93 int32_t ret = newval;
>>>>> (gdb)
>>>>> 95 __asm__ __volatile__("casa [%1] " ASI_P ", %2, %0"
>>>>> (gdb)
>>>>> 98 return (ret == oldval);
>>>>> (gdb)
>>>>> 99 }
>>>>> (gdb)
>>>>> opal_atomic_cmpset_acq_32 (addr=0xffffffff7ee89bf0 <class_lock>,
> oldval=0,
>>>>> newval=1)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:108
>>>>> 108 opal_atomic_rmb();
>>>>> (gdb)
>>>>> opal_atomic_rmb () at
>>>>> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:63
>>>>> 63 MEMBAR("#LoadLoad");
>>>>> (gdb)
>>>>> 64 }
>>>>> (gdb)
>>>>> opal_atomic_cmpset_acq_32 (addr=0xffffffff7ee89bf0 <class_lock>,
> oldval=0,
>>>>> newval=1)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:110
>>>>> 110 return rc;
>>>>> (gdb)
>>>>> 111 }
>>>>> (gdb)
>>>>> opal_atomic_lock (lock=0xffffffff7ee89bf0 <class_lock>)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:403
>>>>> 403 }
>>>>> (gdb)
>>>>> opal_class_initialize (cls=0xffffffff7f14c7d8
> <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:93
>>>>> 93 if (1 == cls->cls_initialized) {
>>>>> (gdb)
>>>>> 103 cls->cls_depth = 0;
>>>>> (gdb)
>>>>> 104 cls_construct_array_count = 0;
>>>>> (gdb)
>>>>> 105 cls_destruct_array_count = 0;
>>>>> (gdb)
>>>>> 106 for (c = cls; c; c = c->cls_parent) {
>>>>> (gdb)
>>>>> 107 if( NULL != c->cls_construct ) {
>>>>> (gdb)
>>>>> 108 cls_construct_array_count++;
>>>>> (gdb)
>>>>> 110 if( NULL != c->cls_destruct ) {
>>>>> (gdb)
>>>>> 111 cls_destruct_array_count++;
>>>>> (gdb)
>>>>> 113 cls->cls_depth++;
>>>>> (gdb)
>>>>> 106 for (c = cls; c; c = c->cls_parent) {
>>>>> (gdb)
>>>>> 107 if( NULL != c->cls_construct ) {
>>>>> (gdb)
>>>>> 110 if( NULL != c->cls_destruct ) {
>>>>> (gdb)
>>>>> 113 cls->cls_depth++;
>>>>> (gdb)
>>>>> 106 for (c = cls; c; c = c->cls_parent) {
>>>>> (gdb)
>>>>> 122 (void
> (**)(opal_object_t*))malloc((cls_construct_array_count +
>>>>> (gdb)
>>>>> 123
> cls_destruct_array_count + 2)
>>>>> *
>>>>> (gdb)
>>>>> 122 (void
> (**)(opal_object_t*))malloc((cls_construct_array_count +
>>>>> (gdb)
>>>>> 121 cls->cls_construct_array =
>>>>> (gdb)
>>>>> 125 if (NULL == cls->cls_construct_array) {
>>>>> (gdb)
>>>>> 130 cls->cls_construct_array + cls_construct_array_count +
> 1;
>>>>> (gdb)
>>>>> 129 cls->cls_destruct_array =
>>>>> (gdb)
>>>>> 136 cls_construct_array = cls->cls_construct_array +
>>>>> cls_construct_array_count;
>>>>> (gdb)
>>>>> 137 cls_destruct_array = cls->cls_destruct_array;
>>>>> (gdb)
>>>>> 139 c = cls;
>>>>> (gdb)
>>>>> 140 *cls_construct_array = NULL; /* end marker for the
> constructors */
>>>>> (gdb)
>>>>> 141 for (i = 0; i < cls->cls_depth; i++) {
>>>>> (gdb)
>>>>> 142 if( NULL != c->cls_construct ) {
>>>>> (gdb)
>>>>> 143 --cls_construct_array;
>>>>> (gdb)
>>>>> 144 *cls_construct_array = c->cls_construct;
>>>>> (gdb)
>>>>> 146 if( NULL != c->cls_destruct ) {
>>>>> (gdb)
>>>>> 147 *cls_destruct_array = c->cls_destruct;
>>>>> (gdb)
>>>>> 148 cls_destruct_array++;
>>>>> (gdb)
>>>>> 150 c = c->cls_parent;
>>>>> (gdb)
>>>>> 141 for (i = 0; i < cls->cls_depth; i++) {
>>>>> (gdb)
>>>>> 142 if( NULL != c->cls_construct ) {
>>>>> (gdb)
>>>>> 146 if( NULL != c->cls_destruct ) {
>>>>> (gdb)
>>>>> 150 c = c->cls_parent;
>>>>> (gdb)
>>>>> 141 for (i = 0; i < cls->cls_depth; i++) {
>>>>> (gdb)
>>>>> 152 *cls_destruct_array = NULL; /* end marker for the
> destructors */
>>>>> (gdb)
>>>>> 154 cls->cls_initialized = 1;
>>>>> (gdb)
>>>>> 155 save_class(cls);
>>>>> (gdb)
>>>>> save_class (cls=0xffffffff7f14c7d8 <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:188
>>>>> 188 if (num_classes >= max_classes) {
>>>>> (gdb)
>>>>> 189 expand_array();
>>>>> (gdb)
>>>>> expand_array () at
> ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:201
>>>>> 201 max_classes += increment;
>>>>> (gdb)
>>>>> 202 classes = (void**)realloc(classes, sizeof(opal_class_t*) *
>>>>> max_classes);
>>>>> (gdb)
>>>>> 203 if (NULL == classes) {
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 208 classes[i] = NULL;
>>>>> (gdb)
>>>>> 207 for (i = num_classes; i < max_classes; ++i) {
>>>>> (gdb)
>>>>> 210 }
>>>>> (gdb)
>>>>> save_class (cls=0xffffffff7f14c7d8 <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:192
>>>>> 192 classes[num_classes] = cls->cls_construct_array;
>>>>> (gdb)
>>>>> 193 ++num_classes;
>>>>> (gdb)
>>>>> 194 }
>>>>> (gdb)
>>>>> opal_class_initialize (cls=0xffffffff7f14c7d8
> <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:159
>>>>> 159 opal_atomic_unlock(&class_lock);
>>>>> (gdb)
>>>>> opal_atomic_unlock (lock=0xffffffff7ee89bf0 <class_lock>)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:409
>>>>> 409 opal_atomic_wmb();
>>>>> (gdb)
>>>>> opal_atomic_wmb () at
>>>>> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:69
>>>>> 69 MEMBAR("#StoreStore");
>>>>> (gdb)
>>>>> 70 }
>>>>> (gdb)
>>>>> opal_atomic_unlock (lock=0xffffffff7ee89bf0 <class_lock>)
>>>>> at
> ../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:410
>>>>> 410 lock->u.lock=OPAL_ATOMIC_UNLOCKED;
>>>>> (gdb)
>>>>> 411 }
>>>>> (gdb)
>>>>> opal_class_initialize (cls=0xffffffff7f14c7d8
> <orte_state_caddy_t_class>)
>>>>> at ../../openmpi-dev-124-g91e9686/opal/class/opal_object.c:160
>>>>> 160 }
>>>>> (gdb)
>>>>> opal_obj_new (cls=0xffffffff7f14c7d8 <orte_state_caddy_t_class>)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:475
>>>>> 475 if (NULL != object) {
>>>>> (gdb)
>>>>> 476 object->obj_class = cls;
>>>>> (gdb)
>>>>> 477 object->obj_reference_count = 1;
>>>>> (gdb)
>>>>> 478 opal_obj_run_constructors(object);
>>>>> (gdb)
>>>>> opal_obj_run_constructors (object=0x1001bfcf0)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:420
>>>>> 420 assert(NULL != object->obj_class);
>>>>> (gdb)
>>>>> 422 cls_construct = object->obj_class->cls_construct_array;
>>>>> (gdb)
>>>>> 423 while( NULL != *cls_construct ) {
>>>>> (gdb)
>>>>> 424 (*cls_construct)(object);
>>>>> (gdb)
>>>>> orte_state_caddy_construct (caddy=0x1001bfcf0)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_frame.c:84
>>>>> 84 memset(&caddy->ev, 0, sizeof(opal_event_t));
>>>>> (gdb)
>>>>> 85 caddy->jdata = NULL;
>>>>> (gdb)
>>>>> 86 }
>>>>> (gdb)
>>>>> opal_obj_run_constructors (object=0x1001bfcf0)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:425
>>>>> 425 cls_construct++;
>>>>> (gdb)
>>>>> 423 while( NULL != *cls_construct ) {
>>>>> (gdb)
>>>>> 427 }
>>>>> (gdb)
>>>>> opal_obj_new (cls=0xffffffff7f14c7d8 <orte_state_caddy_t_class>)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:480
>>>>> 480 return object;
>>>>> (gdb)
>>>>> 481 }
>>>>> (gdb)
>>>>> opal_obj_new_debug (type=0xffffffff7f14c7d8 <orte_state_caddy_t_class>,
>>>>> file=0xffffffff7f034c08
>>>>>
> "../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c",
>>>>> line=62) at
> ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:250
>>>>> 250 object->obj_magic_id = OPAL_OBJ_MAGIC_ID;
>>>>> (gdb)
>>>>> 251 object->cls_init_file_name = file;
>>>>> (gdb)
>>>>> 252 object->cls_init_lineno = line;
>>>>> (gdb)
>>>>> 253 return object;
>>>>> (gdb)
>>>>> 254 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:63
>>>>> 63 if (NULL != jdata) {
>>>>> (gdb)
>>>>> 64 caddy->jdata = jdata;
>>>>> (gdb)
>>>>> 65 caddy->job_state = state;
>>>>> (gdb)
>>>>> 66 OBJ_RETAIN(jdata);
>>>>> (gdb)
>>>>> opal_obj_update (inc=1, object=0x100125250)
>>>>> at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:497
>>>>> 497 return opal_atomic_add_32(&(object->obj_reference_count),
> inc);
>>>>> (gdb)
>>>>> opal_atomic_add_32 (addr=0x100125260, delta=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:63
>>>>> 63 oldval = *addr;
>>>>> (gdb)
>>>>> 64 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval +
> delta));
>>>>> (gdb)
>>>>> opal_atomic_cmpset_32 (addr=0x100125260, oldval=1, newval=2)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/opal/include/opal/sys/sparcv9/atomic.h:93
>>>>> 93 int32_t ret = newval;
>>>>> (gdb)
>>>>> 95 __asm__ __volatile__("casa [%1] " ASI_P ", %2, %0"
>>>>> (gdb)
>>>>> 98 return (ret == oldval);
>>>>> (gdb)
>>>>> 99 }
>>>>> (gdb)
>>>>> opal_atomic_add_32 (addr=0x100125260, delta=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/opal/include/opal/sys/atomic_impl.h:65
>>>>> 65 return (oldval + delta);
>>>>> (gdb)
>>>>> 66 }
>>>>> (gdb)
>>>>> orte_state_base_activate_job_state (jdata=0x100125250, state=1)
>>>>> at
>>>>>
> ../../../../openmpi-dev-124-g91e9686/orte/mca/state/base/state_base_fns.c:66
>>>>> 66 OBJ_RETAIN(jdata);
>>>>> (gdb)
>>>>> 68 opal_event_set(orte_event_base, &caddy->ev, -1,
>>>>> OPAL_EV_WRITE, s->cbfunc, caddy);
>>>>> (gdb)
>>>>> 69 opal_event_set_priority(&caddy->ev, s->priority);
>>>>> (gdb)
>>>>> 70 opal_event_active(&caddy->ev, OPAL_EV_WRITE, 1);
>>>>> (gdb)
>>>>> 71 return;
>>>>> (gdb)
>>>>> 105 }
>>>>> (gdb)
>>>>> rsh_launch (jdata=0x100125250)
>>>>> at
>>>>>
> ../../../../../openmpi-dev-124-g91e9686/orte/mca/plm/rsh/plm_rsh_module.c:883
>>>>> 883 return ORTE_SUCCESS;
>>>>> (gdb)
>>>>> 884 }
>>>>> (gdb)
>>>>> orterun (argc=5, argv=0xffffffff7fffe0d8)
>>>>> at
> ../../../../openmpi-dev-124-g91e9686/orte/tools/orterun/orterun.c:1084
>>>>> 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1085 opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb)
>>>>> 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1085 opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb)
>>>>> 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1085 opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb)
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> # SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=13080, tid=2
>>>>> #
>>>>> # JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build
> 1.8.0-b132)
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode
> solaris-sparc
>>>>> compressed oops)
>>>>> # Problematic frame:
>>>>> # 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1085 opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb)
>>>>> C [libc.so.1+0x3c7f0] strlen+0x50
>>>>> #
>>>>> # Failed to write core dump. Core dumps have been disabled. To enable
> core
>>>>> dumping, try "ulimit -c unlimited" before starting Java again
>>>>> #
>>>>> # An error report file with more information is saved as:
>>>>> #
> /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid13080.log
>>>>> #
>>>>> # If you would like to submit a bug report, please visit:
>>>>> # http://bugreport.sun.com/bugreport/crash.jsp
>>>>> # The crash happened outside the Java Virtual Machine in native code.
>>>>> # See problematic frame for where to report the bug.
>>>>> #
>>>>>
> --------------------------------------------------------------------------
>>>>> mpiexec noticed that process rank 0 with PID 0 on node tyr exited on
> signal 6
>>>>> (Abort).
>>>>>
> --------------------------------------------------------------------------
>>>>> 1084 while (orte_event_base_active) {
>>>>> (gdb)
>>>>> 1089 orte_odls.kill_local_procs(NULL);
>>>>> (gdb)
>>>>>
>>>>>
>>>>> Thank you very much for any help in advance.
>>>>>
>>>>> Kind regards
>>>>>
>>>>> Siegmar
diff --git a/ompi/mpi/java/c/mpi_MPI.c b/ompi/mpi/java/c/mpi_MPI.c
index 7c3a3ba..219da6e 100644
--- a/ompi/mpi/java/c/mpi_MPI.c
+++ b/ompi/mpi/java/c/mpi_MPI.c
@@ -62,6 +62,7 @@
 #include <sys/stat.h>
 #endif
 #include <dlfcn.h>
+#include <poll.h>
 
 #include "opal/util/output.h"
 #include "opal/datatype/opal_convertor.h"
@@ -121,6 +122,11 @@ OBJ_CLASS_INSTANCE(ompi_java_buffer_t,
  */
 jint JNI_OnLoad(JavaVM *vm, void *reserved)
 {
+    char *env = getenv("OMPI_ATTACH");
+    if (NULL != env && 0 < atoi(env)) {
+        volatile int _dbg = 1;
+        while (_dbg) poll(NULL, 0, 1);
+    }
     libmpi = dlopen("libmpi." OPAL_DYN_LIB_SUFFIX, RTLD_NOW | RTLD_GLOBAL);
 
     if(libmpi == NULL)