Hi, today I tested some small Java programs with openmpi-dev-178-ga16c1e4. One of them throws an ArrayIndexOutOfBoundsException. The program worked fine with older Open MPI versions, e.g., openmpi-1.8.2a1r31804.
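My guess is that the receive side copies more elements into the Java buffer than it can hold. The following plain-Java sketch (hypothetical class and method names, not taken from MsgSendRecvMain.java or the Open MPI bindings) only illustrates that failure mode, which would be consistent with the traces below:

```java
// Plain-Java sketch (hypothetical names): copying more elements than a
// fixed-size receive buffer holds raises ArrayIndexOutOfBoundsException,
// similar to what mpi.Comm.recv reports in the traces below.
public class RecvBoundsDemo {

    // Copy "count" received chars into a fixed-size user buffer.
    static void copyIntoBuffer(char[] buffer, char[] received, int count) {
        for (int i = 0; i < count; i++) {
            buffer[i] = received[i];  // out of bounds when count > buffer.length
        }
    }

    public static void main(String[] args) {
        char[] buffer  = new char[26];
        char[] message = "tyr.informatik.hs-fulda.de".toCharArray();

        copyIntoBuffer(buffer, message, message.length);  // 26 chars, fits
        System.out.println("copy of 26 chars succeeded");

        try {
            // A bogus element count, as a broken native layer might report:
            copyIntoBuffer(buffer, message, 40);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException for count 40");
        }
    }
}
```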
tyr java 138 mpiexec -np 2 java MsgSendRecvMain
Now 1 process sends its greetings.
Greetings from process 1:
  message tag:    3
  message length: 26
  message:        tyr.informatik.hs-fulda.de??????????????????????????????????????????????????????????????????????????????? ??????????????????????????????????????????????????????????????????????????????
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...

The exception also occurs on my Linux box.

linpc1 java 102 mpijavac MsgSendRecvMain.java
linpc1 java 103 mpiexec -np 2 java MsgSendRecvMain
Now 1 process sends its greetings.
Greetings from process 1:
  message tag:    3
  message length: 6
  message:        linpc1?????%???%?????%?f?%?%???$??????????
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...

tyr java 139 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
...
(gdb) run -np 2 java MsgSendRecvMain
Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 2 java MsgSendRecvMain
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP 2]
Now 1 process sends its greetings.
Greetings from process 1:
  message tag:    3
  message length: 26
  message:        tyr.informatik.hs-fulda.de
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has
been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

  Process name: [[61564,1],1]
  Exit code:    1
--------------------------------------------------------------------------
[LWP 2 exited]
[New Thread 2]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7ec87ca0 in vm_close () from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#9  0xffffffff7ec85274 in lt_dlclose () from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#10 0xffffffff7ecaa5dc in ri_destructor (obj=0x100187b70)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:382
#11 0xffffffff7eca8fd8 in opal_obj_run_destructors (object=0x100187b70)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/class/opal_object.h:446
#12 0xffffffff7eca9eac in mca_base_component_repository_release (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:240
#13 0xffffffff7ecac17c in mca_base_component_unload (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7ecac210 in mca_base_component_close (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7ecac2e4 in mca_base_components_close (output_id=-1,
    components=0xffffffff7f14bc58 <orte_oob_base_framework+80>, skip=0x0)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7ecac24c in mca_base_framework_components_close (
    framework=0xffffffff7f14bc08 <orte_oob_base_framework>, skip=0x0)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:66
#17 0xffffffff7efcaf80 in orte_oob_base_close ()
    at ../../../../openmpi-dev-178-ga16c1e4/orte/mca/oob/base/oob_base_frame.c:112
#18 0xffffffff7ecc0d74 in mca_base_framework_close (
    framework=0xffffffff7f14bc08 <orte_oob_base_framework>)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be07858 in rte_finalize ()
    at ../../../../../openmpi-dev-178-ga16c1e4/orte/mca/ess/hnp/ess_hnp_module.c:857
#20 0xffffffff7ef338bc in orte_finalize ()
    at ../../openmpi-dev-178-ga16c1e4/orte/runtime/orte_finalize.c:66
#21 0x000000010000723c in orterun (argc=5, argv=0xffffffff7fffe0d8)
    at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/orterun.c:1103
#22 0x0000000100003e80 in main (argc=5, argv=0xffffffff7fffe0d8)
    at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/main.c:13
(gdb)

Hopefully the problem has nothing to do with my program. I would be
grateful if somebody (Oscar?) could fix it. Thank you very much for
any help in advance.

Kind regards

Siegmar
MsgSendRecvMain.java