Hi,

There is an error in the program.
First you declare a 256-char buffer (BUF_SIZE = 256).

Then, when line 96 is executed:
    buffer = (MPI.getProcessorName()).toCharArray();

After that, 'buffer' refers to a new array whose length is less than 256.
So when the next iteration reaches line 92, you can get an exception, because you ask to receive up to 256 chars, but the buffer is now shorter than 256:

status = MPI.COMM_WORLD.recv (buffer, BUF_SIZE, MPI.CHAR, 0, MPI.ANY_TAG);
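
In other words, the failing sequence is roughly the following (a minimal sketch based on my assumptions about the surrounding loop; only the two quoted statements are taken from your program):

    char buffer[] = new char[BUF_SIZE];                  // BUF_SIZE = 256
    ...
    // line 92: recv needs room for up to BUF_SIZE chars in 'buffer'
    status = MPI.COMM_WORLD.recv(buffer, BUF_SIZE, MPI.CHAR, 0, MPI.ANY_TAG);
    ...
    // line 96: 'buffer' now refers to a new array of length name.length (< 256)
    buffer = (MPI.getProcessorName()).toCharArray();
    // next iteration: recv asks for BUF_SIZE chars, but buffer.length < BUF_SIZE
    // -> ArrayIndexOutOfBoundsException in mpi.Comm.recv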

I think this program only worked before by coincidence.
Lines 96-98 should be replaced with:

    char name[] = MPI.getProcessorName().toCharArray();
    System.out.printf("message length: %d  message: %s\n",
                      name.length, new String(name));
    MPI.COMM_WORLD.send(name, name.length, MPI.CHAR, 0, MSGTAG);

This way you no longer lose the 256-char buffer.

Also, the call to
    String.valueOf(buffer)
should be replaced by
    String.valueOf(buffer, 0, num)
because the buffer content from position 'num' onward can be garbage (as the '?' characters in the output below suggest).
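
On the receiving side that could look like this (a sketch; I am assuming that 'num' is taken from the Status object via getCount(MPI.CHAR), which may not be how your program actually obtains it):

    status = MPI.COMM_WORLD.recv(buffer, BUF_SIZE, MPI.CHAR, 0, MPI.ANY_TAG);
    int num = status.getCount(MPI.CHAR);   // number of chars actually received
    System.out.println("message: " + String.valueOf(buffer, 0, num));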

Regards,
Oscar

On 29/10/14 16:16, Siegmar Gross wrote:
Hi,

today I tested some small Java programs with openmpi-dev-178-ga16c1e4.
One program throws an ArrayIndexOutOfBoundsException. The program
worked fine with older MPI versions, e.g., openmpi-1.8.2a1r31804.


tyr java 138 mpiexec -np 2 java MsgSendRecvMain

Now 1 process sends its greetings.

Greetings from process 1:
   message tag:    3
   message length: 26
   message:
tyr.informatik.hs-fulda.de???????????????????????????????????????????????????????????????????????????????
??????????????????????????????????????????????????????????????????????????????

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
         at mpi.Comm.recv(Native Method)
         at mpi.Comm.recv(Comm.java:391)
         at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...



The exception also happens on my Linux box.

linpc1 java 102 mpijavac MsgSendRecvMain.java
linpc1 java 103 mpiexec -np 2 java MsgSendRecvMain

Now 1 process sends its greetings.

Greetings from process 1:
   message tag:    3
   message length: 6
   message:        linpc1?????%???%?????%?f?%?%???$??????????

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
         at mpi.Comm.recv(Native Method)
         at mpi.Comm.recv(Comm.java:391)
         at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...



tyr java 139 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
...
(gdb) run -np 2 java MsgSendRecvMain
Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 2 java 
MsgSendRecvMain
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]

Now 1 process sends its greetings.

Greetings from process 1:
   message tag:    3
   message length: 26
   message:        tyr.informatik.hs-fulda.de

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
         at mpi.Comm.recv(Native Method)
         at mpi.Comm.recv(Comm.java:391)
         at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus 
causing
the job to be terminated. The first process to do so was:

   Process name: [[61564,1],1]
   Exit code:    1
--------------------------------------------------------------------------
[LWP    2         exited]
[New Thread 2        ]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy 
query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7ec87ca0 in vm_close ()
    from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#9  0xffffffff7ec85274 in lt_dlclose ()
    from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#10 0xffffffff7ecaa5dc in ri_destructor (obj=0x100187b70)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:382
#11 0xffffffff7eca8fd8 in opal_obj_run_destructors (object=0x100187b70)
     at ../../../../openmpi-dev-178-ga16c1e4/opal/class/opal_object.h:446
#12 0xffffffff7eca9eac in mca_base_component_repository_release (
     component=0xffffffff7b1236f0 <mca_oob_tcp_component>)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:240
#13 0xffffffff7ecac17c in mca_base_component_unload (
     component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7ecac210 in mca_base_component_close (
     component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7ecac2e4 in mca_base_components_close (output_id=-1,
     components=0xffffffff7f14bc58 <orte_oob_base_framework+80>, skip=0x0)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7ecac24c in mca_base_framework_components_close (
     framework=0xffffffff7f14bc08 <orte_oob_base_framework>, skip=0x0)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:66
#17 0xffffffff7efcaf80 in orte_oob_base_close ()
     at 
../../../../openmpi-dev-178-ga16c1e4/orte/mca/oob/base/oob_base_frame.c:112
#18 0xffffffff7ecc0d74 in mca_base_framework_close (
     framework=0xffffffff7f14bc08 <orte_oob_base_framework>)
     at 
../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be07858 in rte_finalize ()
     at 
../../../../../openmpi-dev-178-ga16c1e4/orte/mca/ess/hnp/ess_hnp_module.c:857
#20 0xffffffff7ef338bc in orte_finalize ()
     at ../../openmpi-dev-178-ga16c1e4/orte/runtime/orte_finalize.c:66
#21 0x000000010000723c in orterun (argc=5, argv=0xffffffff7fffe0d8)
     at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/orterun.c:1103
#22 0x0000000100003e80 in main (argc=5, argv=0xffffffff7fffe0d8)
     at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/main.c:13
(gdb)


Hopefully the problem has nothing to do with my program.
I would be grateful if somebody (Oscar?) could fix the
problem. Thank you very much in advance for any help.


Kind regards

Siegmar

