Hello Siegmar and Gilles,
I made a reply where Gilles suggested, but figured I would leave a note here in case the other was missed.

-Nathan

--
Nathaniel Graham
HPC-DES
Los Alamos National Laboratory

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Sent: Monday, August 29, 2016 6:16 AM
To: Open MPI Users
Subject: Re: [OMPI users] problem with exceptions in Java interface

Hi Siegmar,

I will review PR 1698 and wait for some more feedback from the developers; they might have different views than mine.

Assuming PR 1698 does what you expect, it still does not catch all user errors. For example, if you MPI_Send a buffer that is too short, the exception might be thrown at any time. In the worst case, it will occur in the progress thread, outside of any MPI call, which means it cannot be "converted" into an MPIException.

FWIW, we have a way to check buffers, but it requires that
1. Open MPI is configure'd with --enable-memchecker, and
2. the MPI tasks are run under valgrind.
IIRC, valgrind will issue an error message if the buffer is invalid, and the app will crash afterwards (i.e. the MPI subroutine will not return an error code the end user can "trap").

Such checks might be easier to make in Java, and the resulting errors might more easily be made "trappable", but as far as I am concerned,
1. this has a runtime overhead, and
2. this is a new development.

Let's follow up at https://github.com/open-mpi/ompi/issues/1698 from now on.

Cheers,

Gilles

On Monday, August 29, 2016, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

Hi Gilles,

isn't it possible to pass all exceptions from the Java interface to the calling method? I can live with the current handling of exceptions as well, although some exceptions can be handled within my program and some will break my program even if I want to handle the exceptions myself.
I understood PR 1698 to mean that all exceptions can be processed in the user program if the user chooses MPI.ERRORS_RETURN (otherwise this change request would not have been necessary). Nevertheless, if you decide that things stay as they are, I am happy with your decision as well.

Kind regards

Siegmar

Am 29.08.2016 um 10:30 schrieb Gilles Gouaillardet:

Siegmar and all,

I am puzzled by this error.

On the one hand, it is caused by an invalid buffer (e.g. the buffer size is 1, but the user claims the size is 2), so I am fine with the current behavior (i.e. java.lang.ArrayIndexOutOfBoundsException is thrown).
/* if this were a C program, it would very likely SIGSEGV, since Open MPI does not catch this kind of error when checking params */

On the other hand, Open MPI could be enhanced to check the buffer size and throw an MPIException in this case. As far as I am concerned, this is a feature request and not a bug.

Thoughts, anyone?

Cheers,

Gilles

On 8/29/2016 3:48 PM, Siegmar Gross wrote:

Hi,

I have installed v1.10.3-31-g35ba6a1, openmpi-v2.0.0-233-gb5f0a4f, and openmpi-dev-4691-g277c319 on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.14 beta and gcc-6.1.0. In May I had reported a problem with Java exceptions (PR 1698), which had been solved in June (PR 1803).

https://github.com/open-mpi/ompi/issues/1698
https://github.com/open-mpi/ompi/pull/1803

Unfortunately the problem still exists, or exists once more, in all three branches.

loki fd1026 112 ompi_info | grep -e "Open MPI repo revision" -e "C compiler absolute"
  Open MPI repo revision: dev-4691-g277c319
  C compiler absolute: /opt/solstudio12.5b/bin/cc
loki fd1026 112 mpijavac Exception_2_Main.java
warning: [path] bad path element "/usr/local/openmpi-master_64_cc/lib64/shmem.jar": no such file or directory
1 warning
loki fd1026 113 mpiexec -np 1 java Exception_2_Main
Set error handler for MPI.COMM_WORLD to MPI.ERRORS_RETURN.
Call "bcast" with index out-of bounds.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.bcast(Native Method)
        at mpi.Comm.bcast(Comm.java:1252)
        at Exception_2_Main.main(Exception_2_Main.java:22)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

  Process name: [[58548,1],0]
  Exit code:    1
--------------------------------------------------------------------------
loki fd1026 114 exit

loki fd1026 116 ompi_info | grep -e "Open MPI repo revision" -e "C compiler absolute"
  Open MPI repo revision: v2.0.0-233-gb5f0a4f
  C compiler absolute: /opt/solstudio12.5b/bin/cc
loki fd1026 117 mpijavac Exception_2_Main.java
warning: [path] bad path element "/usr/local/openmpi-2.0.1_64_cc/lib64/shmem.jar": no such file or directory
1 warning
loki fd1026 118 mpiexec -np 1 java Exception_2_Main
Set error handler for MPI.COMM_WORLD to MPI.ERRORS_RETURN.
Call "bcast" with index out-of bounds.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.bcast(Native Method)
        at mpi.Comm.bcast(Comm.java:1252)
        at Exception_2_Main.main(Exception_2_Main.java:22)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status,
thus causing the job to be terminated.
The first process to do so was:

  Process name: [[58485,1],0]
  Exit code:    1
--------------------------------------------------------------------------
loki fd1026 119 exit

loki fd1026 107 ompi_info | grep -e "Open MPI repo revision" -e "C compiler absolute"
  Open MPI repo revision: v1.10.3-31-g35ba6a1
  C compiler absolute: /opt/solstudio12.5b/bin/cc
loki fd1026 107 mpijavac Exception_2_Main.java
loki fd1026 108 mpiexec -np 1 java Exception_2_Main
Set error handler for MPI.COMM_WORLD to MPI.ERRORS_RETURN.
Call "bcast" with index out-of bounds.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.bcast(Native Method)
        at mpi.Comm.bcast(Comm.java:1231)
        at Exception_2_Main.main(Exception_2_Main.java:22)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

  Process name: [[34400,1],0]
  Exit code:    1
--------------------------------------------------------------------------
loki fd1026 109 exit

I would be grateful if somebody could fix the problem. Thank you very much for any help in advance.

Kind regards

Siegmar

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
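The behavior under discussion can be illustrated without MPI at all. The following plain-Java sketch (an illustration only, not Open MPI code; the class, method names, and the TrappableException stand-in for mpi.MPIException are all hypothetical) contrasts the current situation, where an out-of-range count surfaces from native code as an unchecked java.lang.ArrayIndexOutOfBoundsException that bypasses MPI.ERRORS_RETURN, with the enhancement Gilles describes, where a Java-side length check would convert the same user error into a checked exception the caller can trap:

```java
// Hypothetical sketch: contrasts an unchecked out-of-bounds failure with a
// pre-checked, trappable one. Not Open MPI code.
public class BufferCheckSketch {

    // Stand-in for mpi.MPIException (hypothetical, for illustration only).
    static class TrappableException extends Exception {
        TrappableException(String msg) { super(msg); }
    }

    // Current behavior: reading `count` elements without validation, as the
    // native bcast effectively does, throws an unchecked exception when the
    // buffer is shorter than the caller claims.
    static int readUnchecked(int[] buf, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf[i];  // may throw ArrayIndexOutOfBoundsException
        }
        return sum;
    }

    // Proposed behavior: validate the buffer length up front and throw a
    // checked exception the caller can handle, analogous to getting an
    // MPIException back under MPI.ERRORS_RETURN.
    static int readChecked(int[] buf, int count) throws TrappableException {
        if (count < 0 || count > buf.length) {
            throw new TrappableException(
                "invalid count " + count + " for buffer of length " + buf.length);
        }
        return readUnchecked(buf, count);
    }

    public static void main(String[] args) {
        int[] buf = new int[1];  // buffer of size 1; caller claims size 2

        try {
            readUnchecked(buf, 2);
        } catch (ArrayIndexOutOfBoundsException e) {
            // Unchecked: the caller only catches this by guessing the
            // runtime exception type; an error handler never sees it.
            System.out.println("unchecked: " + e.getClass().getName());
        }

        try {
            readChecked(buf, 2);
        } catch (TrappableException e) {
            // Checked: the error is delivered through a defined channel.
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

The runtime cost Gilles mentions is visible in the sketch: the checked variant pays one extra bounds comparison per call, on every call, whether or not the buffer is actually invalid.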