Here's what I had to do to load the library correctly (we were only using ORTE, 
so substitute "libmpi" in your case) - this code was called at the beginning of 
our "init" routine:

    /* first, load the required ORTE library */
#if OPAL_WANT_LIBLTDL
    lt_dladvise advise;

    if (lt_dlinit() != 0) {
        fprintf(stderr, "LT_DLINIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

#if OPAL_HAVE_LTDL_ADVISE
    /* open the library into the global namespace */
    if (lt_dladvise_init(&advise)) {
        fprintf(stderr, "LT_DLADVISE INIT FAILED - CANNOT LOAD LIBMRPLUS\n");
        return JNI_FALSE;
    }

    /* ask ltdl to try the platform's standard shared-library extensions */
    if (lt_dladvise_ext(&advise)) {
        fprintf(stderr, "LT_DLADVISE EXT FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    /* make the library's symbols globally visible (like RTLD_GLOBAL), so the
     * components that get dlopen'ed later can resolve them */
    if (lt_dladvise_global(&advise)) {
        fprintf(stderr, "LT_DLADVISE GLOBAL FAILED - CANNOT LOAD LIBMRPLUS\n");
        lt_dladvise_destroy(&advise);
        return JNI_FALSE;
    }

    /* we don't care about the return value
     * on dlopen - it might return an error
     * because the lib is already loaded,
     * depending on the way we were built
     */
    lt_dlopenadvise("libopen-rte", advise);
    lt_dladvise_destroy(&advise);
#else
    fprintf(stderr, "NO LT_DLADVISE - CANNOT LOAD LIBMRPLUS\n");
    /* need to balance the ltdl inits */
    lt_dlexit();
    /* if we don't have advise, then we are hosed */
    return JNI_FALSE;
#endif
#endif
    /* if dlopen was disabled, then all symbols
     * should have been pulled up into the libraries,
     * so we don't need to do anything as the symbols
     * are already available.
     */
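
For completeness: if your wrapper library is built without libltdl support at 
all (so neither branch above applies, but dlopen is still in play), a plain 
dlopen() with RTLD_GLOBAL gives much the same effect. The sketch below is only 
an illustration under that assumption - the helper name and the 
"libopen-rte.so" filename are made up here, and the real suffix/version is 
platform- and install-dependent:

    /* illustrative sketch, not the code above: publish the ORTE library's
     * symbols in the process-global namespace via plain dlopen() */
    #include <dlfcn.h>
    #include <stdio.h>

    int load_orte_global(void)
    {
        /* RTLD_GLOBAL is the important flag: it makes the library's symbols
         * visible to the MCA components that get dlopen'ed afterwards */
        void *handle = dlopen("libopen-rte.so", RTLD_NOW | RTLD_GLOBAL);
        if (NULL == handle) {
            /* as with lt_dlopenadvise above, a failure may simply mean the
             * library is already loaded - report it and let the caller decide */
            fprintf(stderr, "dlopen(libopen-rte) failed: %s\n", dlerror());
            return -1;
        }
        return 0;
    }

Either way, the point is the same: the OPAL/ORTE symbols have to be visible in 
the global namespace before Open MPI starts dlopen'ing its MCA components, or 
you get exactly the "undefined symbol: opal_show_help" errors shown below.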

On Mar 12, 2014, at 6:32 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Check out how we did this with the embedded java bindings in Open MPI; see 
> the comment describing exactly this issue starting here:
> 
>    
> https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/java/c/mpi_MPI.c#L79
> 
> Feel free to compare MPJ to the OMPI java bindings -- they're shipping in 
> 1.7.4 and have a bunch of improvements in the soon-to-be-released 1.7.5, but 
> you must enable them since they aren't enabled by default:
> 
>    ./configure --enable-mpi-java ...
> 
> FWIW, we found a few places in the Java bindings where it was necessary for 
> the bindings to have some insight into the internals of the MPI 
> implementation.  Did you find the same thing with MPJ Express?
> 
> Are your bindings similar in style/signature to ours?
> 
> 
> 
> On Mar 12, 2014, at 6:40 AM, Bibrak Qamar <bibr...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I am writing a new device for MPJ Express that uses a native MPI library for 
>> communication. It's based on JNI wrappers, like the original mpiJava. The 
>> device works fine with MPICH3 (and MVAPICH2.2). Here is the issue I hit when 
>> loading Open MPI 1.7.4 from MPJ Express.
>> 
>> For clarity, and so the error is easy to reproduce, I generated the following 
>> error output from a simple JNI-to-MPI application. I have attached the app for 
>> your consideration.
>> 
>> 
>> [bibrak@localhost JNI_to_MPI]$ mpirun -np 2 java -cp . 
>> -Djava.library.path=/home/bibrak/work/JNI_to_MPI/ simpleJNI_MPI
>> [localhost.localdomain:29086] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so: 
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_mmap.so: 
>> undefined symbol: opal_show_help (ignored)
>> [localhost.localdomain:29085] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so: 
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_posix.so: 
>> undefined symbol: opal_shmem_base_framework (ignored)
>> [localhost.localdomain:29086] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so: 
>> undefined symbol: opal_show_help (ignored)
>> --------------------------------------------------------------------------
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>  opal_shmem_base_select failed
>>  --> Returned value -1 instead of OPAL_SUCCESS
>> --------------------------------------------------------------------------
>> [localhost.localdomain:29085] mca: base: component_find: unable to open 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv: 
>> /home/bibrak/work/installs/OpenMPI_installed/lib/openmpi/mca_shmem_sysv.so: 
>> undefined symbol: opal_show_help (ignored)
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>  opal_init failed
>>  --> Returned value Error (-1) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.  This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>> 
>>  ompi_mpi_init: ompi_rte_init failed
>>  --> Returned "Error" (-1) instead of "Success" (0)
>> --------------------------------------------------------------------------
>> *** An error occurred in MPI_Init
>> *** on a NULL communicator
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> ***    and potentially your MPI job)
>> [localhost.localdomain:29086] Local abort before MPI_INIT completed 
>> successfully; not able to aggregate error messages, and not able to 
>> guarantee that all other processes were killed!
>> --------------------------------------------------------------------------
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>  opal_shmem_base_select failed
>>  --> Returned value -1 instead of OPAL_SUCCESS
>> --------------------------------------------------------------------------
>> -------------------------------------------------------
>> Primary job  terminated normally, but 1 process returned
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>> -------------------------------------------------------
>> --------------------------------------------------------------------------
>> mpirun detected that one or more processes exited with non-zero status, thus 
>> causing
>> the job to be terminated. The first process to do so was:
>> 
>>  Process name: [[41236,1],1]
>>  Exit code:    1
>> --------------------------------------------------------------------------
>> 
>> 
>> This is a thread that I found where the Open MPI developers were having 
>> issues while integrating mpiJava into their stack.
>> 
>> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201202.mbox/%3c5ea543bd-a12e-4729-b66a-c746bc789...@open-mpi.org%3E
>> 
>> I don't have a good understanding of the internals of Open MPI, so my 
>> question is: how should Open MPI be used through JNI wrappers? Any guidelines 
>> in this regard?
>> 
>> Thanks
>> Bibrak
>> 
>> <JNI_to_MPI.tar.gz>
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
