On Mar 14, 2014, at 9:29 AM, Bibrak Qamar <bibr...@gmail.com> wrote:

> It works for Open MPI, but for MPICH3 I have to comment out the dlopen. Is 
> there any way to tell the compiler that if it's using Open MPI (mpicc) it 
> should use dlopen, and otherwise keep it commented out? Or something else?

In Open MPI's mpi.h we define OPEN_MPI.  You can therefore wrap the dlopen 
code in #if defined(OPEN_MPI) and leave it out when compiling with MPICH3.

> Yes, there are some places where we need to be in sync with the internals of 
> the native MPI implementation. These are in section 8.1.2 of MPI 2.1 
> (http://www.mpi-forum.org/docs/mpi-2.1/mpi21-report.pdf), for example 
> MPI_TAG_UB. For the pure Java devices of MPJ Express we have always used 
> Integer.MAX_VALUE.
> 
> Datatypes?
> 
> MPJ Express uses an internal buffering layer to buffer the user data into a 
> ByteBuffer. This way, the native device ends up using the MPI_BYTE datatype 
> most of the time. The ByteBuffer simplifies matters since it is directly 
> accessible from native code.

Does that mean you can't do heterogeneous?  (not really a huge deal, since most 
people don't run heterogeneously, but something to consider)
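
(For my own understanding -- and as a sketch only, since I don't know MPJ 
Express's internal buffering API -- I'm picturing the pattern roughly like 
this in plain java.nio:)

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration: pack a Java int[] into a direct ByteBuffer so a
// native device can hand it to MPI as a contiguous region of MPI_BYTE.
public class PackSketch {
    public static void main(String[] args) {
        int[] userData = {1, 2, 3, 4};

        // Direct buffers live outside the Java heap, so native code can get a
        // stable pointer to them (e.g., via JNI's GetDirectBufferAddress).
        ByteBuffer staging = ByteBuffer.allocateDirect(userData.length * 4)
                                       .order(ByteOrder.nativeOrder());
        staging.asIntBuffer().put(userData);

        // At this point the native device would send 'staging' as
        // userData.length * 4 elements of MPI_BYTE.
        System.out.println("packed " + staging.capacity() + " bytes");
    }
}

If that's roughly the shape, then the direct allocation is what makes the 
staging buffer reachable from the native device without another copy.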

> With our current implementation there is one exception, namely Reduce, 
> Allreduce, and Reduce_scatter, where the native MPI implementation needs to 
> know which Java datatype it's going to process. The same goes for MPI.Op.

And Accumulate and the other Op-using functions, right?

> On "Are your bindings similar in style/signature to ours?"

No, they use the real datatypes.

> I checked, and there are differences. MPJ Express (and FastMPJ as well) 
> implements the mpiJava 1.2 specification. There is also the MPJ API (which 
> is very close to the mpiJava 1.2 API). 
> 
> Example 1: Getting the rank and size of COMM_WORLD
> 
> MPJ Express (the mpiJava 1.2 API):
>  public int Size() throws MPIException;
>  public int Rank() throws MPIException;
> 
> MPJ API:
>  public int size() throws MPJException;
>  public int rank() throws MPJException;
> 
> Open MPI's Java bindings:
>  public final int getRank() throws MPIException;
>  public final int getSize() throws MPIException;

Right -- we *started* with the old ideas, but then made the conscious choice to 
update the Java bindings in a few ways:

- make them more like modern Java conventions (e.g., camel case, use verbs, 
etc.)
- get rid of MPI.OBJECT
- use modern, efficient Java practices
- basically, we didn't want to be bound by any Java decisions that were made 
long ago that aren't necessarily relevant any more
- and to be clear: we couldn't find many existing Java MPI codes, so 
compatibility with them was not a big concern

One thing we didn't do was use bounce buffers for small messages, which shows 
up in your benchmarks.  We're considering adding that optimization, and others.
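
For example, a trivial program in the updated conventions looks roughly like 
this (the usual hello-world shape; see our examples/javadocs for the 
authoritative API):

import mpi.*;

// Minimal sketch of the naming conventions in Open MPI's Java bindings:
// camel-case getters, exceptions via MPIException, no MPI.OBJECT.
public class Hello {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);

        int rank = MPI.COMM_WORLD.getRank();
        int size = MPI.COMM_WORLD.getSize();
        System.out.println("Hello from rank " + rank + " of " + size);

        MPI.Finalize();
    }
}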

> Example 2: Point-to-Point communication
> 
> MPJ Express (the mpiJava 1.2 API):
>  public void Send(Object buf, int offset, int count, Datatype datatype, int 
> dest, int tag) throws MPIException 
> 
>  public Status Recv(Object buf, int offset, int count, Datatype datatype, 
>       int source, int tag) throws MPIException
> 
> MPJ API:
>  public void send(Object buf, int offset, int count, Datatype datatype, int 
> dest, int tag) throws MPJException;
> 
>  public Status recv(Object buf, int offset, int count, Datatype datatype, int 
> source, int tag) throws MPJException
> 
> Open MPI's Java bindings:
>  public final void send(Object buf, int count, Datatype type, int dest, int 
> tag) throws MPIException
> 
>  public final Status recv(Object buf, int count, Datatype type, int source, 
> int tag) throws MPIException
> 
> Example 3: Collective communication
> 
> MPJ Express (the mpiJava 1.2 API):
>  public void Bcast(Object buf, int offset, int count, Datatype type, int root)
>       throws MPIException;
> 
> MPJ API:
>  public void bcast(Object buffer, int offset, int count, Datatype datatype, 
> int root) throws MPJException;
> 
> Open MPI's Java bindings:
>  public final void bcast(Object buf, int count, Datatype type, int root) 
> throws MPIException;
> 
> 
> I couldn't find which API Open MPI's Java bindings implement?

Our own.  :-)

> But while reading your README.JAVA.txt and your code I realised that you are 
> trying to avoid the buffering overhead by giving the user the flexibility to 
> allocate data directly in a ByteBuffer using MPI.new<Type>Buffer, hence not 
> following the mpiJava 1.2 specs (for the communication operations)?

Correct.
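
To make that concrete, here is a rough sketch of the two styles side by side 
(the exact MPI.newIntBuffer signature is from memory, so treat this as 
illustrative rather than authoritative; it also assumes at least two ranks):

import java.nio.IntBuffer;
import mpi.*;

// Sketch: the same send/recv, once from a plain Java array and once from a
// direct buffer obtained via MPI.new<Type>Buffer.  The array version
// typically has to pin or copy the array across JNI; the direct buffer
// version does not.
public class BufferSketch {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.getRank();
        int count = 1024;

        if (rank == 0) {
            // (a) heap array: convenient, but crosses the JNI boundary
            int[] heapBuf = new int[count];
            MPI.COMM_WORLD.send(heapBuf, count, MPI.INT, 1, 0);

            // (b) direct buffer: natively addressable, no per-call array copy
            IntBuffer directBuf = MPI.newIntBuffer(count);
            MPI.COMM_WORLD.send(directBuf, count, MPI.INT, 1, 0);
        } else if (rank == 1) {
            int[] heapBuf = new int[count];
            MPI.COMM_WORLD.recv(heapBuf, count, MPI.INT, 0, 0);

            IntBuffer directBuf = MPI.newIntBuffer(count);
            MPI.COMM_WORLD.recv(directBuf, count, MPI.INT, 0, 0);
        }

        MPI.Finalize();
    }
}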

> On Performance Comparison
> 
> Yes, this is interesting. I have managed to do two kinds of tests: Ping-Pong 
> (Latency and Bandwidth) and Collective Communications (Bcast).
> 
> Attached are the graphs and the programs (test cases) that I used. The tests 
> were done over InfiniBand; more details on the platform are here: 
> http://www.nust.edu.pk/INSTITUTIONS/Centers/RCMS/AboutUs/facilities/screc/Pages/Resources.aspx
> 
> One reason for the Open MPI Java bindings' low performance (in the 
> Bandwidth.png graph) is the way the test case was written 
> (Bandwidth_OpenMPi.java). It allocates a 16 MB byte array and uses the same 
> array in send/recv for each data point (varying only the count). 
> 
> This could be mainly due to the following code in mpi_Comm.c (let me know if 
> I am mistaken):
> 
> static void* getArrayPtr(void** bufBase, JNIEnv *env,
>                          jobject buf, int baseType, int offset)
> {
>     switch(baseType)
>     {
>            ... 
>            ...
>         case 1: {
>             jbyte* els = (*env)->GetByteArrayElements(env, buf, NULL);
>             *bufBase = els;
>             return els + offset;
>         }
>            ... 
>            ...
> }
> 
> The Get<PrimitiveType>ArrayElements routine gets the entire array every 
> time, even if the user only wants to send some of its elements (the count). 
> This might be one reason for Open MPI's Java bindings to advocate the 
> MPI.new<Type>Buffer approach. The other reason is naturally the buffering 
> overhead.

Yes.

There's *always* going to be a penalty to pay if you don't use native buffers, 
just due to the nature of Java garbage collection, etc.

> Based on the above experience, for the bandwidth of the Bcast operation I 
> modified the test case to allocate only as much array as needed for that 
> Bcast, and took the results. For a fairer comparison between MPJ Express and 
> Open MPI's Java bindings I didn't use MPI.new<Type>Buffer.

It would be interesting to see how using the native buffers compares, too -- 
i.e., are we correct in advocating for the use of native buffers?
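
If you want to try it, something along these lines is what I have in mind 
(just a sketch -- I'm assuming MPI.newByteBuffer(capacity) here, and a real 
benchmark would of course add a barrier, warm-up, and many iterations):

import java.nio.ByteBuffer;
import mpi.*;

// Sketch: Bcast from a direct buffer allocated with MPI.new<Type>Buffer,
// i.e., the variant that avoids the JNI array pin/copy entirely.
public class BcastNativeBuffer {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);

        int count = 1 << 20;                       // 1 MB per broadcast (arbitrary)
        ByteBuffer buf = MPI.newByteBuffer(count); // direct, natively addressable

        long start = System.nanoTime();
        MPI.COMM_WORLD.bcast(buf, count, MPI.BYTE, 0);
        long elapsed = System.nanoTime() - start;

        if (MPI.COMM_WORLD.getRank() == 0) {
            System.out.println("bcast of " + count + " bytes took " + elapsed + " ns");
        }

        MPI.Finalize();
    }
}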

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
