I’m afraid we’ll have to get someone from the Forum to interpret (Howard is a 
member as well), but here is what I see just below that, in the description 
section:

The type signature associated with sendcounts[j], sendtype at process i must be 
equal to the type signature associated with recvcounts[i], recvtype at process 
j. This implies that the amount of data sent must be equal to the amount of 
data received, pairwise between every pair of processes


> On Apr 7, 2015, at 9:56 AM, Hamidreza Anvari <hr.anv...@gmail.com> wrote:
> 
> Hello,
> 
> Thanks for your description.
> I'm currently doing an allToAll() prior to the allToAllV() to communicate the 
> lengths of the expected messages.
> 
> But I still strongly believe that a correct implementation of this method 
> should behave as I originally expected.
> If you check the MPI specification here:
> 
> http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf
> Page 170
> Line 14
> 
> It is mentioned that "... the number of elements that CAN be received ...", 
> which implies that the actual received message may be shorter.
> 
> By contrast, where an exact match is mandatory, the modal "MUST" is used; for 
> example, page 171, line 1 says "... sendtype at process i MUST be equal to the 
> type signature ...".
> 
> So I would expect any conforming implementation of the MPI specification to 
> handle this message-length matching by itself, as I asked originally.
> 
> Thanks,
> -- HR
> 
> On Tue, Apr 7, 2015 at 6:03 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> Hi HR,
> 
> Sorry for not noticing the receive side earlier, but as Ralph implied earlier
> in this thread, the MPI standard has stricter type matching for collectives
> than for point-to-point.  Namely, the number of bytes a receiver expects
> to receive from a given sender in the alltoallv must match the number of bytes
> sent by that sender.
> 
> You were just getting lucky with the older Open MPI.  The error message
> isn't so great, though.  It's likely that in the newer Open MPI you are using
> a collective algorithm for alltoallv that assumes your app is obeying the
> standard.
> 
> You are correct that if the ranks don't know in advance how much data each
> rank will send them, you will need some mechanism for exchanging that
> information prior to the alltoallv op.
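> 
> For illustration, here is a minimal sketch of such an exchange with the Java
> bindings, reusing the variable names from HR's snippet further down in this
> thread (the allToAll()/getSize() signatures are assumed from the 1.8 Java API,
> so please double-check them against the javadoc; MPIException handling omitted):
> 
> // Hedged sketch, not HR's actual code: first tell each peer how many MPI.INT
> // elements it will be sent, then size the receive side from those counts so
> // the pairwise type signatures match.
> int nprocs = MPI.COMM_WORLD.getSize();
> int[] recvCounts = new int[nprocs];
> MPI.COMM_WORLD.allToAll(subpartition_size, 1, MPI.INT, recvCounts, 1, MPI.INT);
> 
> // Build displacements and the exact receive-buffer size from the counts.
> int[] recvOffsets = new int[nprocs];
> int total = 0;
> for (int j = 0; j < nprocs; j++) {
>     recvOffsets[j] = total;
>     total += recvCounts[j];
> }
> 
> int[] data2 = new int[total];
> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset, MPI.INT,
>                          data2, recvCounts, recvOffsets, MPI.INT);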
> 
> Howard
> 
> 
> 2015-04-06 23:23 GMT-06:00 Hamidreza Anvari <hr.anv...@gmail.com>:
> Hello,
> 
> If I set the size2 values according to your suggestion, i.e. the same values 
> as on the sending ranks, it works fine.
> But by the definition they do not need to be exactly the same as the lengths 
> of the sent data; they are just the maximum lengths of the data expected to be 
> received. Otherwise it is unavoidable to run an allToAll() first to 
> communicate the data sizes and then do the main allToAllV(), which is an 
> expensive, unnecessary communication overhead.
> 
> I just created a reproducer in C++ which gives the error under Open MPI 1.8.4 
> but runs correctly under Open MPI 1.5.4.
> (I've not included the Java version of this reproducer, since the C++ version 
> is enough to reproduce the error; in any case, it is straightforward to 
> convert this code to Java.)
> 
> Thanks,
> -- HR
> 
> On Mon, Apr 6, 2015 at 3:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
> That would imply that the issue is in the underlying C implementation in 
> OMPI, not the Java bindings. The reproducer would definitely help pin it down.
> 
> If you change the size2 values to the ones we sent you, does the program by 
> chance work?
> 
> 
>> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari <hr.anv...@gmail.com> wrote:
>> 
>> I'll try that as well.
>> Meanwhile, I found that my C++ code runs fine on a machine running 
>> Open MPI 1.5.4, but I get the same error under Open MPI 1.8.4 for both 
>> Java and C++.
>> 
>> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard <hpprit...@gmail.com> wrote:
>> Hello HR,
>> 
>> Thanks!  If you have Java 1.7 installed on your system, would you mind 
>> testing against that version too?
>> 
>> Thanks,
>> 
>> Howard
>> 
>> 
>> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari <hr.anv...@gmail.com>:
>> Hello,
>> 
>> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>> 
>> 2. I have used the following configuration to build Open MPI:
>> 
>> ./configure --enable-mpi-java \
>>   --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands" \
>>   --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers" \
>>   --prefix="/users/hamidreza/openmpi-1.8.4"
>> 
>> make all install
>> 
>> 3. From a logical point of view, size2 is the maximum amount of data expected 
>> to be received; the data actually received might be less than this maximum.
>> 
>> 4. I will try to prepare a working reproducer of my error and send it to you.
>> 
>> Thanks,
>> -- HR
>> 
>> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> I’ve talked to the folks who wrote the Java bindings. One possibility we 
>> identified is that there may be an error in your code when you did the 
>> translation:
>> 
>>> My immediate thought is that each process cannot receive more elements than 
>>> were sent to it. That is the reason for the truncation error.
>>> 
>>> These are the correct values:
>>> 
>>> rank 0 - size2: 2,2,1,1
>>> rank 1 - size2: 1,1,1,1
>>> rank 2 - size2: 0,1,1,2
>>> rank 3 - size2: 2,1,2,1
>> 
>> Can you check your code to see if perhaps the values you are passing didn’t 
>> get translated correctly from your C++ version to the Java version?
>> 
>> 
>> 
>>> On Apr 6, 2015, at 5:03 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
>>> 
>>> Hello HR,
>>> 
>>> It would also be useful to know which Java version you are using, as well
>>> as the configure options used when building Open MPI.
>>> 
>>> Thanks,
>>> 
>>> Howard
>>> 
>>> 
>>> 
>>> 2015-04-05 19:10 GMT-06:00 Ralph Castain <r...@open-mpi.org>:
>>> If it's not too much trouble, can you extract just the alltoallv portion and 
>>> provide us with a small reproducer?
>>> 
>>> 
>>>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari <hr.anv...@gmail.com> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I am converting an existing MPI program from C++ to Java using Open MPI 1.8.4.
>>>> At some point I have an allToAllv() call which works fine in C++ but 
>>>> produces an error in the Java version:
>>>> 
>>>> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset, MPI.INT,
>>>>     data2, subpartition_size2, subpartition_offset2, MPI.INT);
>>>> 
>>>> Error:
>>>> *** An error occurred in MPI_Alltoallv
>>>> *** reported by process [3621322753,9223372036854775811]
>>>> *** on communicator MPI_COMM_WORLD
>>>> *** MPI_ERR_TRUNCATE: message truncated
>>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>> ***    and potentially your MPI job)
>>>> 3 more processes have sent help message help-mpi-errors.txt / 
>>>> mpi_errors_are_fatal
>>>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error 
>>>> messages
>>>> 
>>>> Here are the values for parameters:
>>>> 
>>>> data.length = 5
>>>> data2.length = 20
>>>> 
>>>> ---------- Rank 0 of 4 ----------
>>>> subpartition_offset:0,2,3,3,
>>>> subpartition_size:2,1,0,2,
>>>> subpartition_offset2:0,5,10,15,
>>>> subpartition_size2:5,5,5,5,
>>>> ----------
>>>> ---------- Rank 1 of 4 ----------
>>>> subpartition_offset:0,2,3,4,
>>>> subpartition_size:2,1,1,1,
>>>> subpartition_offset2:0,5,10,15,
>>>> subpartition_size2:5,5,5,5,
>>>> ----------
>>>> ---------- Rank 2 of 4 ----------
>>>> subpartition_offset:0,1,2,3,
>>>> subpartition_size:1,1,1,2,
>>>> subpartition_offset2:0,5,10,15,
>>>> subpartition_size2:5,5,5,5,
>>>> ----------
>>>> ---------- Rank 3 of 4 ----------
>>>> subpartition_offset:0,1,2,4,
>>>> subpartition_size:1,1,2,1,
>>>> subpartition_offset2:0,5,10,15,
>>>> subpartition_size2:5,5,5,5,
>>>> ----------
>>>> 
>>>> Again, this is code which works in the C++ version.
>>>> 
>>>> Any help or advice is greatly appreciated.
>>>> 
>>>> Thanks,
>>>> -- HR