I can’t speak to the packing question, but I can say that we have indeed 
confirmed the lack of maintenance of the OMPI packages for Debian/Ubuntu and 
are working to resolve the problem.

> On Feb 11, 2016, at 1:16 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> 
> Michael,
> 
> MPI_Pack_external must convert data to big endian, so it can be dumped into a 
> file and read back correctly on both big- and little-endian architectures, and 
> with any MPI flavor.
> 
> if you use only one MPI library on one arch, or if data is never read/written 
> from/to a file, then it is more efficient to use MPI_Pack.
> 
> openmpi is optimized and the data is swapped only when needed.
> so if your cluster is little endian only, MPI_Send and MPI_Recv will never 
> byte swap data internally.
> if both ends have different endianness, data is sent in big-endian format and 
> byte swapped on the receiving side only if needed.
> generally speaking, a send/recv requires zero or one byte swap.
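> 
> for illustration, here is a minimal sketch (not taken from any existing code) of 
> the MPI_Pack/MPI_Unpack counterpart of Michael's test program; it uses the native 
> representation, so no byte swap ever happens on a homogeneous cluster:
> 
> #include <stdio.h>
> #include <mpi.h>
> 
> int main(int argc, char *argv[]) {
>   int send_data[2] = {1234, 5678};
>   int recv_data[2];
>   char buffer[1000];
>   int position = 0;
> 
>   MPI_Init(&argc, &argv);
> 
>   /* MPI_Pack keeps the native byte order, so nothing is converted here */
>   MPI_Pack(send_data, 2, MPI_INT, buffer, (int) sizeof(buffer),
>            &position, MPI_COMM_WORLD);
> 
>   /* MPI_Unpack reads it back, again with no conversion */
>   position = 0;
>   MPI_Unpack(buffer, (int) sizeof(buffer), &position,
>              recv_data, 2, MPI_INT, MPI_COMM_WORLD);
> 
>   printf("recv data %08x %08x\n", recv_data[0], recv_data[1]);
> 
>   MPI_Finalize();
>   return 0;
> }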
> 
> fwiw, we previously had a claim that neither Debian nor Ubuntu has a maintainer 
> for openmpi, which would explain why an obsolete version is shipped. I did some 
> research and could not find any evidence that openmpi is no longer maintained.
> 
> Cheers,
> 
> Gilles
> 
> 
> 
> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
> Hi Gilles,
> thanks for thinking about this in more detail.
> 
> I understand what you are saying, but your comments raise some questions in 
> my mind:
> 
> On a homogeneous little-endian cluster, is it important that the data be 
> converted to external32 format (big-endian), only to always be converted back 
> to little-endian at the receiving rank?
> 
> This would seem to be inefficient, especially if the site has no need for 
> external MPI access.
> 
> So, does --enable-heterogeneous do more than put MPI routines using 
> "external32" into straight pass-through?
> 
> Back in the old days of PVM, all messages were converted into network order. 
> This had severe performance impacts
> on little-endian clusters.
> 
> So much so that a clever way of getting around this was an implementation of 
> "receiver makes right" in which
> all data was sent in the native format of the sending rank. The receiving 
> rank analysed the message to determine if
> a conversion was necessary. In those days with Cray format data, it could be 
> more complicated than just byte swapping.
> 
> So in essence, how is a balance struck between supporting heterogeneous 
> architectures and maximum performance in codes where message passing 
> performance is critical?
> 
> As a follow-up, since I am now at home, this same problem also exists with 
> the Ubuntu 15.10 OpenMPI packages, 
> which surprisingly are still at 1.6.5, the same as 14.04.
> 
> Again, downloading, building, and using the latest stable version of OpenMPI 
> solved the problem.
> 
> kindest regards
> Mike
> 
> 
> On 11/02/2016, at 7:31 PM, Gilles Gouaillardet wrote:
> 
>> Michael,
>> 
>> I think it is worse than that ...
>> 
>> without --enable-heterogeneous, it seems the data is not correctly packed
>> (e.g. it is not converted to big endian), at least on an x86_64 arch.
>> unpack looks broken too, but pack followed by unpack does work.
>> that means if you are reading data that was correctly written in external32 format,
>> it will not be correctly unpacked.
>> 
>> with --enable-heterogeneous, it is only half broken
>> (I do not know yet whether pack or unpack is broken ...)
>> and pack followed by unpack does not work.
>> 
>> I will double check that tomorrow
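>> 
>> fwiw, a quick sanity check along these lines (just a sketch, not a definitive 
>> test) is to pack a single MPI_INT and compare the bytes against the expected 
>> external32 (big endian) layout:
>> 
>> #include <stdio.h>
>> #include <string.h>
>> #include <mpi.h>
>> 
>> int main(int argc, char *argv[]) {
>>   int value = 1234;                                      /* 0x000004d2 */
>>   unsigned char expected[4] = {0x00, 0x00, 0x04, 0xd2};  /* big endian */
>>   char buffer[16];
>>   MPI_Aint position = 0;
>> 
>>   MPI_Init(&argc, &argv);
>> 
>>   MPI_Pack_external("external32", &value, 1, MPI_INT,
>>                     buffer, (MPI_Aint) sizeof(buffer), &position);
>> 
>>   /* on a correct build the packed bytes are big endian regardless of arch */
>>   printf("pack %s\n", memcmp(buffer, expected, 4) == 0 ?
>>          "produced big endian bytes (OK)" :
>>          "did NOT produce big endian bytes (broken)");
>> 
>>   MPI_Finalize();
>>   return 0;
>> }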
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
>> Hi Ralph,
>> you are indeed correct. However, many of our users have workstations like 
>> mine, with OpenMPI provided by an installed package.
>> So we don't know how it has been configured.
>> 
>> Then we get failures, since, for instance, Ubuntu 14.04 by default appears to 
>> have been built with heterogeneous support! The other (working) machine is a 
>> large HPC system, and it seems OpenMPI there was built without heterogeneous 
>> support.
>> 
>> Currently we work around the problem for packing and unpacking by having a 
>> compiler switch 
>> that selects between calls to pack/unpack_external and pack/unpack.
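>> 
>> Something like this hypothetical wrapper (the names and types here are only 
>> illustrative, not our actual code) captures the idea:
>> 
>> #include <mpi.h>
>> 
>> /* build with -DUSE_PACK_EXTERNAL for the portable external32 path,
>>    or without it for the faster native MPI_Pack path */
>> static int pack_ints(const int *data, int count, char *buffer,
>>                      MPI_Aint size, MPI_Aint *position, MPI_Comm comm)
>> {
>> #ifdef USE_PACK_EXTERNAL
>>   return MPI_Pack_external("external32", (void *) data, count, MPI_INT,
>>                            buffer, size, position);
>> #else
>>   int pos = (int) *position;
>>   int ret = MPI_Pack((void *) data, count, MPI_INT, buffer,
>>                      (int) size, &pos, comm);
>>   *position = pos;
>>   return ret;
>> #endif
>> }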
>> 
>> It is only now that we have started to track down what the problem actually is.
>> 
>> kindest regards
>> Mike
>> 
>> On 11 February 2016 at 15:54, Ralph Castain <r...@open-mpi.org> wrote:
>> Out of curiosity: if both systems are Intel, then why are you enabling 
>> hetero? You don’t need it in that scenario.
>> 
>> Admittedly, we do need to fix the bug - just trying to understand why you 
>> are configuring that way.
>> 
>> 
>>> On Feb 10, 2016, at 8:46 PM, Michael Rezny <michael.re...@monash.edu> 
>>> wrote:
>>> 
>>> Hi Gilles,
>>> I can confirm that with a fresh download and build from source of OpenMPI 
>>> 1.10.2
>>> with --enable-heterogeneous
>>> the unpacked ints have the wrong endianness.
>>> 
>>> However, without --enable-heterogeneous, the unpacked ints are correct.
>>> 
>>> So, this problem still exists in heterogeneous builds with OpenMPI version 
>>> 1.10.2.
>>> 
>>> kindest regards
>>> Mike
>>> 
>>> On 11 February 2016 at 14:48, Gilles Gouaillardet 
>>> <gilles.gouaillar...@gmail.com> wrote:
>>> Michael,
>>> 
>>> do your two systems have the same endianness?
>>> 
>>> do you know how openmpi was configure'd on both systems ?
>>> (is --enable-heterogeneous enabled or disabled on both systems ?)
>>> 
>>> fwiw, openmpi 1.6.5 is old now and no longer maintained.
>>> I strongly encourage you to use openmpi 1.10.2
>>> 
>>> Cheers,
>>> 
>>> Gilles
>>> 
>>> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> 
>>> wrote:
>>> Hi,
>>> I am running Ubuntu 14.04 LTS with OpenMPI 1.6.5 and gcc 4.8.4
>>> 
>>> In a single-rank program which just packs and unpacks two ints using 
>>> MPI_Pack_external and MPI_Unpack_external, 
>>> the unpacked ints come back in the wrong byte order.
>>> 
>>> However, on an HPC system (not Ubuntu), using OpenMPI 1.6.5 and gcc 4.8.4, the 
>>> unpacked ints are correct.
>>> 
>>> Is it possible to get some assistance to track down what is going on?
>>> 
>>> Here is the output from the program:
>>> 
>>>  ~/tests/mpi/Pack test1
>>> send data 000004d2 0000162e 
>>> MPI_Pack_external: 0
>>> buffer size: 8
>>> MPI_unpack_external: 0
>>> recv data d2040000 2e160000 
>>> 
>>> And here is the source code:
>>> 
>>> #include <stdio.h>
>>> #include <mpi.h>
>>> 
>>> int main(int argc, char *argv[]) {
>>>   int numRanks, myRank, error;
>>> 
>>>   int send_data[2] = {1234, 5678};
>>>   int recv_data[2];
>>> 
>>>   MPI_Aint buffer_size = 1000;
>>>   char buffer[buffer_size];
>>> 
>>>   MPI_Init(&argc, &argv);
>>>   MPI_Comm_size(MPI_COMM_WORLD, &numRanks);
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
>>> 
>>>   printf("send data %08x %08x \n", send_data[0], send_data[1]);
>>> 
>>>   MPI_Aint position = 0;
>>>   error = MPI_Pack_external("external32", (void*) send_data, 2, MPI_INT,
>>>           buffer, buffer_size, &position);
>>>   printf("MPI_Pack_external: %d\n", error);
>>> 
>>>   printf("buffer size: %d\n", (int) position);
>>> 
>>>   position = 0;
>>>   error = MPI_Unpack_external("external32", buffer, buffer_size, &position,
>>>           recv_data, 2, MPI_INT);
>>>   printf("MPI_unpack_external: %d\n", error);
>>> 
>>>   printf("recv data %08x %08x \n", recv_data[0], recv_data[1]);
>>> 
>>>   MPI_Finalize();
>>> 
>>>   return 0;
>>> }
>>> 