Hi Gilles,
I must be misunderstanding something here. What you are now saying seems, to me, to be at odds with what you said previously.
Assume the situation where both sender and receiver are little-endian, and consider only MPI_Pack_external and MPI_Unpack_external.

Case 1: --enable-heterogeneous.
From your previous email I understood that "receiver makes right" was being implemented. So the sender does not byte-swap, and the message is sent in (native) little-endian format. The receiver recognises that the received message is in little-endian format, and since this is also its native format, no byte swap is needed.

Case 2: --disable-heterogeneous.
It seems strange that, in this case, any byte swapping would ever need to occur. One is assuming a homogeneous system, and sender and receiver will always be using their native format, i.e. exactly the same as MPI_Pack and MPI_Unpack.

kindest regards
Mike

On 12/02/2016, at 9:25 PM, Gilles Gouaillardet wrote:

> Michael,
>
> byte swapping only occurs if you invoke MPI_Pack_external and
> MPI_Unpack_external on little-endian systems.
>
> MPI_Pack and MPI_Unpack use the same engine as MPI_Send and MPI_Recv, and
> this does not involve any byte swapping if both ends have the same endianness.
>
> Cheers,
>
> Gilles
>
> On Friday, February 12, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
> Hi,
> oh, that is good news! The process is meant to implement "receiver makes
> right", which is good news for efficiency.
>
> But, in the second case, without --enable-heterogeneous, are you saying that
> on little-endian machines byte swapping is meant to always occur? That seems
> most odd. I would have thought that if one only wants this mode of operation,
> and configures OpenMPI accordingly, then there is no need to check at the
> receiving end whether byte swapping is needed or not. It would be assumed
> that sender and receiver have agreed on the format, whatever it is. On a
> homogeneous little-endian HPC cluster one would not want the extra overhead
> of two conversions for every packed message.
>
> Is it possible that the assert has been implemented incorrectly in this case?
>
> There is absolutely no urgency with regard to a fix. Thanks to your quick
> response, we now understand what is causing the problem and are in the
> process of implementing a test in ./configure to determine whether the bug
> is present (a sketch of such a probe appears below) and, if so, add a
> compiler flag to switch to using MPI_Pack and MPI_Unpack.
>
> It would be good if you would be kind enough to let me know when a fix is
> available, and I will download, build, and test it on our application. Then
> that version can be installed as the default.
>
> Once again, many thanks for your prompt and most helpful responses.
>
> warmest regards
> Mike
>
> On 12/02/2016, at 7:03 PM, Gilles Gouaillardet wrote:
>
>> Michael,
>>
>> I'd like to correct what I wrote earlier.
>>
>> In heterogeneous clusters, data is sent "as is" (i.e. no byte swapping) and
>> it is byte swapped when received, and only if needed.
>>
>> With --enable-heterogeneous, MPI_Unpack_external is working, but
>> MPI_Pack_external is broken (i.e. no byte swapping occurs on little-endian
>> archs), since we internally use a mechanism similar to the one used to send
>> data. That is a bug and I will work on it.
>>
>> Without --enable-heterogeneous, neither MPI_Pack_external nor
>> MPI_Unpack_external does any byte swapping, and they are both broken.
>> FWIW, if you had configured with --enable-debug, you would have run into
>> an assert error (i.e. a crash).
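As an aside, the kind of ./configure probe Mike mentions can be sketched roughly as follows: pack one known int with MPI_Pack_external and check whether the resulting bytes come out big-endian, as the external32 representation requires. This is only an illustrative sketch, not the actual test Mike's group implemented; the file name and the exit-code convention are made up here.

  /* pack_external_probe.c -- hypothetical configure-time probe (illustration
   * only): pack one int with MPI_Pack_external and check whether the packed
   * bytes are big-endian, as external32 requires.  Exits 0 if the packed
   * data is big-endian, 1 if the broken (native little-endian) behaviour
   * is detected. */
  #include <stdio.h>
  #include <string.h>
  #include <mpi.h>

  int main(int argc, char *argv[]) {
      int value = 0x01020304;
      char buffer[16];
      MPI_Aint position = 0;
      unsigned char expected[4] = {0x01, 0x02, 0x03, 0x04};  /* big-endian */
      int broken;

      MPI_Init(&argc, &argv);
      MPI_Pack_external("external32", &value, 1, MPI_INT,
                        buffer, (MPI_Aint) sizeof(buffer), &position);
      broken = (position != 4) || (memcmp(buffer, expected, 4) != 0);
      printf("MPI_Pack_external %s produce big-endian (external32) output\n",
             broken ? "does NOT" : "does");
      MPI_Finalize();
      return broken ? 1 : 0;
  }

In a configure script such a probe would be compiled and run once, and its result used to set the compiler flag that selects between the *_external calls and plain MPI_Pack/MPI_Unpack.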
>>
>> I will work on a fix, but it might take some time before it is ready.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2/11/2016 6:16 PM, Gilles Gouaillardet wrote:
>>> Michael,
>>>
>>> MPI_Pack_external must convert data to big-endian, so it can be dumped into
>>> a file and be read correctly on both big- and little-endian archs, and with
>>> any MPI flavor.
>>>
>>> If you use only one MPI library on one arch, or if data is never
>>> read/written from/to a file, then it is more efficient to use MPI_Pack.
>>>
>>> OpenMPI is optimized and the data is swapped only when needed.
>>> So if your cluster is little-endian only, MPI_Send and MPI_Recv will never
>>> byte swap data internally.
>>> If both ends have different endianness, data is sent in big-endian format
>>> and byte swapped when received, only if needed.
>>> Generally speaking, a send/recv requires zero or one byte swap.
>>>
>>> FWIW, we previously had a claim that neither Debian nor Ubuntu has a
>>> maintainer for OpenMPI, which would explain why an obsolete version is
>>> shipped. I did some research and could not find any evidence that OpenMPI
>>> is no longer maintained there.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
>>> Hi Gilles,
>>> thanks for thinking about this in more detail.
>>>
>>> I understand what you are saying, but your comments raise some questions in
>>> my mind:
>>>
>>> If one is on a homogeneous cluster, is it important, in the little-endian
>>> case, that the data be converted to external32 format (big-endian), only to
>>> always be converted back to little-endian at the receiving rank?
>>>
>>> This would seem to be inefficient, especially if the site has no need for
>>> external MPI access.
>>>
>>> So, does --enable-heterogeneous do more than put MPI routines using
>>> "external32" into straight pass-through?
>>>
>>> Back in the old days of PVM, all messages were converted into network
>>> order. This had severe performance impacts on little-endian clusters.
>>>
>>> So much so that a clever way of getting around this was an implementation
>>> of "receiver makes right", in which all data was sent in the native format
>>> of the sending rank. The receiving rank analysed the message to determine
>>> whether a conversion was necessary. In those days, with Cray-format data,
>>> it could be more complicated than just byte swapping.
>>>
>>> So, in essence, how is a balance struck between supporting heterogeneous
>>> architectures and maximum performance with codes where message-passing
>>> performance is critical?
>>>
>>> As a follow-up, since I am now at home: this same problem also exists with
>>> the Ubuntu 15.10 OpenMPI packages, which surprisingly are still at 1.6.5,
>>> the same as 14.04.
>>>
>>> Again, downloading, building, and using the latest stable version of
>>> OpenMPI solved the problem.
>>>
>>> kindest regards
>>> Mike
>>>
>>>
>>> On 11/02/2016, at 7:31 PM, Gilles Gouaillardet wrote:
>>>
>>>> Michael,
>>>>
>>>> I think it is worse than that ...
>>>>
>>>> Without --enable-heterogeneous, it seems the data is not correctly packed
>>>> (i.e. it is not converted to big-endian), at least on an x86_64 arch.
>>>> Unpack looks broken too, but pack followed by unpack does work.
>>>> That means if you are reading data correctly written in external32 format,
>>>> it will not be correctly unpacked.
>>>>
>>>> With --enable-heterogeneous, it is only half broken
>>>> (I do not know yet whether pack or unpack is broken ...)
>>>> and pack followed by unpack does not work.
>>>>
>>>> I will double-check that tomorrow.
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
>>>> Hi Ralph,
>>>> you are indeed correct. However, many of our users, such as me, have
>>>> workstations with OpenMPI provided by installing a package.
>>>> So we don't know how it has been configured.
>>>>
>>>> Then we have failures, since, for instance, Ubuntu 14.04 by default appears
>>>> to have been built with heterogeneous support! The other (working) machine
>>>> is a large HPC, and it seems OpenMPI was built without heterogeneous
>>>> support.
>>>>
>>>> Currently we work around the problem for packing and unpacking by having a
>>>> compiler switch that selects between calls to pack/unpack_external and
>>>> pack/unpack.
>>>>
>>>> It is only now that we have started to track down what the problem
>>>> actually is.
>>>>
>>>> kindest regards
>>>> Mike
>>>>
>>>> On 11 February 2016 at 15:54, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Out of curiosity: if both systems are Intel, then why are you enabling
>>>> hetero? You don't need it in that scenario.
>>>>
>>>> Admittedly, we do need to fix the bug - just trying to understand why you
>>>> are configuring that way.
>>>>
>>>>
>>>>> On Feb 10, 2016, at 8:46 PM, Michael Rezny <michael.re...@monash.edu> wrote:
>>>>>
>>>>> Hi Gilles,
>>>>> I can confirm that with a fresh download and build from source of
>>>>> OpenMPI 1.10.2 with --enable-heterogeneous, the unpacked ints have the
>>>>> wrong endianness.
>>>>>
>>>>> However, without --enable-heterogeneous, the unpacked ints are correct.
>>>>>
>>>>> So this problem still exists in heterogeneous builds with OpenMPI
>>>>> version 1.10.2.
>>>>>
>>>>> kindest regards
>>>>> Mike
>>>>>
>>>>> On 11 February 2016 at 14:48, Gilles Gouaillardet
>>>>> <gilles.gouaillar...@gmail.com> wrote:
>>>>> Michael,
>>>>>
>>>>> do your two systems have the same endianness?
>>>>>
>>>>> do you know how OpenMPI was configured on both systems?
>>>>> (is --enable-heterogeneous enabled or disabled on both systems?)
>>>>>
>>>>> FWIW, OpenMPI 1.6.5 is old now and no longer maintained.
>>>>> I strongly encourage you to use OpenMPI 1.10.2.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Gilles
>>>>>
>>>>> On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:
>>>>> Hi,
>>>>> I am running Ubuntu 14.04 LTS with OpenMPI 1.6.5 and gcc 4.8.4.
>>>>>
>>>>> On a single-rank program which just packs and unpacks two ints using
>>>>> MPI_Pack_external and MPI_Unpack_external, the unpacked ints are in the
>>>>> wrong endian order.
>>>>>
>>>>> However, on an HPC (not Ubuntu), using OpenMPI 1.6.5 and gcc 4.8.4, the
>>>>> unpacked ints are correct.
>>>>>
>>>>> Is it possible to get some assistance to track down what is going on?
>>>>>
>>>>> Here is the output from the program:
>>>>>
>>>>> ~/tests/mpi/Pack test1
>>>>> send data 000004d2 0000162e
>>>>> MPI_Pack_external: 0
>>>>> buffer size: 8
>>>>> MPI_unpack_external: 0
>>>>> recv data d2040000 2e160000
>>>>>
>>>>> And here is the source code:
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <mpi.h>
>>>>>
>>>>> int main(int argc, char *argv[]) {
>>>>>     int numRanks, myRank, error;
>>>>>
>>>>>     int send_data[2] = {1234, 5678};
>>>>>     int recv_data[2];
>>>>>
>>>>>     MPI_Aint buffer_size = 1000;
>>>>>     char buffer[buffer_size];
>>>>>
>>>>>     MPI_Init(&argc, &argv);
>>>>>     MPI_Comm_size(MPI_COMM_WORLD, &numRanks);
>>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
>>>>>
>>>>>     printf("send data %08x %08x \n", send_data[0], send_data[1]);
>>>>>
>>>>>     MPI_Aint position = 0;
>>>>>     error = MPI_Pack_external("external32", (void*) send_data, 2, MPI_INT,
>>>>>                               buffer, buffer_size, &position);
>>>>>     printf("MPI_Pack_external: %d\n", error);
>>>>>
>>>>>     printf("buffer size: %d\n", (int) position);
>>>>>
>>>>>     position = 0;
>>>>>     error = MPI_Unpack_external("external32", buffer, buffer_size, &position,
>>>>>                                 recv_data, 2, MPI_INT);
>>>>>     printf("MPI_unpack_external: %d\n", error);
>>>>>
>>>>>     printf("recv data %08x %08x \n", recv_data[0], recv_data[1]);
>>>>>
>>>>>     MPI_Finalize();
>>>>>
>>>>>     return 0;
>>>>> }
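For readers following the output above: external32 is defined by the MPI standard to be big-endian, so a correct run should print recv data identical to send data (000004d2 0000162e), and the 8-byte packed buffer should hold 00 00 04 d2 00 00 16 2e. One way to see the fault directly is to dump the packed bytes; the snippet below is a hypothetical addition (not part of Mike's original program) that could be placed right after the MPI_Pack_external call:

    /* Hypothetical debugging aid, inserted after the MPI_Pack_external call
     * in the program above: dump the packed buffer.  With correct external32
     * packing of {1234, 5678} this prints
     * "packed bytes: 00 00 04 d2 00 00 16 2e"; a native little-endian dump
     * ("d2 04 00 00 ...") exposes the bug. */
    {
        MPI_Aint i;
        printf("packed bytes:");
        for (i = 0; i < position; i++) {
            printf(" %02x", (unsigned int)(unsigned char) buffer[i]);
        }
        printf("\n");
    }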