Michael,

I'd like to correct what I wrote earlier.

In heterogeneous clusters, data is sent "as is" (i.e. no byte swapping) and it is byte-swapped when received, and only if needed.

With --enable-heterogeneous, MPI_Unpack_external works, but MPI_Pack_external is broken (i.e. no byte swapping occurs on little-endian architectures), since internally we use a mechanism similar to the one used to send data. That is a bug and I will work on it.

Without --enable-heterogeneous, neither MPI_Pack_external nor MPI_Unpack_external does any byte swapping, and they are both broken. FWIW, if you had configured with --enable-debug, you would have run into an assert failure (i.e. a crash).
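
For reference, the MPI standard defines external32 as big endian, so a quick way to see whether MPI_Pack_external is doing its job on a little-endian machine is to look at the raw packed bytes. Here is a minimal sketch I put together to illustrate the point (not an official test): it packs the int 1234 (0x000004d2), and a correct implementation must produce 00 00 04 d2, while a broken one leaves d2 04 00 00.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int value = 1234;            /* 0x000004d2 */
        char buffer[16];
        MPI_Aint position = 0;

        MPI_Init(&argc, &argv);

        /* external32 is big endian by definition, so on a little-endian
         * host a correct MPI_Pack_external must swap the bytes */
        MPI_Pack_external("external32", &value, 1, MPI_INT,
                          buffer, sizeof(buffer), &position);

        printf("packed bytes: %02x %02x %02x %02x\n",
               (unsigned char) buffer[0], (unsigned char) buffer[1],
               (unsigned char) buffer[2], (unsigned char) buffer[3]);
        /* expected: 00 00 04 d2 (big endian), regardless of the host */

        MPI_Finalize();
        return 0;
    }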

I will work on a fix, but it might take some time before it is ready.

Cheers,

Gilles
On 2/11/2016 6:16 PM, Gilles Gouaillardet wrote:
Michael,

MPI_Pack_external must convert data to big endian so that it can be dumped into a file and read back correctly on both big- and little-endian architectures, and with any MPI flavor.

If you use only one MPI library on one architecture, or if data is never read from or written to a file, then it is more efficient to use MPI_Pack.
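
As a rough illustration of that use case (just a sketch, and the file name is made up), a buffer packed with external32 can be written to a file on one machine and later unpacked by any MPI implementation on any architecture:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int data[2] = {1234, 5678};
        char buffer[1000];
        MPI_Aint position = 0;

        MPI_Init(&argc, &argv);

        /* pack into the portable external32 (big-endian) representation */
        MPI_Pack_external("external32", data, 2, MPI_INT,
                          buffer, sizeof(buffer), &position);

        /* the packed bytes are now architecture independent, so they can
         * be dumped to a file and unpacked later by any MPI library, on
         * either endianness */
        FILE *f = fopen("packed.dat", "wb");    /* file name is made up */
        if (f != NULL) {
            fwrite(buffer, 1, (size_t) position, f);
            fclose(f);
        }

        MPI_Finalize();
        return 0;
    }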

Open MPI is optimized, and the data is swapped only when needed.
So if your cluster is little-endian only, MPI_Send and MPI_Recv will never byte-swap data internally. If both ends have different endianness, data is sent in big-endian format and byte-swapped on receipt only if needed.
Generally speaking, a send/recv requires zero or one byte swap.
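
If it helps to picture it, here is a tiny sketch of that idea (only an illustration, not Open MPI's actual internal code): the receiver swaps 32-bit values only when the sender's endianness, carried as message metadata, differs from its own. The "sender_is_big_endian" flag below is hypothetical.

    #include <stddef.h>
    #include <stdint.h>

    /* Sketch only: swap received 32-bit integers in place if, and only if,
     * the sender's byte order differs from ours. "sender_is_big_endian"
     * would come from message metadata; it is not an Open MPI API. */
    static void byteswap_if_needed(uint32_t *data, size_t count,
                                   int sender_is_big_endian)
    {
        const uint32_t probe = 1;
        int i_am_big_endian = (*(const unsigned char *) &probe == 0);

        if (sender_is_big_endian == i_am_big_endian)
            return;                   /* same endianness: zero byte swaps */

        for (size_t i = 0; i < count; i++) {  /* different: exactly one swap */
            uint32_t v = data[i];
            data[i] = (v >> 24) | ((v >> 8) & 0x0000ff00u)
                    | ((v << 8) & 0x00ff0000u) | (v << 24);
        }
    }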

FWIW, it was previously claimed that neither Debian nor Ubuntu has a maintainer for Open MPI, which would explain why an obsolete version is shipped. I did some research and could not find any evidence that Open MPI is no longer maintained.

Cheers,

Gilles



On Thursday, February 11, 2016, Michael Rezny <michael.re...@monash.edu> wrote:

    Hi Gilles,
    Thanks for thinking about this in more detail.

    I understand what you are saying, but your comments raise some
    questions in my mind:

    If one is in a homogeneous cluster, is it important that, in the
    little-endian case, the data be converted to external32 format
    (big-endian), only to always be converted back to little-endian at
    the receiving rank?

    This would seem to be inefficient, especially if the site has no
    need for external MPI access.

    So, does --enable-heterogeneous do more than put MPI routines
    using "external32" into straight pass-through?

    Back in the old days of PVM, all messages were converted into
    network order. This had severe performance impacts
    on little-endian clusters.

    So much so that a clever way of getting around this was an
    implementation of "receiver makes right", in which all data was
    sent in the native format of the sending rank. The receiving rank
    analysed the message to determine whether a conversion was
    necessary. In those days, with Cray-format data, it could be more
    complicated than just byte swapping.

    So in essence, how is a balance struck between supporting
    heterogeneous architectures and maximum performance with codes
    where message-passing performance is critical?

    As a follow-up, since I am now at home: this same problem also
    exists with the Ubuntu 15.10 Open MPI packages, which surprisingly
    are still at 1.6.5, the same as 14.04.

    Again, downloading, building, and using the latest stable version
    of Open MPI solved the problem.

    kindest regards
    Mike


    On 11/02/2016, at 7:31 PM, Gilles Gouaillardet wrote:

    Michael,

    I think it is worse than that ...

    Without --enable-heterogeneous, it seems the data is not correctly
    packed (i.e. it is not converted to big endian), at least on an
    x86_64 arch. Unpack looks broken too, but pack followed by unpack
    does work. That means that if you are reading data correctly
    written in external32 format, it will not be correctly unpacked.

    With --enable-heterogeneous, it is only half broken
    (I do not know yet whether pack or unpack is broken ...)
    and pack followed by unpack does not work.

    I will double check that tomorrow

    Cheers,

    Gilles

    On Thursday, February 11, 2016, Michael Rezny
    <michael.re...@monash.edu> wrote:

        Hi Ralph,
        You are indeed correct. However, many of our users have
        workstations like mine, with Open MPI provided by installing
        a package, so we don't know how it has been configured.

        Then we have failures since, for instance, Ubuntu 14.04 by
        default appears to have been built with heterogeneous support!
        The other (working) machine is a large HPC system, and it
        seems Open MPI was built there without heterogeneous support.

        Currently we work around the problem for packing and unpacking
        by having a compile-time switch that selects between calls to
        pack/unpack_external and pack/unpack.

        It is only now that we have started to track down what the
        problem actually is.

        kindest regards
        Mike

        On 11 February 2016 at 15:54, Ralph Castain
        <r...@open-mpi.org> wrote:

            Out of curiosity: if both systems are Intel, then why are
            you enabling hetero? You don’t need it in that scenario.

            Admittedly, we do need to fix the bug - just trying to
            understand why you are configuring that way.


            On Feb 10, 2016, at 8:46 PM, Michael Rezny
            <michael.re...@monash.edu> wrote:

            Hi Gilles,
            I can confirm that with a fresh download and build from
            source of Open MPI 1.10.2 with --enable-heterogeneous,
            the unpacked ints have the wrong endianness.

            However, without --enable-heterogeneous, the unpacked
            ints are correct.

            So, this problem still exists in heterogeneous builds
            with OpenMPI version 1.10.2.

            kindest regards
            Mike

            On 11 February 2016 at 14:48, Gilles Gouaillardet
            <gilles.gouaillar...@gmail.com> wrote:

                Michael,

                Do your two systems have the same endianness?

                Do you know how Open MPI was configured on both
                systems?
                (Is --enable-heterogeneous enabled or disabled on
                both systems?)

                FWIW, Open MPI 1.6.5 is old now and no longer
                maintained. I strongly encourage you to use Open MPI
                1.10.2.

                Cheers,

                Gilles

                On Thursday, February 11, 2016, Michael Rezny
                <michael.re...@monash.edu> wrote:

                    Hi,
                    I am running Ubuntu 14.04 LTS with OpenMPI 1.6.5
                    and gcc 4.8.4

                    In a single-rank program which just packs and
                    unpacks two ints using MPI_Pack_external and
                    MPI_Unpack_external, the unpacked ints come out
                    in the wrong byte order.

                    However, on an HPC system (not Ubuntu), using
                    Open MPI 1.6.5 and gcc 4.8.4, the unpacked ints
                    are correct.

                    Is it possible to get some assistance to track
                    down what is going on?

                    Here is the output from the program:

                    ~/tests/mpi/Pack test1
                    send data 000004d2 0000162e
                    MPI_Pack_external: 0
                    buffer size: 8
                    MPI_unpack_external: 0
                    recv data d2040000 2e160000

                    And here is the source code:

                    #include <stdio.h>
                    #include <mpi.h>

                    int main(int argc, char *argv[]) {
                      int numRanks, myRank, error;

                      int send_data[2] = {1234, 5678};
                      int recv_data[2];

                      MPI_Aint buffer_size = 1000;
                      char buffer[buffer_size];

                      MPI_Init(&argc, &argv);
                      MPI_Comm_size(MPI_COMM_WORLD, &numRanks);
                      MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

                      printf("send data %08x %08x \n", send_data[0],
                             send_data[1]);

                      /* pack the two ints in the portable external32
                         representation */
                      MPI_Aint position = 0;
                      error = MPI_Pack_external("external32",
                                  (void*) send_data, 2, MPI_INT,
                                  buffer, buffer_size, &position);
                      printf("MPI_Pack_external: %d\n", error);

                      printf("buffer size: %d\n", (int) position);

                      /* unpack them again; the result should match
                         send_data */
                      position = 0;
                      error = MPI_Unpack_external("external32",
                                  buffer, buffer_size, &position,
                                  recv_data, 2, MPI_INT);
                      printf("MPI_unpack_external: %d\n", error);

                      printf("recv data %08x %08x \n", recv_data[0],
                             recv_data[1]);

                      MPI_Finalize();

                      return 0;
                    }





