The current version of Open MPI installed on Ranger is 1.3a1r19685, which is from early October. This version has a fix for ticket #1378. Ticket #1449 is not an issue in this case because each node has 16 processors, and #1449 applies to larger SMPs.

However, I am wondering whether this is due to ticket https://svn.open-mpi.org/trac/ompi/ticket/1468, which was not yet fixed in the version running on Ranger.

As was suggested earlier, running without the sm btl would help indicate whether this is the problem.

mpirun --mca btl ^sm a.out
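
To confirm which BTLs are actually selected once sm is excluded, some BTL verbosity can be added (a sketch, assuming the standard btl_base_verbose framework parameter; the verbosity level shown is arbitrary):

mpirun --mca btl ^sm --mca btl_base_verbose 30 a.out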

Another way to potentially work around the issue is to increase the size of the shared memory backing file.

mpirun --mca mpool_sm_min_size 1073741824 --mca mpool_sm_max_size 1073741824 a.out
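
The same values can also be set through environment variables, using the OMPI_MCA_ prefix that Brock mentions further down in this thread (a sketch; the mpool_sm_min_size/mpool_sm_max_size names are taken from the command above):

export OMPI_MCA_mpool_sm_min_size=1073741824
export OMPI_MCA_mpool_sm_max_size=1073741824
mpirun a.out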

We will also work with TACC to get an updated version of Open MPI 1.3 installed there.

Let us know what you find.

Rolf


On 12/09/08 08:05, Lenny Verkhovsky wrote:
also see https://svn.open-mpi.org/trac/ompi/ticket/1449



On 12/9/08, *Lenny Verkhovsky* <lenny.verkhov...@gmail.com> wrote:

    Maybe it's related to https://svn.open-mpi.org/trac/ompi/ticket/1378?


    On 12/5/08, *Justin* <luitj...@cs.utah.edu> wrote:

        The reason I'd like to disable these eager buffers is to help
        detect the deadlock better.  I would not run with this for a
        normal run, but it would be useful for debugging.  If the
        deadlock is indeed due to our code, then disabling any shared
        buffers or eager sends would make the deadlock reproducible.
        In addition, we might be able to lower the number of
        processors.  Right now, determining which processor is
        deadlocked when we are using 8K cores and each processor has
        hundreds of outstanding messages would be quite difficult.

        Thanks for your suggestions,
        Justin

        Brock Palen wrote:

            Open MPI has different eager limits for each of the network
            types.  On your system, run:

            ompi_info --param btl all

            and look for the eager_limit values.
            You can set those values to 0 using the syntax I showed you
            before; that would disable eager messages.  There might be
            a better way to disable them.  I'm not sure why you would
            want to disable them, though, since they are there for
            performance.
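
            As a sketch of what that might look like for the shared
            memory and openib BTLs (the btl_openib_eager_limit name is
            assumed here from the same naming pattern; btl_sm_eager_limit
            is shown by ompi_info below):

            mpirun --mca btl_sm_eager_limit 0 --mca btl_openib_eager_limit 0 a.out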

            You might still see a deadlock even if every message were
            below the threshold.  I think there is a limit on the number
            of eager messages a receiving CPU will accept, but I'm not
            sure about that, and I still somewhat doubt it.

            Try tweaking your buffer sizes: make the openib btl eager
            limit the same as the shared memory one, and see whether you
            get lock-ups between hosts and not just over shared memory.
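
            For example, a hedged sketch of matching the openib eager
            limit to the sm default of 4096 shown further down (again
            assuming the btl_openib_eager_limit parameter name):

            mpirun --mca btl_openib_eager_limit 4096 --mca btl_sm_eager_limit 4096 a.out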

            Brock Palen
            www.umich.edu/~brockp
            Center for Advanced Computing
            bro...@umich.edu
            (734)936-1985



            On Dec 5, 2008, at 2:10 PM, Justin wrote:

                Thank you for this info.  I should add that our code
                tends to post a lot of sends prior to the other side
                posting receives, which causes a lot of unexpected
                messages to exist.  Our code explicitly matches up all
                tags and processors (that is, we do not use MPI
                wildcards).  If we had a deadlock, I would think we
                would see it regardless of whether or not we cross the
                rendezvous threshold.  I guess one way to test this
                would be to set this threshold to 0; if it then
                deadlocks, we would likely be able to track down the
                deadlock.  Are there any other parameters we can pass
                to MPI that will turn off buffering?

                Thanks,
                Justin

                Brock Palen wrote:

                    Whenever this has happened, we found the code to
                    have a deadlock; users never saw it until they
                    crossed the eager->rendezvous threshold.

                    Yes, you can disable shared memory with:

                    mpirun --mca btl ^sm

                    Or you can try increasing the eager limit.

                    ompi_info --param btl sm

                    MCA btl: parameter "btl_sm_eager_limit" (current value:
                                             "4096")

                    You can modify this limit at run time; I think
                    (I can't test it right now) it is just:

                    mpirun --mca btl_sm_eager_limit 40960

                    When tweaking these values, I think you can also use
                    environment variables instead of putting everything
                    on the mpirun line:

                    export OMPI_MCA_btl_sm_eager_limit=40960

                    See:
                    http://www.open-mpi.org/faq/?category=tuning


                    Brock Palen
                    www.umich.edu/~brockp
                    Center for Advanced Computing
                    bro...@umich.edu
                    (734)936-1985



                    On Dec 5, 2008, at 12:22 PM, Justin wrote:

                        Hi,

                        We are currently using Open MPI 1.3 on Ranger
                        for large processor jobs (8K+ cores).  Our code
                        appears to deadlock occasionally, at random,
                        within point-to-point communication (see the
                        stack trace below).  This code has been tested
                        against many different MPI versions and, as far
                        as we know, it does not contain a deadlock.
                        However, in the past we have run into problems
                        with shared memory optimizations within MPI
                        causing deadlocks.  We can usually avoid these
                        by setting a few environment variables to either
                        increase the size of the shared memory buffers
                        or disable the shared memory optimizations
                        altogether.  Does Open MPI have any known
                        deadlocks that might be causing ours?  If so,
                        are there any workarounds?  Also, how do we
                        disable shared memory within Open MPI?

                        Here is an example of where processors are hanging:

                        #0  0x00002b2df3522683 in mca_btl_sm_component_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
                        #1  0x00002b2df2cb46bf in mca_bml_r2_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
                        #2  0x00002b2df0032ea4 in opal_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
                        #3  0x00002b2ded0d7622 in ompi_request_default_wait_some () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
                        #4  0x00002b2ded109e34 in PMPI_Waitsome () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
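
                        As an aside, a backtrace like this can be
                        captured by attaching a debugger to one of the
                        hung ranks (a sketch, assuming gdb is available
                        on the compute node; <pid> is that rank's
                        process id):

                        gdb -p <pid>
                        (gdb) bt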


                        Thanks,
                        Justin




------------------------------------------------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--

=========================
rolf.vandeva...@sun.com
781-442-3043
=========================
