Folks,

currently, there is a global lock in the Open MPI glue for ROMIO that can
cause some hangs with MPI_THREAD_MULTIPLE.

I could not find such global locks in MPICH, nor a good rationale for having
them in Open MPI. At this stage, I can only consider this legacy stuff that
should simply be removed.

Does anyone remember why we used global locks in the first place? Is there
any rationale for using global locks (or even finer-grained locking) in the
ROMIO glue?

Cheers,

Gilles


---------- Forwarded message ----------
From: Sebastian Rettenberger <rette...@in.tum.de>
Date: Tuesday, March 29, 2016
Subject: [OMPI users] Collective MPI-IO + MPI_THREAD_MULTIPLE
To: Open MPI Users <us...@open-mpi.org>

Hi,

thanks, the patch works for me. I will do some further tests and report back
if I find another problem.

Best regards,
Sebastian

On 03/25/2016 01:58 AM, Gilles Gouaillardet wrote:
> Sebastian,
>
> at first glance, the global lock in the ROMIO glue is not necessary.
>
> Feel free to give the attached patch a try
> (it works with your example, and I did no further testing).
>
> Cheers,
>
> Gilles
>
> On 3/25/2016 9:26 AM, Gilles Gouaillardet wrote:
>
>> Sebastian,
>>
>> thanks for the info.
>>
>> Bottom line, the global lock is in the Open MPI glue for ROMIO.
>>
>> I will check what kind of locking (if any) is done in MPICH.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 3/24/2016 11:30 PM, Sebastian Rettenberger wrote:
>>
>>> Hi,
>>>
>>> I tested this on my desktop machine, thus one node, two tasks.
>>> The deadlock appears on the local file system and on the NFS mount.
>>>
>>> The MPICH version I tested was 3.2.
>>>
>>> However, as far as I know, the locking is part of the MPI library and
>>> not ROMIO.
>>>
>>> Best regards,
>>> Sebastian
>>>
>>> On 03/24/2016 03:19 PM, Gilles Gouaillardet wrote:
>>>
>>>> Sebastian,
>>>>
>>>> in Open MPI 1.10, the default io component is ROMIO from MPICH 3.0.4.
>>>>
>>>> How many tasks, how many nodes, and which file system are you
>>>> running on?
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On Thursday, March 24, 2016, Sebastian Rettenberger
>>>> <rette...@in.tum.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I tried to run the attached program with Open MPI. It works well
>>>>> with MPICH and Intel MPI, but I get a deadlock when using Open MPI.
>>>>> I am using Open MPI 1.10.0 with support for MPI_THREAD_MULTIPLE.
>>>>>
>>>>> It seems like ROMIO uses global locks in Open MPI, which is a
>>>>> problem if multiple threads want to do collective I/O.
>>>>>
>>>>> Any idea how one can get around this issue?
>>>>>
>>>>> Best regards,
>>>>> Sebastian
>>>>>
>>>>> --
>>>>> Sebastian Rettenberger, M.Sc.
>>>>> Technische Universität München
>>>>> Department of Informatics
>>>>> Chair of Scientific Computing
>>>>> Boltzmannstrasse 3, 85748 Garching, Germany
>>>>> http://www5.in.tum.de/

--
Sebastian Rettenberger, M.Sc.
Technische Universität München
Department of Informatics
Chair of Scientific Computing
Boltzmannstrasse 3, 85748 Garching, Germany
http://www5.in.tum.de/
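
Sebastian's attached test program is not reproduced in the archive. The sketch
below is only a hypothetical reconstruction of the pattern he describes: two
threads per rank under MPI_THREAD_MULTIPLE, each issuing a collective write on
its own duplicated communicator and its own file. File names, element counts,
and the thread count are made up for illustration.

/*
 * Hypothetical reproducer sketch -- NOT Sebastian's attached program.
 * Two threads per rank, each with its own duplicated communicator and
 * its own file, issue a collective MPI_File_write_all concurrently.
 *
 * Assumed build/run commands:
 *   mpicc -pthread reproducer.c -o reproducer
 *   mpirun -n 2 ./reproducer
 */
#include <mpi.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 2
#define COUNT    1024

static MPI_Comm comms[NTHREADS];    /* one communicator per thread */

static void *write_worker(void *arg)
{
    int tid = (int)(intptr_t)arg;
    int rank, i, buf[COUNT];
    char filename[64];
    MPI_File fh;

    MPI_Comm_rank(comms[tid], &rank);
    for (i = 0; i < COUNT; i++)
        buf[i] = rank * NTHREADS + tid;

    snprintf(filename, sizeof(filename), "testfile.%d", tid);

    /* Each thread opens its own file on its own communicator ... */
    MPI_File_open(comms[tid], filename,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, (MPI_Offset)rank * sizeof(buf),
                      MPI_INT, MPI_INT, "native", MPI_INFO_NULL);

    /* ... and performs a collective write.  If every ROMIO call in the
     * process is funneled through one global lock, the lock acquisition
     * order can differ between ranks and the collectives never match up. */
    MPI_File_write_all(fh, buf, COUNT, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, t;
    pthread_t threads[NTHREADS];

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not provided\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    for (t = 0; t < NTHREADS; t++)
        MPI_Comm_dup(MPI_COMM_WORLD, &comms[t]);
    for (t = 0; t < NTHREADS; t++)
        pthread_create(&threads[t], NULL, write_worker, (void *)(intptr_t)t);
    for (t = 0; t < NTHREADS; t++)
        pthread_join(threads[t], NULL);
    for (t = 0; t < NTHREADS; t++)
        MPI_Comm_free(&comms[t]);

    MPI_Finalize();
    return 0;
}

Under a library that serializes all ROMIO calls behind a single process-wide
lock, the two collectives can end up waiting on each other; that matches the
deadlock Sebastian reports, and the patch that drops the lock resolves it for
his example.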
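As for the glue-level question at the top of the thread: the pattern under
discussion is, roughly, one process-wide mutex taken around every I/O entry
point before the call is handed down to ROMIO. The fragment below uses
invented names and a stand-in for the underlying ROMIO routine; it is not the
actual Open MPI glue, only an illustration of why such a lock is unsafe once
several threads issue collective I/O.

#include <mpi.h>
#include <pthread.h>

/* Invented names -- NOT the actual Open MPI glue code, only the shape of
 * the pattern being discussed: one process-wide mutex around every I/O
 * entry point. */
static pthread_mutex_t glue_global_lock = PTHREAD_MUTEX_INITIALIZER;

int glue_write_all(MPI_File fh, const void *buf, int count,
                   MPI_Datatype type, MPI_Status *status)
{
    int rc;

    /* Thread A on rank 0 takes the lock and blocks inside a collective
     * write that needs its peer thread on rank 1 to enter as well.  On
     * rank 1 that peer may be parked right here, waiting for rank 1's
     * lock holder, which sits in a *different* collective.  Nobody can
     * proceed: a deadlock. */
    pthread_mutex_lock(&glue_global_lock);
    rc = MPI_File_write_all(fh, buf, count, type, status);
    /* ^ stand-in for the underlying ROMIO routine */
    pthread_mutex_unlock(&glue_global_lock);

    return rc;
}

This is consistent with Gilles's observation that MPICH has no such global
lock: a collective file operation already synchronizes on the file's own
communicator, so serializing unrelated calls within a process adds an ordering
constraint that the ranks cannot all satisfy.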