Folks,

Currently, there is a global lock in the Open MPI glue for ROMIO that can
cause hangs with MPI_THREAD_MULTIPLE.

I could not find such global locks in MPICH, nor a good rationale for
having them in Open MPI. At this stage, I can only consider this legacy
code that should simply be removed.
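
To illustrate what I mean, the pattern boils down to the following; this is
a simplified sketch with hypothetical names (romio_write_all stands in for
the underlying ROMIO call), not the actual ompi/mca/io/romio source:

    #include <mpi.h>
    #include <pthread.h>

    /* one process-wide mutex serializing every call into ROMIO */
    static pthread_mutex_t romio_global_lock = PTHREAD_MUTEX_INITIALIZER;

    /* the underlying ROMIO collective (hypothetical name) */
    extern int romio_write_all(MPI_File fh, const void *buf, int count,
                               MPI_Datatype type, MPI_Status *status);

    int glue_file_write_all(MPI_File fh, const void *buf, int count,
                            MPI_Datatype type, MPI_Status *status)
    {
        int rc;
        pthread_mutex_lock(&romio_global_lock);
        /* the collective blocks until every rank joins it, while this
         * thread keeps holding the process-wide lock; if a thread on
         * another rank grabbed its lock for a different collective
         * first, the two collectives never match and everyone hangs */
        rc = romio_write_all(fh, buf, count, type, status);
        pthread_mutex_unlock(&romio_global_lock);
        return rc;
    }

With MPI_THREAD_MULTIPLE and several threads doing collective I/O on
distinct communicators, ranks can take this lock for different collectives,
and every rank then waits forever inside the glue.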

Does anyone remember why we used global locks in the first place?
Is there any rationale for using global locks (or even finer-grained
locking) in the ROMIO glue?
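
For reference, below is the kind of program that hangs: a minimal sketch
along the lines of Sebastian's reproducer (my own reconstruction, since his
attachment is not part of this thread, and untested beyond the scenario it
describes). Each thread does collective I/O on its own communicator and
file:

    #include <mpi.h>
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 2

    /* one communicator per thread, duplicated serially in main() so the
     * MPI_Comm_dup collectives are consistently ordered on all ranks */
    static MPI_Comm comms[NTHREADS];

    static void *do_io(void *arg)
    {
        int id = *(int *)arg;
        int rank;
        char filename[64];
        MPI_File fh;

        MPI_Comm_rank(comms[id], &rank);
        snprintf(filename, sizeof(filename), "testfile.%d", id);

        MPI_File_open(comms[id], filename,
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* collective write: with a process-wide lock around the ROMIO
         * glue, rank 0 can sit in the collective for testfile.0 while
         * rank 1 holds its lock for testfile.1, so neither completes */
        MPI_File_write_at_all(fh, (MPI_Offset)(rank * sizeof(int)),
                              &rank, 1, MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
        return NULL;
    }

    int main(int argc, char **argv)
    {
        int provided, i, ids[NTHREADS];
        pthread_t threads[NTHREADS];

        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            MPI_Abort(MPI_COMM_WORLD, 1);

        for (i = 0; i < NTHREADS; i++)
            MPI_Comm_dup(MPI_COMM_WORLD, &comms[i]);

        for (i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, do_io, &ids[i]);
        }
        for (i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);

        for (i = 0; i < NTHREADS; i++)
            MPI_Comm_free(&comms[i]);
        MPI_Finalize();
        return 0;
    }

Compile with mpicc -pthread and run with two or more tasks: with the
global lock in place this can hang, without it it completes.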

Cheers,

Gilles

---------- Forwarded message ----------
From: *Sebastian Rettenberger* <rette...@in.tum.de>
Date: Tuesday, March 29, 2016
Subject: [OMPI users] Collective MPI-IO + MPI_THREAD_MULTIPLE
To: Open MPI Users <us...@open-mpi.org>


Hi,

Thanks, the patch works for me. I will run some further tests and report
back if I find any other problems.

Best regards,
Sebastian

On 03/25/2016 01:58 AM, Gilles Gouaillardet wrote:

> Sebastian,
>
> At first glance, the global lock in the ROMIO glue is not necessary.
>
> Feel free to give the attached patch a try
> (it works with your example, but I have not tested it any further).
>
> Cheers,
>
> Gilles
>
>
> On 3/25/2016 9:26 AM, Gilles Gouaillardet wrote:
>
>> Sebastian,
>>
>> Thanks for the info.
>>
>> Bottom line: the global lock is in the Open MPI glue for ROMIO.
>>
>> I will check what kind of locking (if any) is done in MPICH.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 3/24/2016 11:30 PM, Sebastian Rettenberger wrote:
>>
>>> Hi,
>>>
>>> I tested this on my desktop machine, so one node and two tasks.
>>> The deadlock appears on the local file system and on the NFS mount.
>>>
>>> The MPICH version I tested was 3.2.
>>>
>>> However, as far as I know, locking is part of the MPI library and not
>>> ROMIO.
>>>
>>> Best regards,
>>> Sebastian
>>>
>>> On 03/24/2016 03:19 PM, Gilles Gouaillardet wrote:
>>>
>>>> Sebastian,
>>>>
>>>> In Open MPI 1.10, the default I/O component is ROMIO, taken from
>>>> MPICH 3.0.4.
>>>>
>>>> How many tasks, how many nodes, and which file system are you
>>>> running on?
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On Thursday, March 24, 2016, Sebastian Rettenberger <rette...@in.tum.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I tried to run the attached program with Open MPI. It works well
>>>>> with MPICH and Intel MPI, but I get a deadlock when using Open MPI.
>>>>> I am using Open MPI 1.10.0 with support for MPI_THREAD_MULTIPLE.
>>>>>
>>>>> It seems that ROMIO uses global locks in Open MPI, which is a problem
>>>>> when multiple threads want to do collective I/O.
>>>>>
>>>>> Any idea how one can get around this issue?
>>>>>
>>>>> Best regards,
>>>>> Sebastian
>>>>>
>>>>> --
>>>>> Sebastian Rettenberger, M.Sc.
>>>>> Technische Universität München
>>>>> Department of Informatics
>>>>> Chair of Scientific Computing
>>>>> Boltzmannstrasse 3, 85748 Garching, Germany
>>>>> http://www5.in.tum.de/
>>>>>
-- 
Sebastian Rettenberger, M.Sc.
Technische Universität München
Department of Informatics
Chair of Scientific Computing
Boltzmannstrasse 3, 85748 Garching, Germany
http://www5.in.tum.de/
