The osc hang is fixed by a PR that fixes bugs in start in cm and ob1. See #1729.

-Nathan

> On Jun 2, 2016, at 5:17 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> 
> fwiw,
> 
> the onesided/c_fence_lock test from the IBM test suite hangs
> 
> (mpirun -np 2 ./c_fence_lock)
> 
> I ran a git bisect and it points to commit 
> b90c83840f472de3219b87cd7e1a364eec5c5a29
> 
> commit b90c83840f472de3219b87cd7e1a364eec5c5a29
> Author: bosilca <bosi...@users.noreply.github.com>
> Date:   Tue May 24 18:20:51 2016 -0500
> 
>     Refactor the request completion (#1422)
>     
>     * Remodel the request.
>     Added the wait sync primitive and integrate it into the PML and MTL
>     infrastructure. The multi-threaded requests are now significantly
>     less heavy and less noisy (only the threads associated with completed
>     requests are signaled).
>     
>     * Fix the condition to release the request.
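> 
> (For reference, a minimal sketch of such a bisect run; the last-known-good
> commit below is a placeholder, not the commit actually used:)
> 
> git bisect start
> git bisect bad HEAD            # tip where the test hangs
> git bisect good <known-good>   # placeholder: last commit known to pass
> # at each step git proposes, rebuild and rerun the reproducer:
> mpirun -np 2 ./c_fence_lock    # hang => "git bisect bad", clean exit => "git bisect good"
> git bisect reset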
> 
> 
> 
> 
> I also noted that a warning is emitted when running with only one task
> 
> ./c_fence_lock
> 
> but I did not git bisect that case, so it might not be related
> 
> Cheers,
> 
> 
> 
> Gilles
> 
> 
>> On Thursday, June 2, 2016, Ralph Castain <r...@open-mpi.org> wrote:
>> Yes, please! I'd like to know what mpirun thinks is happening - if you like, 
>> just set the --timeout N --report-state-on-timeout flags and tell me what 
>> comes out.
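>> 
>> (A minimal sketch of such an invocation, using the reproducer from this 
>> thread; the 60-second timeout value is only an example:)
>> 
>> mpirun --timeout 60 --report-state-on-timeout -np 2 ./c_fence_lock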
>> 
>>> On Jun 1, 2016, at 7:57 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>> 
>>> I don't think it matters. I was running the IBM collective and pt2pt tests, 
>>> but each time it deadlocked it was in a different test. If you are interested 
>>> in some particular values, I would be happy to attach a debugger next time 
>>> it happens.
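>>> 
>>> (A minimal sketch of attaching a debugger to the hung mpirun on OS X; the 
>>> pgrep pattern is only an example way to find the PID:)
>>> 
>>> lldb -p $(pgrep -n mpirun)    # attach to the stuck mpirun
>>> (lldb) thread backtrace all   # see where every thread is waiting
>>> (lldb) detach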
>>> 
>>>   George.
>>> 
>>> 
>>>> On Wed, Jun 1, 2016 at 10:47 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> What kind of apps are they? Or does it matter what you are running?
>>>> 
>>>> 
>>>> > On Jun 1, 2016, at 7:37 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>> >
>>>> > I have a rarely occurring deadlock on an OS X laptop if I use more than 
>>>> > 2 processes. It comes up once every 200 runs or so.
>>>> >
>>>> > Here is what I could gather from my experiments: All the MPI processes 
>>>> > seem to have correctly completed (I get all the expected output and the 
>>>> > MPI processes are in a waiting state), but somehow the mpirun does not 
>>>> > detect their completion. As a result, mpirun never returns.
>>>> >
>>>> >   George.
>>>> >
