I'm using OpenMPI 1.8.3, which was released in September 2014.  I'll try some other versions.

Do you happen to know what the bug is (or a good Google term for finding it)?

ajk

Aaron Kitzmiller
Informatics and Scientific Applications
[email protected]



> On Jul 24, 2015, at 12:42 PM, Matthew Knepley <[email protected]> wrote:
> 
> On Fri, Jul 24, 2015 at 11:36 AM, Aaron Kitzmiller <[email protected]> wrote:
> futex is a Linux system call used for locking shared resources.
> 
> It could be indicative of an MPI problem.  I wouldn't be surprised.  If 
> anyone has any idea how to get around it, that would be great.  We have dozens 
> of applications on our compute cluster that use MPI, and this version is our 
> default.  I'm wondering whether something specific to the mix of MPI flavor, 
> compiler, etc. could be going on here.
> 
> Yes, this is a bug in OpenMPI that has been open for years.
> 
> Can you please switch to MPICH and try another test? I thought the newest 
> version of OpenMPI had fixed this, but maybe you are using an older release.
> 
>   Thanks,
> 
>     Matt
>  
> This is the gdb stack trace:
> 
> #0  0x00000039c6a0e264 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00000039c6a09508 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00000039c6a093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00002aaaaf13ddd4 in opal_mutex_lock (attr_hash=0x2aaaaf651c70, key=128, attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at ../opal/threads/mutex_unix.h:104
> #4  ompi_attr_get_c (attr_hash=0x2aaaaf651c70, key=128, attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at attribute/attribute.c:758
> #5  0x00002aaaaf17080e in PMPI_Attr_get (comm=0x2aaaaf651c70, keyval=128, attribute_val=0x7fffffffc200, flag=0xffffffffffffffff)
>     at pattr_get.c:61
> #6  0x00002aaaaacad0b3 in Petsc_DelComm_Outer (comm=0x2aaaaf6d4140, keyval=13, attr_val=0x7af160, extra_state=0x0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/pinit.c:409
> #7  0x00002aaaaf13f1a4 in ompi_attr_delete_impl (type=2942639216, object=0x80, attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:970
> #8  0x00002aaaaf13ee02 in ompi_attr_delete (type=2942639216, object=0x80, attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:1019
> #9  0x00002aaaaf170710 in PMPI_Attr_delete (comm=0x2aaaaf651c70, keyval=128) at pattr_delete.c:59
> #10 0x00002aaaaac61848 in PetscCommDestroy (comm=0x888cf0) at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/tagm.c:256
> #11 0x00002aaaaac6a273 in PetscHeaderDestroy_Private (h=0x888ce0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/inherit.c:121
> #12 0x00002aaaaaf51512 in VecDestroy (v=0x7fffffffcbd0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/vec/vec/interface/vector.c:434
> #13 0x00002aaaab9c5c7f in DMSetUp_DA_2D (da=0x87b1b0) at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:776
> #14 0x00002aaaaba73bfd in DMSetUp_DA (da=0x87b1b0) at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/dareg.c:25
> #15 0x00002aaaab93399a in DMSetUp (dm=0x87b1b0) at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/interface/dm.c:560
> #16 0x00002aaaab9c6941 in DMDACreate2d (comm=0x2aaaaf6d45c0, bx=DM_BOUNDARY_NONE, by=DM_BOUNDARY_NONE, stencil_type=DMDA_STENCIL_STAR, M=-4, N=-4, m=-1, n=-1, dof=1, s=1, lx=0x0, ly=0x0, da=0x7fffffffd668)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:862
> #17 0x00000000004023d0 in main (argc=1, argv=0x7fffffffd8c8)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/snes/examples/tutorials/ex5.c:116
> 
> 
> Aaron Kitzmiller
> Informatics and Scientific Applications
> [email protected]
> 
> 
> 
>> On Jul 24, 2015, at 12:18 PM, Matthew Knepley <[email protected]> wrote:
>> 
>> On Fri, Jul 24, 2015 at 11:17 AM, Matthew Knepley <[email protected]> wrote:
>> On Fri, Jul 24, 2015 at 11:09 AM, Aaron Kitzmiller <[email protected]> wrote:
>> Doesn't run.  Hangs just like the tests do.
>> 
>> I doubt it's helpful, but when I run it under strace, it hangs on a "futex" 
>> call.  The last vaguely informative thing before that was an attempt to read 
>> the non-existent .petscrc.
>> 
>> Run in the debugger and get a stack trace.
>> 
>> Also futex does not appear in the PETSc source:
>> 
>>    knepley/feature-snes-deflation *+$|MERGING:/PETSc3/petsc/petsc-dev$ find src -name "*.c" | xargs grep futex
>>    find src -name "*.c" | xargs grep futex
>> 
>> You have an MPI problem.
>> 
>>    Matt
>>  
>>   Matt
>>  
>> ajk
>> 
>> Aaron Kitzmiller
>> Informatics and Scientific Applications
>> [email protected]
>> 
>> 
>> 
>>> On Jul 24, 2015, at 11:21 AM, Matthew Knepley <[email protected]> wrote:
>>> 
>>>   ./ex5 -snes_monitor
>> 
>> 
>> 
>> 
>> -- 
>> What most experimenters take for granted before they begin their experiments 
>> is infinitely more interesting than any results to which their experiments 
>> lead.
>> -- Norbert Wiener
>> 
> 
