I'm using OpenMPI 1.8.3, which is from Sept 2014. I'll try some others. Do you happen to know what the bug is (or a good Google term for finding it)?

ajk

Aaron Kitzmiller
Informatics and Scientific Applications
[email protected]

> On Jul 24, 2015, at 12:42 PM, Matthew Knepley <[email protected]> wrote:
>
> On Fri, Jul 24, 2015 at 11:36 AM, Aaron Kitzmiller <[email protected]> wrote:
> futex is a Linux system call used for locking shared resources.
>
> It could be indicative of an MPI problem. I wouldn't be surprised. If
> anyone has any idea how to get around it that would be great. We have dozens
> of applications on our compute cluster that use MPI, this version being our
> default. I'm wondering if there is something specific to the mix of MPI
> flavor / compiler, etc. that could be going on here.
>
> Yes, this is a bug in OpenMPI that has been open for years.
>
> Can you please switch to MPICH and try another test? I thought the newest
> version of OpenMPI had fixed this, but maybe you are using an older release.
>
> Thanks,
>
> Matt
>
> This is the gdb stack trace:
>
> #0 0x00000039c6a0e264 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1 0x00000039c6a09508 in _L_lock_854 () from /lib64/libpthread.so.0
> #2 0x00000039c6a093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3 0x00002aaaaf13ddd4 in opal_mutex_lock (attr_hash=0x2aaaaf651c70, key=128,
>     attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at ../opal/threads/mutex_unix.h:104
> #4 ompi_attr_get_c (attr_hash=0x2aaaaf651c70, key=128,
>     attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at attribute/attribute.c:758
> #5 0x00002aaaaf17080e in PMPI_Attr_get (comm=0x2aaaaf651c70, keyval=128,
>     attribute_val=0x7fffffffc200, flag=0xffffffffffffffff)
>     at pattr_get.c:61
> #6 0x00002aaaaacad0b3 in Petsc_DelComm_Outer (comm=0x2aaaaf6d4140,
>     keyval=13, attr_val=0x7af160, extra_state=0x0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/pinit.c:409
> #7 0x00002aaaaf13f1a4 in ompi_attr_delete_impl (type=2942639216,
>     object=0x80, attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:970
> #8 0x00002aaaaf13ee02 in ompi_attr_delete (type=2942639216, object=0x80,
>     attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:1019
> #9 0x00002aaaaf170710 in PMPI_Attr_delete (comm=0x2aaaaf651c70, keyval=128)
>     at pattr_delete.c:59
> #10 0x00002aaaaac61848 in PetscCommDestroy (comm=0x888cf0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/tagm.c:256
> #11 0x00002aaaaac6a273 in PetscHeaderDestroy_Private (h=0x888ce0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/inherit.c:121
> #12 0x00002aaaaaf51512 in VecDestroy (v=0x7fffffffcbd0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/vec/vec/interface/vector.c:434
> #13 0x00002aaaab9c5c7f in DMSetUp_DA_2D (da=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:776
> #14 0x00002aaaaba73bfd in DMSetUp_DA (da=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/dareg.c:25
> #15 0x00002aaaab93399a in DMSetUp (dm=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/interface/dm.c:560
> #16 0x00002aaaab9c6941 in DMDACreate2d (comm=0x2aaaaf6d45c0,
>     bx=DM_BOUNDARY_NONE, by=DM_BOUNDARY_NONE,
>     stencil_type=DMDA_STENCIL_STAR, M=-4, N=-4, m=-1, n=-1, dof=1, s=1,
>     lx=0x0, ly=0x0, da=0x7fffffffd668)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:862
> #17 0x00000000004023d0 in main (argc=1, argv=0x7fffffffd8c8)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/snes/examples/tutorials/ex5.c:116
>
> Aaron Kitzmiller
> Informatics and Scientific Applications
> [email protected]
>
>> On Jul 24, 2015, at 12:18 PM, Matthew Knepley <[email protected]> wrote:
>>
>> On Fri, Jul 24, 2015 at 11:17 AM, Matthew Knepley <[email protected]> wrote:
>> On Fri, Jul 24, 2015 at 11:09 AM, Aaron Kitzmiller <[email protected]> wrote:
>> Doesn't run. Hangs just like the tests do.
>>
>> I doubt it's helpful, but when I run it under strace, it hangs on a "futex".
>> The last thing vaguely informative was an attempt to read the non-existent
>> .petscrc.
>>
>> Run in the debugger and get a stack trace.
>>
>> Also futex does not appear in the PETSc source:
>>
>> knepley/feature-snes-deflation *+$|MERGING:/PETSc3/petsc/petsc-dev$ find src -name "*.c" | xargs grep futex
>> find src -name "*.c" | xargs grep futex
>>
>> You have an MPI problem.
>>
>> Matt
>>
>> Matt
>>
>> ajk
>>
>> Aaron Kitzmiller
>> Informatics and Scientific Applications
>> [email protected]
>>
>>> On Jul 24, 2015, at 11:21 AM, Matthew Knepley <[email protected]> wrote:
>>>
>>> ./ex5 -snes_monitor
>>
>> --
>> What most experimenters take for granted before they begin their experiments
>> is infinitely more interesting than any results to which their experiments
>> lead.
>> -- Norbert Wiener
