Barry, back in May, we discussed in the commit comments how 'barry/improve-memory-logging' would leave dangling references.
https://bitbucket.org/petsc/petsc/commits/27b6d19d01a4b6521ae8cc73b0b6b8ae1e4130c4?at=barry/improve-memory-logging [The reference there to an issue should be to #43, not #33. https://bitbucket.org/petsc/petsc/issue/44/parent-object-keeps-linked-list-of] I now see that you merged to 'next' in commit 58851192db7fb3a0d13a9d343e7fb3758861e5c5 Merge: 5004532 8990d58 Author: Barry Smith <[email protected]> Date: Sat Aug 17 16:58:27 2013 -0500 Merge branch 'barry/improve-memory-logging' into next and merged to 'master' 45 minutes later (no nightly tests) commit 9b9212943afab704d2d52b88268833e129aca788 Merge: ac1d125 8990d58 Author: Barry Smith <[email protected]> Date: Sat Aug 17 17:45:43 2013 -0500 Merge branch 'barry/improve-memory-logging' However, this is *totally* broken for the reasons discussed in the commit comments above. I encountered this with GAMG Classical, but it can occur in many places. Here is a patch that demonstrates the breakage. diff --git i/src/ksp/ksp/examples/tutorials/ex2.c w/src/ksp/ksp/examples/tutorials/ex2.c index 49200de..5f588c7 100644 --- i/src/ksp/ksp/examples/tutorials/ex2.c +++ w/src/ksp/ksp/examples/tutorials/ex2.c @@ -163,6 +163,15 @@ int main(int argc,char **args) */ ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); + { + KSP eksp; + PC pc; + ierr = KSPCreate(PETSC_COMM_WORLD,&eksp);CHKERRQ(ierr); + ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); + ierr = KSPSetPC(eksp,pc);CHKERRQ(ierr); + ierr = KSPDestroy(&eksp);CHKERRQ(ierr); + } + /* Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix. This part actually completes successfully and all is well as far as reference counting is concerned, but ((PetscObject)pc)->parent now points at the old memory for eksp, leading to SEGV the next time PetscLogObjectMemory or PetscLogObjectParent is called. ==10082== Invalid read of size 8 ==10082== at 0x4E8CB27: PetscLogObjectMemory (plog.c:36) ==10082== by 0x5B92EA1: PCCreate_ICC (icc.c:183) ==10082== by 0x5CB48CE: PCSetType (pcset.c:88) ==10082== by 0x5CB58B8: PCSetFromOptions (pcset.c:168) ==10082== by 0x5D89630: KSPSetFromOptions (itcl.c:357) ==10082== by 0x4032F8: main (ex2.c:202) ==10082== Address 0xc736400 is 48 bytes inside a block of size 1,000 free'd ==10082== at 0x4C288AC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==10082== by 0x4EDD7D3: PetscFreeAlign (mal.c:70) ==10082== by 0x5D9B8BE: KSPDestroy (itfunc.c:795) ==10082== by 0x4030EB: main (ex2.c:172) PetscErrorCode PetscLogObjectMemory(PetscObject p,PetscLogDouble m) { p->mem += m; p = p->parent; while (p) { /* Create all ancestors with the memory */ p->memchildren += m; /* <---- Line 36 */ p = p->parent; } return 0; } I still think this is really complicated to do right and will inevitably lead to subtle non-local bugs. In the general case, objects can have multiple parents and multiple children. How would we avoid over-counting? How do we make the addition or removal of parent-child relationships commutative in the sense that the memory logging would be the same? PetscErrorCode PetscLogObjectParent(PetscObject p,PetscObject c) { PetscObject pp = p; if (!c || !p) return 0; while (!c->parent && pp) { /* if not credited elsewhere credit all childs memory to all new ancestors */ pp->memchildren += c->mem + c->memchildren; pp = pp->parent; } c->parent = p; c->parentid = p->id; return 0; } Can we revert this parent traversal on 'master' for now? (I don't like that the old code stores a pointer to the parent at all, but it was benign since it was never used.)
pgpRzXqy4dDHu.pgp
Description: PGP signature
