On Wed, Mar 16, 2016 at 1:21 AM, Tobin Isaac <[email protected]> wrote:
> On Tue, Mar 15, 2016 at 11:47:53PM -0500, Barry Smith wrote:
> >
> > This is a really nasty problem. The example as previously written was
> > completely reasonable, so your fix is a total hack :-). All the circular
> > reference counting in PETSc is problematic because it is so dependent on
> > exactly the details of how each particular object and its relationships are
> > handled.
>
> I agree that the need to call VecSetDM() in that case is bad, and it
> stems from assuming that the recycled vectors reference the dm: if
> we're going to count circular references, we should actually count
> them instead of assuming they exist.
>
> Where I added DMDestroy() in the Coarsen() routine, however, was in
> line with the kind of code we typically expect from users.
>
> >
> > Do we really need to even allow these nasty circular relationships to
> > exist? What would we lose if we, for example, removed the two-way
> > relationships between the DMs and the Vecs? Just a little efficiency in not
> > needing to create new Vecs because we can recycle them? But at the cost of
> > very difficult to debug code that "should just work?" Similarly the nasty
> > circular dependencies with dm->coarseMesh are done for "efficiency", is
> > there a way to keep the efficiency but not the tricky dependencies?
>
> I introduced dm->fineMesh, and I'll consider removing it, but having
> both dm->coarseMesh and dm->fineMesh references is about more than
> just efficiency. Particularly with the inverted multigrid that
> everyone's working on, there are workflows where it is more natural
> for the user to just maintain a handle on the coarsest mesh, not the
> finest mesh.

I think all the references here are completely appropriate. I don't see
another way of making many things work than to have the DM know its pool
of named vectors. I think it may be that our simplistic reference counting
scheme is at fault.

However, in this case, I think it's clear that your function violated the
implied contract for DMCreateGlobal/LocalVector(). The documentation should
state that the returned vectors need to have their DM set to that DM.

   Matt

> >
> > I accept your "fix", thanks for figuring it out so quickly! but don't
> > like it :-).
> >
> > Barry
> >
> > > On Mar 15, 2016, at 11:30 PM, Tobin Isaac <[email protected]> wrote:
> > >
> > > I pushed a fix. There's a long explanation in the commit message:
> > > while this could be called user error, the cycle counting isn't very
> > > robust and should probably be changed.
> > >
> > > Toby
> > >
> > > On Tue, Mar 15, 2016 at 09:54:53PM -0500, Barry Smith wrote:
> > >>
> > >> Dang, dang, dang, I can't believe I fell for that git trapdoor. Ok
> > >> pushed now.
> > >>
> > >> Barry
> > >>
> > >>> On Mar 15, 2016, at 9:46 PM, Tobin Isaac <[email protected]> wrote:
> > >>>
> > >>> Barry, please check in ex65.c
> > >>>
> > >>> On Sun, Mar 13, 2016 at 04:20:06PM -0500, Barry Smith wrote:
> > >>>>
> > >>>> Toby,
> > >>>>
> > >>>> I'm trying to put together a very simple but complete DMSHELL
> > >>>> example for [email protected] and having some trouble which I think
> > >>>> might point to a bug or logical error in the code you wrote for
> > >>>> maintaining dm->coarseMesh and dm->fineMesh and stuff.
> > >>>>
> > >>>> $ petscmpiexec -valgrind -n 1 ./ex65 -pc_type mg -pc_mg_levels 2
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x101221BBE: KSPDestroy (itfunc.c:985)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A914C4: DMDestroy (dm.c:696)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x1002319B4: PetscCheckPointer (checkptr.c:106)
> > >>>> ==80209==    by 0x100A8F5C6: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x100A8F630: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x100A8F641: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > >>>> [0]PETSC ERROR: Invalid argument
> > >>>> [0]PETSC ERROR: Wrong type of object: Parameter # 1
> > >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >>>> [0]PETSC ERROR: Petsc Development GIT revision: pre-tsfc-829-g3974c78 GIT Date: 2016-03-11 17:51:48 -0600
> > >>>> [0]PETSC ERROR: ./ex65 on a arch-basic named Barrys-MacBook-Pro.local by barrysmith Sun Mar 13 16:13:10 2016
> > >>>> [0]PETSC ERROR: Configure options --with-mpi-dir=/Users/barrysmith/PetscLibraries PETSC_ARCH=arch-basic
> > >>>> [0]PETSC ERROR: #1 DMDestroy() line 570 in /Users/barrysmith/Src/petsc/src/dm/interface/dm.c
> > >>>> [0]PETSC ERROR: #2 DMDestroy() line 699 in /Users/barrysmith/Src/petsc/src/dm/interface/dm.c
> > >>>> [0]PETSC ERROR: #3 main() line 67 in /Users/barrysmith/Src/petsc/src/ksp/ksp/examples/tutorials/ex65.c
> > >>>> [0]PETSC ERROR: PETSc Option Table entries:
> > >>>> [0]PETSC ERROR: -malloc_test
> > >>>> [0]PETSC ERROR: -pc_mg_levels 2
> > >>>> [0]PETSC ERROR: -pc_type mg
> > >>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
> > >>>>
> > >>>> The code is in the branch barry/add-dmshellcreaterestriction
> > >>>> src/ksp/ksp/examples/tutorials/ex65.c which creates a DMSHELL that just
> > >>>> uses an inner DMDA1 to create the objects. The code is virtually identical
> > >>>> to ex25.c which just uses the DMDA1d directly but does not crash. It seems
> > >>>> to me that having the DM objects be shells instead of DMDA should make
> > >>>> absolutely no difference in your logic for tracking dm->coarseMesh etc but
> > >>>> somehow something is fishy!!!! I could have a mistake in my example code
> > >>>> but I do not think so.
> > >>>>
> > >>>> Could you please take a look at the problem, feel free to add
> > >>>> fixes directly to the branch.
> > >>>>
> > >>>> Thanks
> > >>>>
> > >>>> Barry
> > >>>>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
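
The counting issue discussed in this thread (assuming that every recycled vector references the dm versus actually counting those references) can be illustrated with a small toy model. This is a sketch only and is not PETSc code: ToyDM, ToyVec, and every toy_* function are hypothetical names invented for illustration, and the model assumes that a vector's back-pointer holds one counted reference on its DM.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

struct ToyDM;

/* Hypothetical stand-in for a Vec: it may or may not point back at a DM. */
typedef struct ToyVec {
  struct ToyDM *dm; /* back-reference, NULL if none */
} ToyVec;

/* Hypothetical stand-in for a DM that keeps a pool of recycled vectors. */
typedef struct ToyDM {
  int     refct;           /* all references currently held on this DM */
  ToyVec *pool[POOL_SIZE]; /* recycled vectors, NULL where a slot is empty */
} ToyDM;

/* Point v back at dm, moving the one reference that the back-pointer holds. */
static void toy_vec_set_dm(ToyVec *v, ToyDM *dm)
{
  if (v->dm == dm) return;
  if (v->dm) v->dm->refct--;
  v->dm = dm;
  if (dm) dm->refct++;
}

/* Non-cyclic count that ASSUMES every pooled vector references dm. */
static int toy_noncyclic_assumed(const ToyDM *dm)
{
  int i, cyclic = 0;
  for (i = 0; i < POOL_SIZE; i++) {
    if (dm->pool[i]) cyclic++;
  }
  return dm->refct - cyclic;
}

/* Non-cyclic count that only subtracts vectors that really point at dm. */
static int toy_noncyclic_counted(const ToyDM *dm)
{
  int i, cyclic = 0;
  for (i = 0; i < POOL_SIZE; i++) {
    if (dm->pool[i] && dm->pool[i]->dm == dm) cyclic++;
  }
  return dm->refct - cyclic;
}

/* A shell DM pools a vector that was created by, and points at, an inner DM. */
int toy_demo(void)
{
  ToyDM  inner = {1, {NULL}}; /* one external holder */
  ToyDM  shell = {1, {NULL}}; /* one external holder */
  ToyVec v     = {NULL};

  toy_vec_set_dm(&v, &inner); /* the vector references inner, not shell */
  shell.pool[0] = &v;         /* shell recycles it anyway */

  /* The assumption hides the external holder: shell looks unreferenced. */
  assert(toy_noncyclic_assumed(&shell) == 0);
  /* Counting real back-references still sees the external holder. */
  assert(toy_noncyclic_counted(&shell) == 1);

  /* Setting the vector's DM to the shell makes both counts agree. */
  toy_vec_set_dm(&v, &shell);
  assert(toy_noncyclic_assumed(&shell) == 1);
  assert(toy_noncyclic_counted(&shell) == 1);
  return 0;
}
```

In this toy, a destroy routine that frees the DM when the non-cyclic count reaches zero would free the shell while its external holder still has a handle on it, which is the kind of premature free that later shows up as reads inside a freed block. Whether PETSc should count actual back-references or document the requirement on the returned vectors is the open question in the thread; the sketch takes no position on that.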
