Re: [Jsource] Reference counts (and question for Roger)

Marshall Lochbaum Fri, 20 May 2016 07:07:06 -0700

It works, assuming the function that calls gc is rewritten so that it
copies, derecs, and pops elements before adding them to w.


It's also possible to write a version of gc that works without rewrites,
by calling derec on all of w's children rather than w. But I would
rather not write that code at all if it isn't needed.

Marshall

On Fri, May 20, 2016 at 07:44:42AM -0400, Henry Rich wrote:
> gc() is
> 
> A jtgc (J jt,A w,I old){ra(w); tpop(old); R tpush(w);}
> 
> Would it be valid to move the ra(w) to before the loop, and the
> tpush(w) to after?  That would save overhead; I don't know if it
> would help with the problem.
> 
> Henry
> 
> On 5/17/2016 6:25 PM, Marshall Lochbaum wrote:
> >I've tracked down one problem with the new reference count system, which
> >accounts for all but 10 (exactly) test failures.
> >
> >A number of spots in the J source follow this pattern, or similar:
> >   store the current stack top in variable old.
> >   for(...) {
> >     add something to boxed array w.
> >     call gc(w,old), putting w in non-recursive form.
> >   }
> >
> >(A particular example is cp.c, lines 91--97, which performs (v^:l x)
> >where l is an integer list).
> >
> >This violates our requirement 1, "Descendants are only added to objects
> >marked AFREC." Under the new system, w is no longer recursive after the
> >first gc call. Subsequent calls copy w by adding one to its reference
> >count, but fail to copy the added descendants. When the stack is popped,
> >they are destroyed.
> >
> >But why is the gc call even needed in the first place? It clears unused
> >things on the stack between calls, sure. But that seems like a weak
> >justification since most of these loops are pretty small. The functions
> >they call should do this stack-clearing for them as well--either with
> >EPILOG, or simply because they don't use temporary variables. Replacing
> >these calls to gc and gc3 with noops fixes the errors, but I could see
> >it causing performance problems.
> >
> >Is there another purpose that I'm missing? Is it safe to remove these
> >calls?
> >
> >Marshall
> >
> >On Wed, May 11, 2016 at 07:05:54PM -0700, chris burke wrote:
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 29 April 2016 at 09:51
> >>To: [email protected]
> >>
> >>
> >>I am currently working on changing J's current reference counting scheme
> >>to a much more efficient and standard method.
> >>
> >>J stores a reference count in every object (AC(w)) indicating the number
> >>of times it is referenced. The count is incremented when a copy is made
> >>and decremented when one is deleted. Once the reference count drops to
> >>zero, J knows the object can be freed, and does so.
> >>
> >>Currently, J uses an unusual recursive procedure for reference counts.
> >>If a boxed array (or verb with arguments, or other object with children)
> >>is copied, then the reference counts of all its children (items in
> >>boxes) are incremented, and so on recursively. This makes copies of
> >>composite objects take a long time.
> >>
> >>The typical procedure is to treat the parent/child relationship as a
> >>single reference. Thus the reference counts of children are incremented
> >>when they are added to the parent, and decremented when the parent is
> >>deleted, and otherwise not changed. To copy a composite object, simply
> >>increment its reference count, and to delete, decrement it. I am
> >>currently trying to convince the J engine to use this procedure.
> >>
> >>The hangup is another idiosyncratic J feature, which in contrast to the
> >>weird reference count scheme is actually useful. J maintains a stack of
> >>local nouns, whose function (but not implementation) is equivalent to
> >>that of C's stack. Any time an object is created, a pointer to it is
> >>pushed to the stack. Thus J functions can ignore all the reference count
> >>arithmetic (which from experience I know is quite painful) while using
> >>objects. Instead, they call PROLOG, which saves the location of the
> >>beginning of the stack, before doing anything, and EPILOG(z), where z is
> >>the value to be returned, after everything. EPILOG copies z, deletes
> >>everything added to the stack since PROLOG was called, and returns z.
> >>
> >>A complication arises if z is a composite object. In this case we want
> >>to avoid deleting any of z's children, despite the fact that they may be
> >>on the stack. In the current paradigm this is automatic: on copying z,
> >>all of its children are also copied, and deleting the copies on the
> >>stack leaves those copies. But if incrementing the reference count is
> >>not recursive, this doesn't happen, and the children are deleted,
> >>leaving z with dangling pointers.
> >>
> >>In a sense, the problem arises because when verbs connect z to its
> >>children, they don't bother to inform the children. The reference from z
> >>to its children cannot be reflected in their reference counts. So the
> >>correct fix might be to increment the reference counts as they are added
> >>to z. This is not feasible: the places where children are added to
> >>objects just look like assignments, and are impossible to detect
> >>automatically.
> >>
> >>There is a workaround: EPILOG should do the pointer-incrementing job for
> >>us. So rather than a simple copy of z, it actually goes through all of
> >>z's children and increments pointers. Specifically, it should increment
> >>the descendents which are children of objects on the stack. This
> >>requires an extra flag: AFSTK, which indicates whether an object is on
> >>the stack.
> >>
> >>I have implemented all the above, and am still getting bugs. I think
> >>that I have made all the necessary architecture changes, and that the
> >>remainder is just a few spots that relied in some way on the old system,
> >>but it's impossible to be sure how much work there is left to do. I will
> >>update on future progress.
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 29 April 2016 at 12:40
> >>To: [email protected]
> >>
> >>
> >>Your workaround seems sound.  Let me know what you find the problems are.
> >>
> >>To see if I understand the situation, would this work: keep a long long
> >>counter of PROLOGs, giving each PROLOG a unique number; store the current
> >>number in the block header when a block is created.  Then at EPILOG,
> >>increment use-counts in the children of Z that have the stored value
> >>matching the current number.  This would add 8 bytes to the header, so I
> >>don't like it, but it would keep you from having to match children-of-z
> >>with children-of-the-noun-stack.
> >>
> >>Henry
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 29 April 2016 at 17:18
> >>To: [email protected]
> >>
> >>
> >>Is one way to think of this problem that the use-counts in z need to be the
> >>old recursive style, so that they can be freed by the epilogue?
> >>
> >>If so, a solution would be to traverse z, converting its use-counts to
> >>recursive style (and incrementing - I guess that's what you referred to as
> >>'copying'); decrement them in recursive style; then traverse again,
> >>converting back to nonrecursive style.
> >>
> >>Would that work?
> >>
> >>If so, a more elegant solution would be to have two use counts: a count
> >>that has yet to be propagated to descendants, and a count that has been
> >>propagated.  The use-count routines would handle these correctly.  EPILOG
> >>would traverse to push unpropagated use-count to descendants, and then free
> >>the blocks as before.  The conversion back to nonrecursive style would not
> >>be required, because both styles would coexist.
> >>
> >>Implementation: I have my eye on the AF() field, which seems to have just 4
> >>bits assigned - is that right?  Reserve 6 bits for flags, and use the upper
> >>bits for the unpropagated usecount.  At the same time, make the
> >>corresponding change to AC(), leaving 6 bits of flags and the rest
> >>propagated usecount.
> >>
> >>
> >>Henry
> >>
> >>
> >>On 4/29/2016 12:51 PM, Marshall Lochbaum wrote:
> >>
> >>----------
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 29 April 2016 at 21:56
> >>To: [email protected]
> >>
> >>
> >>I'm not sure whether the first solution would work. Converting between
> >>the two styles is simple enough--recursively add or remove the reference
> >>count of the parent from all of its descendents--but the overall
> >>procedure is kind of complicated, and I still don't know exactly how the
> >>stack is used all of the time.
> >>
> >>I don't think that nouns are ever copied while on the stack, just used
> >>without regard for their reference count. So the second method is the
> >>same as mine, but mine combines the two counts.
> >>
> >>I assume you mean AFLAG for the field. There are 17 extra flags, defined
> >>only for verbs, further on in jtype.h ("type V flag values"). These end
> >>at 2^24, leaving the top 32 bits untouched. But given how quickly we
> >>seem to be proposing new flags, it may not be a good idea to use those.
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 30 April 2016 at 04:34
> >>To: [email protected]
> >>
> >>
> >>[I thought the V flags were for the flag field of the V struct, not the A
> >>struct.]
> >>
> >>I have better ideas about this problem now.
> >>
> >>First, though, let me say this.  When there were still problems after you
> >>put in the nonrecursive use-counts, and you suspected some code that relied
> >>on the old methods: that's almost a fatal problem, isn't it?  Because we
> >>can't hope to find all the old code, and would never want to release until
> >>we were sure we had.  We need a design that guarantees compatibility.
> >>
> >>
> >>
> >>Here's what you have learned: execution freely allocates blocks, putting
> >>them on the stack, and combines them promiscuously, producing objects, one
> >>of which becomes the result z for use in EPILOG(z).  The objects produced
> >>by execution all have the old recursive use-counts, and thus must be freed
> >>using the recursive method.
> >>
> >>Yet we would very much like objects to use nonrecursive use-counts.  This
> >>suggests the following design:
> >>
> >>Blocks are either nonrecursive or semirecursive and flagged to indicate
> >>which kind they are.  The use-count of a nonrecursive block has not fully
> >>been included in the use-counts of its descendants, and need not be, while
> >>the use-count of a semirecursive block has been propagated to descendants.
> >>
> >>The descendants of a nonrecursive block are all nonrecursive.  The
> >>descendants of a semirecursive block can be either type.
> >>Incrementing/decrementing the use count involves traversing the tree, but
> >>the traversal can be terminated when a nonrecursive block is encountered.
> >>
> >>Nonrecursive blocks are created ONLY by EPILOG(z).  After the stack has
> >>been freed, the result z is traversed to transform it from semirecursive to
> >>nonrecursive.
> >>
> >>Important lemma (please check!): No block pointed to by the stack will ever
> >>be contained in a nonrecursive block. Proof: All blocks are allocated as
> >>semirecursive, and execution only combines these into larger semirecursive
> >>aggregates. (replacing-in-place is not implemented for indirect data types
> >>- yet).
> >>
> >>
> >>
> >>That's the design.  It's saying that before EPILOG, anything goes.
> >>Whatever the Old Code produces is OK.  At EPILOG() time, the surviving z is
> >>then put into efficient nonrecursive form, after the stack has been taken
> >>care of.
> >>
> >>Examples:
> >>
> >>J sentence:   a (; <@(+/)"1) b
> >>
> >><@(+/)"1 will run on b to produce its result, in semirecursive form.
> >>EPILOG will turn this into the nonrecursive result, call it c.
> >>
> >>a starts out nonrecursive (if it is an indirect type), because it is the
> >>result of some earlier EPILOG.
> >>
> >>a ; c   will join the two nonrecursive values into a semirecursive result.
> >>But traversing that result during EPILOG will be fast, because the
> >>traversal may stop after one level of recursion.
> >>
> >>J sentence: ({. a) ; 2 {:: b
> >>
> >>(2 {:: b) selects the subtree of b (and presumably increments its use
> >>count, but that's in the Old Code and we don't have to worry about it).
> >>This subtree will most likely be nonrecursive. Depending on the type of a,
> >>the result of ({. a) is either a new semirecursive block or the address of
> >>the nonrecursive subtree of a.  These are joined by ; into a semirecursive
> >>result, which again can be traversed quickly.  The key point is that no
> >>block on the stack can ever appear inside a nonrecursive block and thus
> >>cannot cause trouble when the stack is freed.
> >>
> >>
> >>
> >>This design needs a flag NONRECURSIVE, but it does not need the ONSTACK
> >>flag.
> >>
> >>Henry
> >>
> >>----------
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 30 April 2016 at 16:43
> >>To: [email protected]
> >>
> >>
> >>I think this is the scheme I was grasping for. The recursive flag (I
> >>called it AFREC) is just a better name for the stack flag--in fact
> >>objects are returned to the stack after being pulled off in EPILOG, so
> >>that name is wrong. The major change is realizing that ra and fa (copy
> >>and delete copy) should work differently (recursively) on objects with
> >>the flag set.
> >>
> >>It seems to work, but there are problems with the end of jtxdefn, which
> >>has its own variant of EPILOG (jtxdefn is the only such function, as I
> >>have verified by checking for uses of tpush outside m.c). This epilog
> >>frees the symbol table for local variables, but it seems this is done
> >>incorrectly as it causes invalid reads later on. It shouldn't take too
> >>long to fix.
> >>
> >>And you're right about the V flags versus A flags. The perils of untyped
> >>data...
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 30 April 2016 at 16:57
> >>To: [email protected]
> >>
> >>
> >>That sounds promising.  I think we're in agreement.
> >>
> >>Does EPILOG(z) put anything back onto the stack except for z?  [I'm just
> >>trying to make sure I keep up with how it works (but trying to avoid
> >>looking through the code).]  If it does, do you have to do anything
> >>different when z is eventually freed? Not that there's a problem, since you
> >>have the recursive flag to tell you what to do.
> >>
> >>I guess I'd better understand this code if I'm going to talk about it.  I
> >>would look in PROLOG, EPILOG, and where else to see how the stack is used?
> >>
> >>
> >>If we wanted to save 8 bytes in the header, we could move those A flags
> >>into the low bits of the use count.  Not now, but someday...
> >>
> >>
> >>If we couldn't deal with untyped data, we wouldn't be working in J, would
> >>we?
> >>
> >>Henry
> >>
> >>----------
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 30 April 2016 at 17:22
> >>To: [email protected]
> >>
> >>
> >>EPILOG(z) pushes z onto the stack (with tpush). Currently, this is
> >>recursive, so it pushes all of z's descendents as well. In the new
> >>paradigm, it isn't, and only z is added.
> >>
> >>PROLOG and EPILOG are very simple:
> >>#define PROLOG          I _ttop=jt->tbase+jt->ttop
> >>#define EPILOG(z)       R gc(z,_ttop)
> >>A jtgc (J jt,A w,I old){ra(w); tpop(old); R tpush(w);}
> >>
> >>where ra copies w and tpush(w) adds it to the stack. tpop(old) pops
> >>(with fr, which is not recursive) all the items after index old from the
> >>stack. In the new version I have replace tpush with a different function
> >>that derecursivinates the argument, then pushes it.
> >>
> >>Since I've written out most of it, I will probably make my comments on
> >>memory management into a complete guide and post it somewhere soon.
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 1 May 2016 at 07:33
> >>To: [email protected]
> >>
> >>
> >>I think this works, but I believe my lemma from last post is flawed (more
> >>below).  First, some advice _de profundis_.
> >>
> >>I urge you to write out your comments right now, before writing any (more)
> >>code, as a way to prove the design before coding.  This should take the
> >>form of a short paper, included as commentary in the code, in which you
> >>describe the principles of the design, the actions that can be performed,
> >>and the assertions that can be made about the state of the system at
> >>various times.  Then, give a passably rigorous proof of the assertions.
> >>
> >>This is a good way to design, and I have the scars to prove it.  Many times
> >>I have had to tear up a design because the attempt to formally prove
> >>correctness turned up flaws.  Not only will you save coding time, you will
> >>learn where you need to spend your testing effort.
> >>
> >>Of course, all who come after you will be grateful for the commentary.  But
> >>you yourself will get your greatest benefit from it the earlier you write
> >>it.  Post it here by all means when it's done, but makes sure it gets into
> >>the code.
> >>
> >>
> >>Now, about those memory blocks.  Here's my current understanding.
> >>
> >>All allocated blocks bear the curse of Adam: burdened with Original Sin,
> >>the time of their doom is marked on the stack at the moment of their
> >>birth.  Their use-count is 1, but that is borrowed time.  After their
> >>threescore and ten, they return, dust to dust, to the memory pool.
> >>
> >>All, that is, except one elect from each generation.  This virtuous block,
> >>z, is selected to carry the faith to the next generation.  The use-counts
> >>of z are incremented, expunging the Original Sin, and z survives the
> >>reckoning.  But only temporarily: z is then put back onto the stack,
> >>reinstating the Original Sin, and z's added life is possibly for just one
> >>generation, possibly for more, according to the will of the Lord.
> >>
> >>When z is put back onto the stack, it is as a born-again nonrecursive
> >>block, in which the top block of z answers for all its descendants.
> >>
> >>Here is the question: are we sure z is up to the task?  I thought last time
> >>I had proved Yes, that z could not contain blocks that are on the stack
> >>slated for destruction.  But now I see that that is false: you could have
> >>this scenario:
> >>
> >>Call level 1: PROLOG; create block A; pass block A into call level 2
> >>Call level 2: PROLOG; create block B, an indirect block that points to A;
> >>EPILOG(B)
> >>
> >>The stack now points to A and B; B is marked nonrecursive, but what about A?
> >>
> >>I think the answer is that it has to be OK, because Call level 2 must have
> >>incremented the use-count of A when it incorporated A into B.  That is, if
> >>it didn't the system wouldn't work to begin with, so we don't have to be
> >>concerned with the details of how that happens.
> >>
> >>This suggests a corollary concerning the nonrecursive flag: it indicates
> >>'this block has been through EPILOG and had its use-counts incremented.'
> >>That inoculates it against having itself OR ANY COMPONENT destroyed before
> >>its time.
> >>
> >>This scenario imposes a requirement on the derecursivinative procedure: it
> >>must not make any attempt to collect use-counts and forward them to higher
> >>levels.  One might think it clever to notice that all the use-counts of the
> >>children of a node are 2, and pull one use-count into the parent.  It
> >>_would_ be clever, except that it would leave the child exposed to
> >>destruction from the stack or destruction of a name.
> >>
> >>[You should include the discussion of the previous paragraph, including the
> >>issues, decisions, and alternatives, in your design document.  It will help
> >>readers greatly to understand the issues involved.  Also include a proof of
> >>the OR ANY COMPONENT statement two paragraphs back.]
> >>
> >>Henry
> >>
> >>----------
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 5 May 2016 at 12:02
> >>To: [email protected]
> >>
> >>
> >>The problems I thought were due to the not-EPILOG in jtxdefn were
> >>actually a result of a change I made to debug: reducing PLIM (and PLIML,
> >>its logarithm) in m.c so that more objects would be allocated on the C
> >>stack rather than J's pool. This makes more memory leaks detectable with
> >>a tool like valgrind, since leaked objects in the pool are still seen as
> >>reachable. Unfortunately and for reasons I don't understand, it also
> >>introduces a number of bugs.
> >>
> >>The new memory scheme now passes a lot of tests, but fails a few.
> >>Continuing investigation.
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Marshall Lochbaum* <[email protected]>
> >>Date: 5 May 2016 at 13:52
> >>To: [email protected]
> >>
> >>
> >>Here's the formal(ish) proof that managing memory derecursivinatively
> >>doesn't break anything that wasn't already broken. There are two
> >>assumptions about current the J behavior, which are collected at the
> >>end.
> >>
> >>Definition:
> >>A referrer is either a J object or the stack. An object references
> >>another object if that object would be visited by a call to traverse
> >>(with a non-recursive function as its argument). The stack is a list of
> >>pointers; it references each object it points to. A referrer may
> >>reference an object multiple times if it is visited multiple times by
> >>traverse or appears multiple times in the stack. The list of objects it
> >>refers to are its referents.
> >>
> >>Code specification:
> >>Every object is created with AFREC set and reference count zero. Calling
> >>EPILOG(z) copies z, then pops from the stack, then calls derec(z,0).
> >>Calling derec(w,n), where w's reference count is initially m, decrements
> >>w's reference count by n. Then, if w has AFREC set, it calls
> >>derec(z,m-1) on each object referred to by z and removes the flag.
> >>There is no other way to remove the AFREC flag, and no way to add the
> >>AFREC flag at all.
> >>
> >>derec(z,0) is equivalent to derec1(z), which uses more traversals:
> >>derec1(z) does nothing if z does not have AFREC set, and otherwise
> >>decrements the counts of each of z's descendants by one less than z's
> >>reference count, then recurses.
> >>
> >>Invariant 1:
> >>At all times except during calls to derec, an object z that refers to an
> >>object marked AFREC is also marked AFREC. Equivalently, descendants of
> >>non-recursive objects are never recursive.
> >>Assume: Descendants are added only to objects not marked AFREC.
> >>Proof: When z is created, it has no descendants, so the property holds.
> >>Three events can potentially falsify the invariant:
> >>- A new descendant is added to z. If z does not have AFREC set, then
> >>   descendants cannot be added to it, by the assumption. If it does have
> >>   AFREC set, the invariant holds regardless.
> >>- The AFREC flag is added to a descendant of z. This is impossible.
> >>- The AFREC flag is removed from z--that is, derec is called on z, and
> >>   AFREC is set for z. In this case, derec removes the flag and recurses.
> >>   But then the flag will be removed for each descendant as well, and the
> >>   invariant still holds.
> >>
> >>Invariant 2:
> >>The reference count of an object is the sum over all referrers (with
> >>multiplicity) of either 1, if the referrer is the stack or an object not
> >>marked AFREC, or that object's reference count, if the referrer is an
> >>object marked AFREC.
> >>Except during the execution of memory functions, this invariant holds
> >>- At all times for objects not marked AFREC, and
> >>- When EPILOG(z) is called for objects z marked AFREC and their
> >>   descendants.
> >>We assume the invariant in the second case. If not, the old system was
> >>broken.
> >>We now show that derec1(z), equivalent to the main work of EPILOG
> >>derec(z,0), does not falsify the invariant. We omit the proof that other
> >>memory operations (ra, fa, tpop) satisfy it, as it is simple in each
> >>case.
> >>- If AFREC is not set, then derec1(z) does nothing by definition.
> >>- If AFREC is set, then derec1(z) removes the AFREC flag from z, which
> >>   decreases the corresponding terms in the reference formula from m to
> >>   1, where m is the reference count of z. It also decrements the counts
> >>   of z's descendants, to the same effect. Note that the formula counts
> >>   referrers with multiplicity, and traverse does the same. Then derec1
> >>   recurses, which maintains our invariant proveded the rest of the
> >>   function does as well.
> >>
> >>
> >>So, our assumptions about J's code base are:
> >>
> >>1. Descendants are only added to objects marked AFREC.
> >>This could be violated by some things like in-place amend. If it is, it
> >>will break invariant 1, which is easy to test for in derec by doing a
> >>full recurse. The fix is to make sure each addition calls derec on the
> >>added objects.
> >>
> >>2. When EPILOG(z) is called, z and its descendants obey the reference
> >>    count formula.
> >>Mostly, this depends on J working already. The only way the changes can
> >>break it is if a function does strange memory stuff (not fa or ra) to an
> >>object not marked AFREC. But that's not really out of the question.
> >>
> >>There are about 40 failing tests right now, and they're probably due to
> >>specific functions that break one of the invariants. Hopefully there are
> >>a lot fewer than 40 functions causing these failures.
> >>
> >>Marshall
> >>
> >>----------
> >>From: *Henry Rich* <[email protected]>
> >>Date: 10 May 2016 at 16:28
> >>To: [email protected]
> >>
> >>
> >>Just starting to look at this.
> >>
> >>Creating a block with AFREC set and reference count 0 may be trouble.  I
> >>see in places where the code checks for AC(w)>1 to indicate that a block
> >>has been put into a boxed structure.  That would not work in the new system.
> >>
> >>My suggestion: put AFREC in the low bit of AC, and go through all
> >>occurrences of AC, replacing them with a macro to extract the right value
> >>from AC-plus-flag.
> >>
> >>Henry
> >>----------------------------------------------------------------------
> >>For information about J forums see http://www.jsoftware.com/forums.htm
> >----------------------------------------------------------------------
> >For information about J forums see http://www.jsoftware.com/forums.htm
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jsource] Reference counts (and question for Roger)

Reply via email to