Oops, my mistake - I confused m and n. Thanks,
-- Raul On Thu, May 12, 2016 at 10:12 AM, Marshall Lochbaum <[email protected]> wrote: > Huh? Initially the argument to derec is zero. But for each other call, > it is m-1, where m is the reference count of some object. Reference > counts have to be positive (when an object's reference count hits zero, > it is destroyed), so m-1 is non-negative. > > Marshall > > On Thu, May 12, 2016 at 05:21:31AM -0400, Raul Miller wrote: >> Does the "derec" argument pattern make sense? >> >> As currently defined here "Calling derec(w,n), where w's reference >> count is initially m, decrements >> w's reference count by n. Then, if w has AFREC set, it calls >> derec(z,m-1) on each object referred to by z and removes the flag." >> >> But in the examples here, n is always 0 or negative. >> >> Wouldn't it be more understandable to instead say "Calling derec(w,n), >> where w's reference count is initially m, increments >> w's reference count by n. Then, if w has AFREC set, it calls >> derec(z,m+1) on each object referred to by z and removes the flag." >> (and then make the same changes everywhere to be consistent)? >> >> If this does not make sense then something important is missing from >> this description. >> >> Thanks, >> >> -- >> Raul >> >> >> On Wed, May 11, 2016 at 10:05 PM, chris burke <[email protected]> wrote: >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 29 April 2016 at 09:51 >> > To: [email protected] >> > >> > >> > I am currently working on changing J's current reference counting scheme >> > to a much more efficient and standard method. >> > >> > J stores a reference count in every object (AC(w)) indicating the number >> > of times it is referenced. The count is incremented when a copy is made >> > and decremented when one is deleted. Once the reference count drops to >> > zero, J knows the object can be freed, and does so. >> > >> > Currently, J uses an unusual recursive procedure for reference counts. >> > If a boxed array (or verb with arguments, or other object with children) >> > is copied, then the reference counts of all its children (items in >> > boxes) are incremented, and so on recursively. This makes copies of >> > composite objects take a long time. >> > >> > The typical procedure is to treat the parent/child relationship as a >> > single reference. Thus the reference counts of children are incremented >> > when they are added to the parent, and decremented when the parent is >> > deleted, and otherwise not changed. To copy a composite object, simply >> > increment its reference count, and to delete, decrement it. I am >> > currently trying to convince the J engine to use this procedure. >> > >> > The hangup is another idiosyncratic J feature, which in contrast to the >> > weird reference count scheme is actually useful. J maintains a stack of >> > local nouns, whose function (but not implementation) is equivalent to >> > that of C's stack. Any time an object is created, a pointer to it is >> > pushed to the stack. Thus J functions can ignore all the reference count >> > arithmetic (which from experience I know is quite painful) while using >> > objects. Instead, they call PROLOG, which saves the location of the >> > beginning of the stack, before doing anything, and EPILOG(z), where z is >> > the value to be returned, after everything. EPILOG copies z, deletes >> > everything added to the stack since PROLOG was called, and returns z. >> > >> > A complication arises if z is a composite object. In this case we want >> > to avoid deleting any of z's children, despite the fact that they may be >> > on the stack. In the current paradigm this is automatic: on copying z, >> > all of its children are also copied, and deleting the copies on the >> > stack leaves those copies. But if incrementing the reference count is >> > not recursive, this doesn't happen, and the children are deleted, >> > leaving z with dangling pointers. >> > >> > In a sense, the problem arises because when verbs connect z to its >> > children, they don't bother to inform the children. The reference from z >> > to its children cannot be reflected in their reference counts. So the >> > correct fix might be to increment the reference counts as they are added >> > to z. This is not feasible: the places where children are added to >> > objects just look like assignments, and are impossible to detect >> > automatically. >> > >> > There is a workaround: EPILOG should do the pointer-incrementing job for >> > us. So rather than a simple copy of z, it actually goes through all of >> > z's children and increments pointers. Specifically, it should increment >> > the descendents which are children of objects on the stack. This >> > requires an extra flag: AFSTK, which indicates whether an object is on >> > the stack. >> > >> > I have implemented all the above, and am still getting bugs. I think >> > that I have made all the necessary architecture changes, and that the >> > remainder is just a few spots that relied in some way on the old system, >> > but it's impossible to be sure how much work there is left to do. I will >> > update on future progress. >> > >> > Marshall >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 29 April 2016 at 12:40 >> > To: [email protected] >> > >> > >> > Your workaround seems sound. Let me know what you find the problems are. >> > >> > To see if I understand the situation, would this work: keep a long long >> > counter of PROLOGs, giving each PROLOG a unique number; store the current >> > number in the block header when a block is created. Then at EPILOG, >> > increment use-counts in the children of Z that have the stored value >> > matching the current number. This would add 8 bytes to the header, so I >> > don't like it, but it would keep you from having to match children-of-z >> > with children-of-the-noun-stack. >> > >> > Henry >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 29 April 2016 at 17:18 >> > To: [email protected] >> > >> > >> > Is one way to think of this problem that the use-counts in z need to be the >> > old recursive style, so that they can be freed by the epilogue? >> > >> > If so, a solution would be to traverse z, converting its use-counts to >> > recursive style (and incrementing - I guess that's what you referred to as >> > 'copying'); decrement them in recursive style; then traverse again, >> > converting back to nonrecursive style. >> > >> > Would that work? >> > >> > If so, a more elegant solution would be to have two use counts: a count >> > that has yet to be propagated to descendants, and a count that has been >> > propagated. The use-count routines would handle these correctly. EPILOG >> > would traverse to push unpropagated use-count to descendants, and then free >> > the blocks as before. The conversion back to nonrecursive style would not >> > be required, because both styles would coexist. >> > >> > Implementation: I have my eye on the AF() field, which seems to have just 4 >> > bits assigned - is that right? Reserve 6 bits for flags, and use the upper >> > bits for the unpropagated usecount. At the same time, make the >> > corresponding change to AC(), leaving 6 bits of flags and the rest >> > propagated usecount. >> > >> > >> > Henry >> > >> > >> > On 4/29/2016 12:51 PM, Marshall Lochbaum wrote: >> > >> > ---------- >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 29 April 2016 at 21:56 >> > To: [email protected] >> > >> > >> > I'm not sure whether the first solution would work. Converting between >> > the two styles is simple enough--recursively add or remove the reference >> > count of the parent from all of its descendents--but the overall >> > procedure is kind of complicated, and I still don't know exactly how the >> > stack is used all of the time. >> > >> > I don't think that nouns are ever copied while on the stack, just used >> > without regard for their reference count. So the second method is the >> > same as mine, but mine combines the two counts. >> > >> > I assume you mean AFLAG for the field. There are 17 extra flags, defined >> > only for verbs, further on in jtype.h ("type V flag values"). These end >> > at 2^24, leaving the top 32 bits untouched. But given how quickly we >> > seem to be proposing new flags, it may not be a good idea to use those. >> > >> > Marshall >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 30 April 2016 at 04:34 >> > To: [email protected] >> > >> > >> > [I thought the V flags were for the flag field of the V struct, not the A >> > struct.] >> > >> > I have better ideas about this problem now. >> > >> > First, though, let me say this. When there were still problems after you >> > put in the nonrecursive use-counts, and you suspected some code that relied >> > on the old methods: that's almost a fatal problem, isn't it? Because we >> > can't hope to find all the old code, and would never want to release until >> > we were sure we had. We need a design that guarantees compatibility. >> > >> > >> > >> > Here's what you have learned: execution freely allocates blocks, putting >> > them on the stack, and combines them promiscuously, producing objects, one >> > of which becomes the result z for use in EPILOG(z). The objects produced >> > by execution all have the old recursive use-counts, and thus must be freed >> > using the recursive method. >> > >> > Yet we would very much like objects to use nonrecursive use-counts. This >> > suggests the following design: >> > >> > Blocks are either nonrecursive or semirecursive and flagged to indicate >> > which kind they are. The use-count of a nonrecursive block has not fully >> > been included in the use-counts of its descendants, and need not be, while >> > the use-count of a semirecursive block has been propagated to descendants. >> > >> > The descendants of a nonrecursive block are all nonrecursive. The >> > descendants of a semirecursive block can be either type. >> > Incrementing/decrementing the use count involves traversing the tree, but >> > the traversal can be terminated when a nonrecursive block is encountered. >> > >> > Nonrecursive blocks are created ONLY by EPILOG(z). After the stack has >> > been freed, the result z is traversed to transform it from semirecursive to >> > nonrecursive. >> > >> > Important lemma (please check!): No block pointed to by the stack will ever >> > be contained in a nonrecursive block. Proof: All blocks are allocated as >> > semirecursive, and execution only combines these into larger semirecursive >> > aggregates. (replacing-in-place is not implemented for indirect data types >> > - yet). >> > >> > >> > >> > That's the design. It's saying that before EPILOG, anything goes. >> > Whatever the Old Code produces is OK. At EPILOG() time, the surviving z is >> > then put into efficient nonrecursive form, after the stack has been taken >> > care of. >> > >> > Examples: >> > >> > J sentence: a (; <@(+/)"1) b >> > >> > <@(+/)"1 will run on b to produce its result, in semirecursive form. >> > EPILOG will turn this into the nonrecursive result, call it c. >> > >> > a starts out nonrecursive (if it is an indirect type), because it is the >> > result of some earlier EPILOG. >> > >> > a ; c will join the two nonrecursive values into a semirecursive result. >> > But traversing that result during EPILOG will be fast, because the >> > traversal may stop after one level of recursion. >> > >> > J sentence: ({. a) ; 2 {:: b >> > >> > (2 {:: b) selects the subtree of b (and presumably increments its use >> > count, but that's in the Old Code and we don't have to worry about it). >> > This subtree will most likely be nonrecursive. Depending on the type of a, >> > the result of ({. a) is either a new semirecursive block or the address of >> > the nonrecursive subtree of a. These are joined by ; into a semirecursive >> > result, which again can be traversed quickly. The key point is that no >> > block on the stack can ever appear inside a nonrecursive block and thus >> > cannot cause trouble when the stack is freed. >> > >> > >> > >> > This design needs a flag NONRECURSIVE, but it does not need the ONSTACK >> > flag. >> > >> > Henry >> > >> > ---------- >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 30 April 2016 at 16:43 >> > To: [email protected] >> > >> > >> > I think this is the scheme I was grasping for. The recursive flag (I >> > called it AFREC) is just a better name for the stack flag--in fact >> > objects are returned to the stack after being pulled off in EPILOG, so >> > that name is wrong. The major change is realizing that ra and fa (copy >> > and delete copy) should work differently (recursively) on objects with >> > the flag set. >> > >> > It seems to work, but there are problems with the end of jtxdefn, which >> > has its own variant of EPILOG (jtxdefn is the only such function, as I >> > have verified by checking for uses of tpush outside m.c). This epilog >> > frees the symbol table for local variables, but it seems this is done >> > incorrectly as it causes invalid reads later on. It shouldn't take too >> > long to fix. >> > >> > And you're right about the V flags versus A flags. The perils of untyped >> > data... >> > >> > Marshall >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 30 April 2016 at 16:57 >> > To: [email protected] >> > >> > >> > That sounds promising. I think we're in agreement. >> > >> > Does EPILOG(z) put anything back onto the stack except for z? [I'm just >> > trying to make sure I keep up with how it works (but trying to avoid >> > looking through the code).] If it does, do you have to do anything >> > different when z is eventually freed? Not that there's a problem, since you >> > have the recursive flag to tell you what to do. >> > >> > I guess I'd better understand this code if I'm going to talk about it. I >> > would look in PROLOG, EPILOG, and where else to see how the stack is used? >> > >> > >> > If we wanted to save 8 bytes in the header, we could move those A flags >> > into the low bits of the use count. Not now, but someday... >> > >> > >> > If we couldn't deal with untyped data, we wouldn't be working in J, would >> > we? >> > >> > Henry >> > >> > ---------- >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 30 April 2016 at 17:22 >> > To: [email protected] >> > >> > >> > EPILOG(z) pushes z onto the stack (with tpush). Currently, this is >> > recursive, so it pushes all of z's descendents as well. In the new >> > paradigm, it isn't, and only z is added. >> > >> > PROLOG and EPILOG are very simple: >> > #define PROLOG I _ttop=jt->tbase+jt->ttop >> > #define EPILOG(z) R gc(z,_ttop) >> > A jtgc (J jt,A w,I old){ra(w); tpop(old); R tpush(w);} >> > >> > where ra copies w and tpush(w) adds it to the stack. tpop(old) pops >> > (with fr, which is not recursive) all the items after index old from the >> > stack. In the new version I have replace tpush with a different function >> > that derecursivinates the argument, then pushes it. >> > >> > Since I've written out most of it, I will probably make my comments on >> > memory management into a complete guide and post it somewhere soon. >> > >> > Marshall >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 1 May 2016 at 07:33 >> > To: [email protected] >> > >> > >> > I think this works, but I believe my lemma from last post is flawed (more >> > below). First, some advice _de profundis_. >> > >> > I urge you to write out your comments right now, before writing any (more) >> > code, as a way to prove the design before coding. This should take the >> > form of a short paper, included as commentary in the code, in which you >> > describe the principles of the design, the actions that can be performed, >> > and the assertions that can be made about the state of the system at >> > various times. Then, give a passably rigorous proof of the assertions. >> > >> > This is a good way to design, and I have the scars to prove it. Many times >> > I have had to tear up a design because the attempt to formally prove >> > correctness turned up flaws. Not only will you save coding time, you will >> > learn where you need to spend your testing effort. >> > >> > Of course, all who come after you will be grateful for the commentary. But >> > you yourself will get your greatest benefit from it the earlier you write >> > it. Post it here by all means when it's done, but makes sure it gets into >> > the code. >> > >> > >> > Now, about those memory blocks. Here's my current understanding. >> > >> > All allocated blocks bear the curse of Adam: burdened with Original Sin, >> > the time of their doom is marked on the stack at the moment of their >> > birth. Their use-count is 1, but that is borrowed time. After their >> > threescore and ten, they return, dust to dust, to the memory pool. >> > >> > All, that is, except one elect from each generation. This virtuous block, >> > z, is selected to carry the faith to the next generation. The use-counts >> > of z are incremented, expunging the Original Sin, and z survives the >> > reckoning. But only temporarily: z is then put back onto the stack, >> > reinstating the Original Sin, and z's added life is possibly for just one >> > generation, possibly for more, according to the will of the Lord. >> > >> > When z is put back onto the stack, it is as a born-again nonrecursive >> > block, in which the top block of z answers for all its descendants. >> > >> > Here is the question: are we sure z is up to the task? I thought last time >> > I had proved Yes, that z could not contain blocks that are on the stack >> > slated for destruction. But now I see that that is false: you could have >> > this scenario: >> > >> > Call level 1: PROLOG; create block A; pass block A into call level 2 >> > Call level 2: PROLOG; create block B, an indirect block that points to A; >> > EPILOG(B) >> > >> > The stack now points to A and B; B is marked nonrecursive, but what about >> > A? >> > >> > I think the answer is that it has to be OK, because Call level 2 must have >> > incremented the use-count of A when it incorporated A into B. That is, if >> > it didn't the system wouldn't work to begin with, so we don't have to be >> > concerned with the details of how that happens. >> > >> > This suggests a corollary concerning the nonrecursive flag: it indicates >> > 'this block has been through EPILOG and had its use-counts incremented.' >> > That inoculates it against having itself OR ANY COMPONENT destroyed before >> > its time. >> > >> > This scenario imposes a requirement on the derecursivinative procedure: it >> > must not make any attempt to collect use-counts and forward them to higher >> > levels. One might think it clever to notice that all the use-counts of the >> > children of a node are 2, and pull one use-count into the parent. It >> > _would_ be clever, except that it would leave the child exposed to >> > destruction from the stack or destruction of a name. >> > >> > [You should include the discussion of the previous paragraph, including the >> > issues, decisions, and alternatives, in your design document. It will help >> > readers greatly to understand the issues involved. Also include a proof of >> > the OR ANY COMPONENT statement two paragraphs back.] >> > >> > Henry >> > >> > ---------- >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 5 May 2016 at 12:02 >> > To: [email protected] >> > >> > >> > The problems I thought were due to the not-EPILOG in jtxdefn were >> > actually a result of a change I made to debug: reducing PLIM (and PLIML, >> > its logarithm) in m.c so that more objects would be allocated on the C >> > stack rather than J's pool. This makes more memory leaks detectable with >> > a tool like valgrind, since leaked objects in the pool are still seen as >> > reachable. Unfortunately and for reasons I don't understand, it also >> > introduces a number of bugs. >> > >> > The new memory scheme now passes a lot of tests, but fails a few. >> > Continuing investigation. >> > >> > Marshall >> > >> > ---------- >> > From: *Marshall Lochbaum* <[email protected]> >> > Date: 5 May 2016 at 13:52 >> > To: [email protected] >> > >> > >> > Here's the formal(ish) proof that managing memory derecursivinatively >> > doesn't break anything that wasn't already broken. There are two >> > assumptions about current the J behavior, which are collected at the >> > end. >> > >> > Definition: >> > A referrer is either a J object or the stack. An object references >> > another object if that object would be visited by a call to traverse >> > (with a non-recursive function as its argument). The stack is a list of >> > pointers; it references each object it points to. A referrer may >> > reference an object multiple times if it is visited multiple times by >> > traverse or appears multiple times in the stack. The list of objects it >> > refers to are its referents. >> > >> > Code specification: >> > Every object is created with AFREC set and reference count zero. Calling >> > EPILOG(z) copies z, then pops from the stack, then calls derec(z,0). >> > Calling derec(w,n), where w's reference count is initially m, decrements >> > w's reference count by n. Then, if w has AFREC set, it calls >> > derec(z,m-1) on each object referred to by z and removes the flag. >> > There is no other way to remove the AFREC flag, and no way to add the >> > AFREC flag at all. >> > >> > derec(z,0) is equivalent to derec1(z), which uses more traversals: >> > derec1(z) does nothing if z does not have AFREC set, and otherwise >> > decrements the counts of each of z's descendants by one less than z's >> > reference count, then recurses. >> > >> > Invariant 1: >> > At all times except during calls to derec, an object z that refers to an >> > object marked AFREC is also marked AFREC. Equivalently, descendants of >> > non-recursive objects are never recursive. >> > Assume: Descendants are added only to objects not marked AFREC. >> > Proof: When z is created, it has no descendants, so the property holds. >> > Three events can potentially falsify the invariant: >> > - A new descendant is added to z. If z does not have AFREC set, then >> > descendants cannot be added to it, by the assumption. If it does have >> > AFREC set, the invariant holds regardless. >> > - The AFREC flag is added to a descendant of z. This is impossible. >> > - The AFREC flag is removed from z--that is, derec is called on z, and >> > AFREC is set for z. In this case, derec removes the flag and recurses. >> > But then the flag will be removed for each descendant as well, and the >> > invariant still holds. >> > >> > Invariant 2: >> > The reference count of an object is the sum over all referrers (with >> > multiplicity) of either 1, if the referrer is the stack or an object not >> > marked AFREC, or that object's reference count, if the referrer is an >> > object marked AFREC. >> > Except during the execution of memory functions, this invariant holds >> > - At all times for objects not marked AFREC, and >> > - When EPILOG(z) is called for objects z marked AFREC and their >> > descendants. >> > We assume the invariant in the second case. If not, the old system was >> > broken. >> > We now show that derec1(z), equivalent to the main work of EPILOG >> > derec(z,0), does not falsify the invariant. We omit the proof that other >> > memory operations (ra, fa, tpop) satisfy it, as it is simple in each >> > case. >> > - If AFREC is not set, then derec1(z) does nothing by definition. >> > - If AFREC is set, then derec1(z) removes the AFREC flag from z, which >> > decreases the corresponding terms in the reference formula from m to >> > 1, where m is the reference count of z. It also decrements the counts >> > of z's descendants, to the same effect. Note that the formula counts >> > referrers with multiplicity, and traverse does the same. Then derec1 >> > recurses, which maintains our invariant proveded the rest of the >> > function does as well. >> > >> > >> > So, our assumptions about J's code base are: >> > >> > 1. Descendants are only added to objects marked AFREC. >> > This could be violated by some things like in-place amend. If it is, it >> > will break invariant 1, which is easy to test for in derec by doing a >> > full recurse. The fix is to make sure each addition calls derec on the >> > added objects. >> > >> > 2. When EPILOG(z) is called, z and its descendants obey the reference >> > count formula. >> > Mostly, this depends on J working already. The only way the changes can >> > break it is if a function does strange memory stuff (not fa or ra) to an >> > object not marked AFREC. But that's not really out of the question. >> > >> > There are about 40 failing tests right now, and they're probably due to >> > specific functions that break one of the invariants. Hopefully there are >> > a lot fewer than 40 functions causing these failures. >> > >> > Marshall >> > >> > ---------- >> > From: *Henry Rich* <[email protected]> >> > Date: 10 May 2016 at 16:28 >> > To: [email protected] >> > >> > >> > Just starting to look at this. >> > >> > Creating a block with AFREC set and reference count 0 may be trouble. I >> > see in places where the code checks for AC(w)>1 to indicate that a block >> > has been put into a boxed structure. That would not work in the new >> > system. >> > >> > My suggestion: put AFREC in the low bit of AC, and go through all >> > occurrences of AC, replacing them with a macro to extract the right value >> > from AC-plus-flag. >> > >> > Henry >> > ---------------------------------------------------------------------- >> > For information about J forums see http://www.jsoftware.com/forums.htm >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
