Re: [Jsource] Success!

Henry Rich Wed, 25 May 2016 18:27:12 -0700

I would like to have a Guide For Coders giving the rules for using the
memory system to perform specific tasks, such as

1. Creating an array of boxes
2. Modifying an array of boxes
3. Including a preexisting box in an array of boxes
4. Creating/modifying/including other indirect forms, such as XINT and RAT

Now that it looks like this is going to work, we need to think about
deployment. We need to convince ourselves that there is no problem.
The debug mode I mentioned below seems to me the way to go - items 1 & 2
for sure. And we could also do

5. The memory system will keep a list of all the A blocks that are
active, and from time to time go through them, checking for
consistency. Each block has enough flags in it to tell what kind of
block it is; the audit would ensure that pointers to A blocks point only
to valid A blocks. This test would not be 100% thorough, since some
fields have variable use, but enough would be checkable to give us a
real feeling of security. In particular, it would check boxed arrays.
A B-tree would be the best data structure for the block list, I reckon.
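
A minimal sketch of the audit in item 5, under stated assumptions: the
Blk header, the KIND_* tags, and all function names are hypothetical
stand-ins, not J's real A block layout, and a sorted array searched with
bsearch stands in for the suggested B-tree. It checks only that each
block carries a recognized kind and that every child pointer of a boxed
block refers to a registered live block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical block header: a kind tag, a child count, child pointers. */
enum { KIND_FLAT = 1, KIND_BOXED = 2 };

typedef struct Blk {
    int kind;            /* KIND_FLAT or KIND_BOXED */
    size_t n;            /* number of children (boxed blocks only) */
    struct Blk **kids;   /* children (boxed blocks only) */
} Blk;

/* Order registry entries by address so the registry can be searched. */
static int cmp_addr(const void *p, const void *q) {
    uintptr_t a = (uintptr_t)*(Blk *const *)p;
    uintptr_t b = (uintptr_t)*(Blk *const *)q;
    return (a > b) - (a < b);
}

/* Nonzero if p is one of the n registered live blocks (reg must be sorted). */
static int is_live(Blk **reg, size_t n, Blk *p) {
    return bsearch(&p, reg, n, sizeof *reg, cmp_addr) != NULL;
}

/* Sort the registry, then check every block. Returns the number of
   problems found: unknown kinds plus child pointers to unregistered blocks. */
static size_t audit(Blk **reg, size_t n) {
    size_t bad = 0;
    qsort(reg, n, sizeof *reg, cmp_addr);
    for (size_t i = 0; i < n; i++) {
        Blk *b = reg[i];
        if (b->kind != KIND_FLAT && b->kind != KIND_BOXED) { bad++; continue; }
        if (b->kind != KIND_BOXED) continue;
        for (size_t k = 0; k < b->n; k++)
            if (!is_live(reg, n, b->kids[k])) bad++;
    }
    return bad;
}

/* Demo: a boxed block with a registered child passes the audit; pointing
   it at a block that was never registered is reported once. */
static size_t demo_audit(int corrupt) {
    Blk leaf = { KIND_FLAT, 0, NULL };
    Blk stray = { KIND_FLAT, 0, NULL };   /* deliberately not registered */
    Blk *kids[1] = { corrupt ? &stray : &leaf };
    Blk box = { KIND_BOXED, 1, kids };
    Blk *reg[2] = { &box, &leaf };
    return audit(reg, 2);
}
```

Calling demo_audit(0) audits a consistent pair of blocks, and
demo_audit(1) audits the same pair with the child pointer redirected to
an unregistered block.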

Would you create such a debug mode for the memory system? I can help if
you like.


Henry






On 5/25/2016 6:36 PM, Henry Rich wrote:

I haven't got into this in detail yet, but I am wondering what tools
we could have to verify correct operation. We need a debug mode for
the memory system. Maybe we have them already.

1. In debug mode the allocator should scramble the contents of a
memory block when it is freed.
2. In debug mode each freed-and-scrambled block should be checksummed
so that when it is next allocated it can be checked to see that there
were no stores to it.
3. Debug mode could alter the order of the queue of free blocks, to
change its FIFO/LIFO character.
4. How hard would it be to keep track of the number of bytes that
should be allocated? What can tie up memory long-term? I can think of
* assigned names
* preparsed scripts
* locales including their symbol tables
* 15!:3 and 15!:4
If the list is not too long, couldn't we keep track of how many bytes
should be assigned to each use, and do a consistency check from time
to time (at each entry to immex for sure, but more often if possible)?
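
Items 1 and 2 could look roughly like the sketch below. It is a toy
fixed-size pool, not J's allocator: DBlk, POISON, dbg_alloc, dbg_free
and the checksum function are all hypothetical names chosen for
illustration. Freeing a block fills it with a pattern and records a
checksum; the next allocation of that block recomputes the checksum and
counts a mismatch as a store to freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLKSIZE 32      /* payload bytes per block (toy value) */
#define NBLK    4       /* blocks in the toy pool */
#define POISON  0xA5    /* pattern written over a block when it is freed */

typedef struct DBlk {
    struct DBlk *next;              /* free-list link */
    uint32_t sum;                   /* checksum recorded at free time */
    unsigned char data[BLKSIZE];    /* payload handed to the caller */
} DBlk;

static DBlk pool[NBLK];
static DBlk *freelist;
static int corrupt_seen;    /* blocks found modified while they were free */

/* Rotate-and-xor checksum; any stable checksum would serve here. */
static uint32_t checksum(const unsigned char *p, size_t n) {
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++) s = ((s << 1) | (s >> 31)) ^ p[i];
    return s;
}

/* Item 1: scramble the payload on free, then record its checksum. */
static void dbg_free(void *p) {
    DBlk *b = (DBlk *)((unsigned char *)p - offsetof(DBlk, data));
    memset(b->data, POISON, BLKSIZE);
    b->sum = checksum(b->data, BLKSIZE);
    b->next = freelist;
    freelist = b;
}

/* Put every block on the free list in its scrambled state. */
static void dbg_init(void) {
    freelist = NULL;
    corrupt_seen = 0;
    for (int i = 0; i < NBLK; i++) dbg_free(pool[i].data);
}

/* Item 2: on allocation, verify nothing was stored while the block was free. */
static void *dbg_alloc(void) {
    DBlk *b = freelist;
    if (!b) return NULL;
    freelist = b->next;
    if (checksum(b->data, BLKSIZE) != b->sum) corrupt_seen++;
    return b->data;
}

/* Demo: a store through a stale pointer is caught when the block is next
   allocated. Returns the number of corrupted blocks detected. */
static int demo_use_after_free(int write_after_free) {
    dbg_init();
    unsigned char *p = dbg_alloc();
    p[0] = 1;                           /* legitimate use while allocated */
    dbg_free(p);
    if (write_after_free) p[3] = 7;     /* the kind of bug being hunted */
    (void)dbg_alloc();                  /* LIFO list returns the same block */
    return corrupt_seen;
}
```

The detection is delayed until the block is reused, which is why item 3
(perturbing the free-queue order) would change how soon such a store is
noticed.
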
Henry

On 5/25/2016 12:51 PM, Marshall Lochbaum wrote:
I assume you mean derecursivize (derec), not dereference. With the new
formula and tweak to derec, I think it is safe to derec an object at any
time. However, it is important to wait until EPILOG is called to do a
derec, since it isn't safe to add a descendant to a non-recursive object
and a J function can add a descendant at any time before EPILOG.

Here's the last example I gave with derec moved to the beginning rather
than the end. Since a and b have one reference, which isn't propagated,
before calling derec, they have zero propagated references and derec
actually adds one to the reference count of each descendant.

   initial
   |      |      |
   a 1 -> b 1 -> c 1

   derec(a)
   |      |      |
   a.1 -> b.2 -> c.2

   ra(a)
   |      |      |
   a.2 -> b.2 -> c.2

   tpop(old)
          |      |
   a.1 -> b.2 -> c.2

   tpush(a)
   |      |      |
   a.1 -> b.2 -> c.2

Here ra->tpop->tpush doesn't appear to do anything, but it actually
clears all of the unused objects from the stack, which is the entire
purpose of EPILOG. ra should be thought of as making a temporary copy of
a (so that a's reference count appears one too high after ra and tpop),
and tpush should be thought of as transferring that reference to the
stack.

This order doesn't use AFCLR at all, or rather AFCLR is on exactly when
AFREC is off. It may be that with this order we can get rid of AFCLR.
I'll have to think about whether that is always the case.

Marshall

On Wed, May 25, 2016 at 12:27:12PM -0400, Raul Miller wrote:
If I understand your essay here correctly (and please tell me if I do
not), the only time it is safe to dereference objects based on
reference counts is late in EPILOG?

(Or is the impact more subtle than that? Technically, all we need is a
guarantee that the reference count only falls to zero when it is safe
to recover its memory, and another "guarantee" that that will usually
happen in a reasonable time frame. And if the reference count is "too
high" or even "too low but not zero" for pending calculations in older
parts of the stack, we can sort of ignore that.... but that's as far
as I can see into this... and I might be overlooking some important
perspective.)

Thanks,

--
Raul



On Wed, May 25, 2016 at 12:06 PM, Marshall Lochbaum
<[email protected]> wrote:
After finding a major mismatch between the model I outlined and how J
actually works, I have fixed all remaining bugs with the new memory
model. The only work left now is to rewrite 7!:0 when not all objects
reside on the stack. Long explanation of the changes follows. They've
been pushed to the refcount branch again.

In my original proof, I assumed that

2. When EPILOG(z) is called, z and its descendants obey the reference
    count formula.

This is actually never true, except in the trivial case that z has no
descendants. Instead, the following holds:

2. After executing the funny series of operations in EPILOG that come
    before derec, z and those of its descendants popped by tpop obey the
    reference count formula.

Not as good. Let's look at this series of operations (best viewed in
monospace):

   initial state (b is a descendant of a)
   |      |
   a 1 -> b 1

   ra(a)
   |      |
   a 2 -> b 2

   tpop(old)

   a 1 -> b 1

   tpush(a)
   |
   a 1 -> b 2

   derec(a)
   |      |
   a.1 -> b.2

The vertical bars are references from the stack to objects. The number
next to each object is its reference count (with a dot for non-recursive
counts), and arrows between them are references. The reference count
formula states that the count of each object is equal to the sum, for
each referrer that refers to it, of 1 if that referrer is the stack or a
non-recursive object, and that referrer's reference count if it is a
recursive object. This formula is wrong until after tpush! Before tpop,
b's reference count is too high, and after it a's count is too high.

Here's a problem that occurred in the tests various times. It happens if
b and c are created in a function with no EPILOG, and then a is created
in a function with an EPILOG.

   initial (assume only a is above old on the stack)
   |      |      |
   a 1 -> b 1 -> c 1

   ra(a)
   |      |      |
   a 2 -> b 2 -> c 2

   tpop(old)
          |      |
   a 1 -> b 2 -> c 2

   tpush(a)
   |      |      |
   a 1 -> b 2 -> c 2

   derec(a)
   |      |      |
   a.1 -> b.2 -> c.1

Our formula is wrong all the way through! Because b is never popped from
the stack, its reference count is higher than 1 when tpush is called,
and this isn't reflected in c's reference count, which should be 3. So
when derec is called, it removes excess references from c which were
never there, and c's count ends up being one too low. Once b and c are
taken off the stack (say at the end of the current sentence), b's count
goes to 1 but c's goes to 0, and c is freed leaving a dangling
reference.
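
The arithmetic behind "should be 3" and "one too low" can be replayed
with a small checker. This is only a sketch: the Node struct and its
field names are hypothetical, and it encodes the original formula as
stated above (1 per stack reference, 1 for a non-recursive referrer, the
referrer's own count for a recursive one) for the chain a -> b -> c.

```c
#include <assert.h>

typedef struct {
    int count;    /* stored reference count */
    int rec;      /* 1 = recursive, 0 = non-recursive (dotted in diagrams) */
    int stack;    /* number of stack references (vertical bars) */
    int parent;   /* index of the one referring object, or -1 for none */
} Node;

/* Count predicted by the original formula for node i. */
static int expected(const Node *g, int i) {
    int e = g[i].stack;
    int p = g[i].parent;
    if (p >= 0) e += g[p].rec ? g[p].count : 1;
    return e;
}

/* Stored count minus predicted count; 0 means the formula holds. */
static int discrepancy(const Node *g, int i) {
    return g[i].count - expected(g, i);
}

/* After tpush(a):  a 1 -> b 2 -> c 2, all three on the stack. */
static const Node after_tpush[3] = {
    { 1, 1, 1, -1 }, { 2, 1, 1, 0 }, { 2, 1, 1, 1 }
};

/* After derec(a):  a.1 -> b.2 -> c.1, all three on the stack. */
static const Node after_derec[3] = {
    { 1, 0, 1, -1 }, { 2, 0, 1, 0 }, { 1, 0, 1, 1 }
};

/* Replays the two claims made in the paragraph above. */
static void check_failing_scenario(void) {
    assert(expected(after_tpush, 2) == 3);      /* c "should be 3" */
    assert(discrepancy(after_tpush, 2) == -1);  /* but it is stored as 2 */
    assert(discrepancy(after_derec, 2) == -1);  /* c ends one too low */
    assert(discrepancy(after_derec, 1) == 0);   /* b itself is consistent */
}
```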

The problem was that I had treated the discrepancy in reference counts
like an "implicit reference" from a to b. Thus to fix it, we should call
ra(a), making it into a real reference, and clear the stack under it.
But this ra call never happens for objects which aren't EPILOGged. Their
descendants end up with no indication of the reference. Instead, I need
a different reference count formula, which will actually hold when
EPILOG is called. This requires a new flag, although there are perhaps
other methods in which it wouldn't.

We define the flag AFCLR. The absence of this flag indicates that the
stack reference is not propagated to descendants (this only makes sense
for recursive objects; the AFCLR flag will always be on for
non-recursive objects). Initially AFCLR is off. There are two situations
in which we should turn AFCLR on for object w, and the discrepancy is
corrected in each case:

- w is popped from the stack. tpop uses fa, which non-recursively
   decreases w's reference count.
- w is made non-recursive. We account for the missing reference by
   decrementing the counts of w's descendants by one less than we usually
   would.

Here's how that works for our problem scenario before. Objects with
AFCLR and AFREC set are marked with *.

   initial
   |      |      |
   a 1 -> b 1 -> c 1

   ra
   |      |      |
   a 2 -> b 2 -> c 2

   tpop
          |      |
   a*1 -> b 2 -> c 2

   tpush
   |      |      |
   a*1 -> b 2 -> c 2

   derec
   |      |      |
   a.1 -> b.2 -> c.2

You can verify that the modified reference count formula is true all the
way through. The reference count of each object is the sum of the counts
of its referrers, minus one for each referrer without a *. Once derec is
called, c's count of 2 matches with its two referrers.
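
That verification can be done mechanically. The sketch below encodes the
five states of the diagram and checks the modified formula on each. It
rests on assumptions that are one reading of the text, not something the
thread states outright: each stack reference counts as 1, the extra count
that ra leaves on a is treated as one held reference, and a non-recursive
referrer contributes 1. The Node struct and all names are hypothetical.

```c
#include <assert.h>

typedef struct {
    int count;   /* stored reference count */
    int rec;     /* AFREC: 1 = recursive, 0 = non-recursive */
    int clr;     /* AFCLR: 1 = marked with * (or non-recursive) */
    int held;    /* stack references plus the temporary one made by ra */
    int parent;  /* index of the one referring object, or -1 for none */
} Node;

/* What referrer p contributes to its child under the modified formula. */
static int contribution(const Node *p) {
    if (!p->rec) return 1;                    /* non-recursive referrer */
    return p->clr ? p->count : p->count - 1;  /* minus one without a * */
}

/* Nonzero if every node's stored count matches the modified formula. */
static int holds(const Node *g, int n) {
    for (int i = 0; i < n; i++) {
        int e = g[i].held;
        if (g[i].parent >= 0) e += contribution(&g[g[i].parent]);
        if (e != g[i].count) return 0;
    }
    return 1;
}

/* The five states from the diagram, in order, for a -> b -> c. */
static const Node states[5][3] = {
    { {1,1,0,1,-1}, {1,1,0,1,0}, {1,1,0,1,1} },  /* initial              */
    { {2,1,0,2,-1}, {2,1,0,1,0}, {2,1,0,1,1} },  /* ra: a also held by ra */
    { {1,1,1,1,-1}, {2,1,0,1,0}, {2,1,0,1,1} },  /* tpop: a off stack, *  */
    { {1,1,1,1,-1}, {2,1,0,1,0}, {2,1,0,1,1} },  /* tpush: a back on      */
    { {1,0,1,1,-1}, {2,0,1,1,0}, {2,0,1,1,1} },  /* derec: all dotted     */
};

/* The earlier failing end state a.1 -> b.2 -> c.1, for contrast. */
static const Node broken[3] = { {1,0,1,1,-1}, {2,0,1,1,0}, {1,0,1,1,1} };

/* Number of diagram states in which the modified formula holds. */
static int states_holding(void) {
    int ok = 0;
    for (int s = 0; s < 5; s++) ok += holds(states[s], 3);
    return ok;
}
```

Under these assumptions the formula holds in all five states and rejects
the broken end state from the earlier scenario, where c's count was 1.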

Marshall

On Mon, May 23, 2016 at 06:29:13PM -0400, Marshall Lochbaum wrote:
I have modified cp.c, cx.c, and vgauss.c and am back to no new test
failures. Test failures in all three sparse tests have mysteriously
vanished. This is not incredibly surprising because those failures could
have been caused by omitting gc on the simpler functions, some of which
deal with sparse arrays. However, it would be nice to get Valgrind
working to see if there are use-after-frees which are still there but
aren't causing any bugs in those particular tests.

For cx.c and vgauss.c I just replaced gc with a version that doesn't use
derec. This is essentially the same as the old functionality except that
objects are not pushed to the stack recursively, so we still get some
improvement in performance. For cp.c I actually rewrote the relevant
operations.

It turns out it's not actually safe to call derec on an object on the
stack of unknown origin, because it may have implicit references to
children which are on the stack. The only known safe way to
derecursivize an object is with gc (or gc3), which is safe by
construction. Here are the rules for performing a running update of
object x:
- Ensure that x is not recursive. Calling derec immediately after GA is
   safe, since x has no children yet.
- After creating a value z which will be added to x, call gc(z,old).
   Stack index old should be set late enough so that nothing
   important is clobbered (in particular, the original copy of x).
- Before adding z to x, call ra(z). Since ra(z) returns z, you can just
   write xv[k]=ra(z) or similar.
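
The three rules can be tried out in a toy model. Everything below is a
simplified stand-in written for illustration, not the J source: the Obj
struct, the fixed-size stack, and these versions of ga, ra, fa, tpush,
tpop and gc are assumptions. Every object in the toy is non-recursive,
so the first rule holds trivially and derec does not appear, and gc is
modeled as ra, then tpop, then tpush.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXKIDS  8
#define STACKMAX 64

typedef struct Obj {
    int count;                  /* reference count */
    int nkids;                  /* number of children in use */
    struct Obj *kids[MAXKIDS];  /* the parent owns one reference to each */
} Obj;

static Obj *tstack[STACKMAX];   /* toy temp stack */
static int ttop = 0;            /* next free stack slot */
static int live = 0;            /* objects currently allocated */

static Obj *tpush(Obj *o) { assert(ttop < STACKMAX); tstack[ttop++] = o; return o; }

/* Allocate an object with count 1, that reference owned by the stack. */
static Obj *ga(void) {
    Obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->count = 1;
    live++;
    return tpush(o);
}

static Obj *ra(Obj *o) { o->count++; return o; }

/* Drop one reference; at zero, release the children and free the object. */
static void fa(Obj *o) {
    if (--o->count == 0) {
        for (int i = 0; i < o->nkids; i++) fa(o->kids[i]);
        free(o);
        live--;
    }
}

/* Pop and release every stack entry at index old or above. */
static void tpop(int old) { while (ttop > old) fa(tstack[--ttop]); }

/* Keep z, discard everything else pushed since old. */
static Obj *gc(Obj *z, int old) { ra(z); tpop(old); return tpush(z); }

/* Running update of x with n values. Returns the number of objects still
   live after the stack is emptied, which should be 0 if nothing leaked. */
static int demo_running_update(int n) {
    assert(n <= MAXKIDS);
    Obj *x = ga();          /* rule 1: x is non-recursive, no children yet */
    int old = ttop;         /* set after x, so tpop(old) never pops x */
    for (int k = 0; k < n; k++) {
        (void)ga();         /* scratch value that should not survive */
        Obj *z = ga();      /* the value to be added to x */
        z = gc(z, old);     /* rule 2: scratch is freed, z stays on stack */
        x->kids[x->nkids++] = ra(z);   /* rule 3: x takes its own reference */
    }
    assert(live == n + 1);  /* x and its n children; all scratch is gone */
    tpop(0);                /* release everything the stack still holds */
    return live;
}
```

Each gc call also pops the previous z from the stack, but that z
survives because x already holds a reference to it.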

Changes have been pushed to the refcount branch again. Remaining
problems are that 7!:0 doesn't work and that tests g13x, g5x2, and gq
fail.

Marshall

On Mon, May 23, 2016 at 04:17:52PM -0400, Marshall Lochbaum wrote:
w should be made non-recursive before doing anything else--otherwise we
would have to keep track of whether it is or not during the loop. So
adding it to the stack just increments the reference count, and popping
it decrements it again. No danger of messing with child reference counts
or anything.

I made the changes to the two files I mentioned, and naturally there are
still test failures beyond those with no-op gc. They shouldn't take too
long to work out, though. I'm tired enough of actually reading code that
I think I'll just do a binary search on instances of gc...

Marshall

On Mon, May 23, 2016 at 03:59:25PM -0400, Henry Rich wrote:
You are right, provided you are sure w is off the stack. If you can
be sure of that, it's better to derec the child as you add it.

I also answered my question about the use of jt->tbase.

Henry

On 5/23/2016 1:56 PM, Marshall Lochbaum wrote:
I don't see why tpopnotw would be any better than just leaving w off the
stack (or, perhaps, why leaving w off the stack is flawed). The only
reason these functions break our current system is that they add a
recursive object as a child of a non-recursive one. So calling derec on
the children solves the problem, and any other changes are for
performance. And tpopexcept is clearly going to be slower than just
tpop. So what's the benefit?

Marshall

On Sun, May 22, 2016 at 12:17:19PM -0400, Henry Rich wrote:
No, tpopnotw would not leave the w dangling - it would move it (see
my latest). It would work like this (you will have to reflect any
changes you made to tpop/tpush):

tpopexcept(old,w){
  I wfound = 0;  // set if there was a free of w
  while(old<jt->tbase+jt->ttop){  // till we have freed back to 'old'
    //?? question: why is 'old <' used rather than 'old !='?  Aren't the
    // blocks of the stack in accidental order, so that this test would fail
    // if the stack spans a block boundary, and the second block is at a
    // lower address than the first?
    if(1<jt->ttop){  // if the stack item is a pointer to a free block
      A blocktofree = jt->tstack[--jt->ttop];  // fetch address to free
      if(blocktofree==w)++wfound;else fr(blocktofree);  // free it, unless it's w
    }else tf();  // stack item is a stack-block chain pointer - free the stack block
  }
  DO(wfound, tpush(w));  // restore any occurrences of w that were suppressed
  R old;  // return free-to pointer - it may not be the top, if w was restored
}

I agree that if it's w being changed, nothing special is needed, but
this would be a good idea for the other places.

What testcase would you like me to look at first?

Henry


On 5/22/2016 11:52 AM, Marshall Lochbaum wrote:
Replying to both of your comments here.
Almost all uses of this pattern actually replace w rather than append to
it. I haven't tested, but I think it's fine in all of these cases to use
gc or gc3. The exceptions which do modify w are in cp.c and vgauss.c,
although it's possible I missed some.

tpopnotw isn't possible--the stack is stored in an array, so ignoring w
would just leave it dangling beyond the end of the stack. Your first
suggestion still works, except that derec, rather than ra, makes objects
non-recursive.

I've pushed my changes so far to the unbox branch refcount:
https://github.com/iocane/unbox/commits/refcount

Failing tests are

- Because 7!:0 is no longer accurate
     core/g202.ijs
     core/g202b.ijs
     core/g7x.ijs
     core/glocale.ijs

- Problem with in-place sparse amends
     sparse/gsp530i.ijs
     sparse/gsp530l.ijs
     sparse/gsp530n.ijs

- Miscellaneous
     core/g13x.ijs
     core/g5x2.ijs
     core/gq.ijs

Presumably the last two groups are use-after-frees. Standard tools don't
work to identify these because of J's pool allocation, and making J use
mallocs instead of pool allocations causes bugs for reasons that are
probably even harder to track down. I have tried to configure Valgrind
to recognize pool allocations (using the requests described at
http://www.network-theory.co.uk/docs/valgrind/valgrind_57.html), but it
seems to only recognize that certain reads/writes are invalid, not the
free that caused them to be invalid. Useless.

Marshall

On Sat, May 21, 2016 at 04:29:03PM -0400, Henry Rich wrote:
   store the current stack top in variable old.
   for(...) {
      add something to boxed array w.
     call gc(w,old), putting w in non-recursive form.
   }

What I was suggesting is:

    store the current stack top in variable old.
    ra(w);
    for(...) {
      add something to boxed array w.
      ra(something); // putting something in non-recursive form.
      tpop(old);
    }
    tpush(w);

Now I suggest

    store the current stack top in variable old.
    for(...) {
      add something to boxed array w.
      ra(something); // putting something in non-recursive form?
      tpopnotw(old,w);
    }

Where tpopnotw is like tpop but simply ignores stack entries that match w.
If this idea is sound, I think you might as well do it, because

* you will have to inspect each place gc() is used anyway
* assuming that the current code is there because the free was needed,
    this preserves that behavior. It will be hard to be confident that
    this is not needed (you'd have to deploy and wait for user complaints -
    I wouldn't want to take that chance)
* No usecount will be modified or block freed that doesn't need to be
* it is necessary to increment the usecount of (something)

Henry












On 5/20/2016 10:06 AM, Marshall Lochbaum wrote:
A number of spots in the J source follow this pattern, or similar:

   store the current stack top in variable old.
   for(...) {
     add something to boxed array w.
     call gc(w,old), putting w in non-recursive form.
   }

(A particular example is cp.c, lines 91--97, which performs (v^:l x)
where l is an integer list).
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm