After finding a major mismatch between the model I outlined and how J
actually works, I have fixed all remaining bugs with the new memory
model. The only work left now is to rewrite 7!:0 when not all objects
reside on the stack. Long explanation of the changes follows. They've
been pushed to the refcount branch again.
In my original proof, I assumed that
2. When EPILOG(z) is called, z and its descendants obey the reference
count formula.
This is actually never true, except in the trivial case that z has no
descendants. Instead, the following holds:
2. After executing the funny series of operations in EPILOG that come
before derec, z and those of its descendants popped by tpop obey the
reference count formula.
Not as good. Let's look at this series of operations (best viewed in
monospace):
initial state (b is a descendent of a)
| |
a 1 -> b 1
ra(a)
| |
a 2 -> b 2
tpop(old)
a 1 -> b 1
tpush(a)
|
a 1 -> b 2
derec(a)
| |
a.1 -> b.2
The vertical bars are references from the stack to objects. The number
next to each object is its reference count (with a dot for non-recursive
counts), and arrows between them are references. The reference count
formula states that the count of each object is equal to the sum, for
each referrer that refers to it, of 1 if that referrer is the stack or a
non-recursive object, and that object's reference count if it is a
recursive object. This formula is wrong until after tpush! Before tpop,
b's reference count is too high, and after it a's count is too high.
Here's a problem that occurred in the tests various times. It happens if
b and c are created in a function with no EPILOG, and then a is created
in a function with an EPILOG.
initial (assume only a is above old on the stack)
| | |
a 1 -> b 1 -> c 1
ra(a)
| | |
a 2 -> b 2 -> c 2
tpop(old)
| |
a 1 -> b 2 -> c 2
tpush(a)
| | |
a 1 -> b 2 -> c 2
derec(a)
| | |
a.1 -> b.2 -> c.1
Our formula is wrong all the way through! Because b is never popped from
the stack, its reference count is higher than 1 when tpush is called,
and this isn't reflected in c's reference count, which should be 3. So
when derec is called, it removes excess references from c which were
never there, and c's count ends up being one too low. Once b and c are
taken off the stack (say at the end of the current sentence), b's count
goes to 1 but c's goes to 0, and c is freed leaving a dangling
reference.
The problem was that I had treated the discrepancy in reference counts
like an "implicit reference" from a to b. Thus to fix it, we should call
ra(a), making it into a real reference, and clear the stack under it.
But this ra call never happens for objects which aren't EPILOGged. Their
descendants end up with no indication of the reference. Instead, I need
a different reference count formula, which will actually hold when
EPILOG is called. This requires a new flag, although there are perhaps
other methods in which it wouldn't.
We define the flag AFCLR. The absence of this flag indicates that the
stack reference is not propagated to descendants (this only makes sense
for recursive objects. The AFCLR flag will always be on for
non-recursive objects). Initially AFCLR is off. There are two situations
in which we should turn AFCLR on for object w, and the discrepancy is
corrected in each case:
- w is popped from the stack. tpop uses fa, which non-recursively
decreases w's reference count.
- w is made non-recursive. We account for the missing reference by
decrementing the counts of w's descendants by one less than we usually
would.
Here's how that works for our problem scenario before. Objects with
AFCLR and AFREC set are marked with *.
initial
| | |
a 1 -> b 1 -> c 1
ra
| | |
a 2 -> b 2 -> c 2
tpop
| |
a*1 -> b 2 -> c 2
tpush
| | |
a*1 -> b 2 -> c 2
derec
| | |
a.1 -> b.2 -> c.2
You can verify that the modified reference count formula is true all the
way through. The reference count of each object is the sum of the counts
of its referrers, minus one for each referrer without a *. Once derec is
called, c's count of 2 matches with its two referrers.
Marshall
On Mon, May 23, 2016 at 06:29:13PM -0400, Marshall Lochbaum wrote:
I have modified cp.c, cx.c, and vgauss.c and am back to no new test
failures. Test failures in all three sparse tests have mysteriously
vanished. This is not incredibly surprising because those failures could
have been caused by omitting gc on the simpler functions, some of which
deal with sparse arrays. However, it would be nice to get Valgrind
working to see if there are use-after-frees which are still there but
aren't causing any bugs in those particular tests.
For cx.c and vgauss.c I just replaced gc with a version that doesn't use
derec. This is essentially the same as the old functionality except that
objects are not pushed to the stack recursively, so we still get some
improvement in performance. For cp.c I actually rewrote the relevant
operations.
It turns out it's not actually safe to call derec on an object on the
stack of unknown origin, because it may have implicit references to
children which are on the stack. The only known safe way to
derecursivize an object is with gc (or gc3), which is safe by
construction. Here are the rules for performing a running update of
object x:
- Ensure that x is not recursive. Calling derec immediately after GA is
safe, since x has no children yet.
- After creating a value z which will be added to x, call gc(z,old).
Stack index old should be should be set late enough so that nothing
important is clobbered (in particular, the original copy of x).
- Before adding z to x, call ra(z). Since ra(z) returns z, you can just
write xv[k]=ra(z) or similar.
Changes have been pushed to the refcount branch again. Remaining
problems are that 7!:0 doesn't work and that tests g13x, g5x2, and gq
fail.
Marshall
On Mon, May 23, 2016 at 04:17:52PM -0400, Marshall Lochbaum wrote:
w should be made non-recursive before doing anything else--otherwise we
would have to keep track of whether it is or not during the loop. So
adding it to the stack just increments the reference count, and popping
it decrements it again. No danger of messing with child reference counts
or anything.
I made the changes to the two files I mentioned, and naturally there are
still test failures beyond those with no-op gc. They shouldn't take too
long to work out, though. I'm tired enough of actually reading code that
I think I'll just do a binary search on instances of gc...
Marshall
On Mon, May 23, 2016 at 03:59:25PM -0400, Henry Rich wrote:
You are right, provided you are sure w is off the stack. If you can
be sure of that, it's better to derec the child as you add it.
I also answered my question about the use of jt->tbase.
Henry
On 5/23/2016 1:56 PM, Marshall Lochbaum wrote:
I don't see why tpopnotw would be any better than just leaving w off the
stack (or, perhaps, why leaving w off the stack is flawed). The only
reason these functions break our current system is that they add a
recursive object as a child of a non-recursive one. So calling derec on
the children solves the problem, and any other changes are for
performance. And tpopexcept is clearly going to be slower than just
tpop. So what's the benefit?
Marshall
On Sun, May 22, 2016 at 12:17:19PM -0400, Henry Rich wrote:
no, tpopnotw would not leave the w dangling - it would move it (see
my latest). It would work like this (you will have to reflect any
changes you made to tpop/tpush):
tpopexcept(old,w){
I wfound = 0; // set if there was a free of w
while(old<jt->tbase+jt->ttop){ // till we have freed back to 'old'
//?? question: why is 'old <' used rather than 'old !='? Aren't the
// blocks of the stack in accidental order, so that this test
would fail
// if the stack spans a block boundary, and the second block is at a
// lower address than the first?
if(1<jt->ttop) { // if the stack item is a pointer to a free block
A blocktofree = jt->tstack[--jt->ttop]; // fetch address to free
if(blocktofree==w)++wfound;else fr(blocktofree); // free it,
unless it's w
elsetf(); // stack item is a stack-block chain pointer - free
the stack block
}
DO(wfound, tpush(w)); // restore any occurrences of w that were
suppressed
R old; // return free-to pointer - it may not be the top, if w was
restored
}
I agree that if it's w being changed, nothing special is needed, but
this would be a good idea for the other places.
What testcase would you like me to look at first?
Henry
On 5/22/2016 11:52 AM, Marshall Lochbaum wrote:
Replying to both of your comments here.
Almost all uses of this pattern actually replace w rather than append to
it. I haven't tested, but I think it's fine in all of these cases to use
gc or gc3. The exceptions which do modify w are in cp.c and vgauss.c,
although it's possible I missed some.
tpopnotw isn't possible--the stack is stored in an array, so ignoring w
would just leave it dangling beyond the end of the stack. Your first
suggestion still works, except that derec, rather than ra, makes objects
non-recursive.
I've pushed my changes so far to the unbox branch refcount:
https://github.com/iocane/unbox/commits/refcount
Failing tests are
- Because 7!:0 is no longer accurate
core/g202.ijs
core/g202b.ijs
core/g7x.ijs
core/glocale.ijs
- Problem with in-place sparse amends
sparse/gsp530i.ijs
sparse/gsp530l.ijs
sparse/gsp530n.ijs
- Miscellaneous
core/g13x.ijs
core/g5x2.ijs
core/gq.ijs
Presumably the last two groups are use-after-frees. Standard tools don't
work to identify these because of J's pool allocation, and making J use
mallocs instead of pool allocations causes bugs for reasons that are
probably even harder to track down. I have tried to configure Valgrind
to recognize pool allocations (using the requests described at
http://www.network-theory.co.uk/docs/valgrind/valgrind_57.html), but it
seems to only recognize that certain reads/writes are invalid, not the
free that caused them to be invalid. Useless.
Marshall
On Sat, May 21, 2016 at 04:29:03PM -0400, Henry Rich wrote:
store the current stack top in variable old.
for(...) {
add something to boxed array w.
call gc(w,old), putting w in non-recursive form.
}
What I was suggesting is:
store the current stack top in variable old.
ra(w);
for(...) {
add something to boxed array w.
ra(something); // putting something in non-recursive form.
tpop(old);
}
tpush(w);
Now I suggest
store the current stack top in variable old.
for(...) {
add something to boxed array w.
ra(something); // putting something in non-recursive form?
tpopnotw(old,w);
}
Where tpopnotw is like tpop but simply ignores stack entries that match w.
If this idea is sound, I think you might as well do it, because
* you will have to inspect each place gc() is used anyway
* assuming that the current code is there because the free was needed,
this preserves that behavior. It will be hard to be confident that
this is not needed (you'd have to deploy and wait for user complaints -
I wouldn't want to take that chance)
* No usecount will be modified or block freed that doesn't need to be
* it is necessary to increment the usecount of (something)
Henry
On 5/20/2016 10:06 AM, Marshall Lochbaum wrote:
A number of spots in the J source follow this pattern, or similar:
store the current stack top in variable old.
for(...) {
add something to boxed array w.
call gc(w,old), putting w in non-recursive form.
}
(A particular example is cp.c, lines 91--97, which performs (v^:l x)
where l is an integer list).
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm