Re: [JS-internals] Approach for adapting SpiderMonkey to moving GC

Steve Fink Fri, 13 Jul 2012 11:06:39 -0700

(apologies if you receive this twice; trying to fix 2 sending failures)

My current leaning:


 - Use the current scheme of Rooted<T>, MutableHandle<T>, and Handle<T>
 - Add in an Unrooted<T>
 - Eliminate pretty much all bare pointers from the engine (except in
some of the GC code)
 - Convert all member functions that can GC into static methods

So all return values would be Unrooted<T>. Parameters to functions that
can GC will be Handle<T>; parameters to functions that cannot GC will be
Unrooted<T>. All of the internal (and external) functions will take only
Unrooted<T>, Handle<T>, or MutableHandle<T>, and none of those will have
a public T* constructor, so the compiler will prevent bare pointers from
being used anywhere that matters. You can tell at a glance whether
something can GC or not depending on its parameter types, except for
functions that don't take any gcthings, and I've half a mind to add an
extra parameter to those. It would do nothing but mark that it can GC.
(Handle<void>?) Alternatively, we could just be rigorous about adding
AssertRootingUnnecessary guards to these, but that's dynamic and I'd
really like the "does it GC?" question to be trivially answered from
inspection. Or I guess we could use a naming convention.

I was intending to try this out today. There's probably a fatal flaw in
this that the compiler will tell me about, but how does it sound to people?

It still doesn't statically prevent people from holding an Unrooted<T>
live across something that can GC, but at least it makes it a little
more obvious, and the dynamic analysis will spot most of these. The most
worrisome remaining pattern I can think of is

  Unrooted<T> x = f(cx);
  rarelyThrowException(cx);
  x->simpleNonGCingFunction();

The dynamic analysis can only catch this if we make
rarelyThrowException() do a CheckStackRoots(). I guess even worse would be:

  Unrooted<T> x = f(cx);
  rarelyThrowException(cx);
  if (unlikely)
    x->simpleNonGCingFunction(cx);

since the dynamic analysis would probably *not* catch this.

Another hole is when we use the pointer value to compute a hash code,
which we unfortunately do pretty often. For JSObjects, at least, it
would be really nice if we had an identity value to use instead.
dmandelin and I talked about this, and initially I assumed this would
require bloating up every JSObject, but David pointed out that we could
add it on as an optional property (with a private name?) so that it
would only affect objects used as keys. Is this feasible? And what other
gcthings would we need something analogous for?

I'd also like to collect nontrivial rooting examples that people have
encountered. For example, I ran into one that looked roughly like:

  struct Key { JSObject* proto; Type type };
  Key key = { computeProto(), computeType(); }
  AddPtr p = table.lookupForAdd(key);
  if (!p) {
    somethingThatMightGC();
    table.relookupOrAdd(p, key, val);
  }

where the hashcode for Key uses the pointer values of its members. The
problem is that the AddPtr could get invalidated by
somethingThatMightGC, and then we'd be using a stale hashcode in
relookupOrAdd. The simple solution would be to always re-do the lookup,
but 99% of the time the Key's pointers *won't* change. My solution was
to keep a copy of the original pointers, and only repeat lookupForAdd if
they changed. It feels a little brittle, but at least the weirdness is
locally contained. (Note that changing the fields' types to Rooted
variants would be invalid for 2 reasons: first, the Key gets stored on
the heap in the hashtable; and second, because that would just update
the pointers but not the AddPtr's hashcode. You'd need a RootedAddPtr
with a funky update rule to fix that.)

I'll need to dig through some patches to find more examples, but what
tricky situations have other people encountered?

On Thu 12 Jul 2012 07:15:54 PM PDT, Bill McCloskey wrote:
> I didn't respond to Dave's email right away because I wasn't sure how
> I felt, and I had a lot of questions about how rooting works in
> V8. After looking at it more closely, here's what I think:
>
> 1. I don't think that adopting V8's handle system is a silver bullet.
>
> 2. I think there are multiple dimensions that we can use to compare
> V8's way of rooting with SpiderMonkey's way of rooting. I don't think
> either way is strictly better.
>
> 3. I think we're generally moving in the right direction, and the most
> important thing is to move forward. I think a lot of the current fear
> about the rooters stems from the fact that they're incomplete, and
> that makes the whole system look more error-prone than it is.
>
> I'll address these points in turn.
>
> ---- V8's handle system ----
>
> If we wanted to use V8-style handles in SpiderMonkey, we would have to
> add a lot of HandleScopes throughout the JS engine. When you create a
> handle, it takes space in the nearest enclosing HandleScope until that
> HandleScope is destroyed. So any place we have an unbounded loop in
> the JS engine, we would have to make a new HandleScope to ensure that
> we didn't use an unbounded amount of memory storing handles. This
> seems annoying and error-prone. Especially since each time you add a
> HandleScope you have to reason about what handles might escape it.
>
> The other issue is performance. As far as I can tell, V8 doesn't use
> handles on any of their property access paths. They use handles during
> code generation, and in some runtime functions. Everywhere else they
> use bare pointers. In some places they assert that no allocation
> happens. In other places, they use a complicated scheme of GCing and
> then retrying an operation if an allocation fails.
>
> In SpiderMonkey, the separation between code that can GC and code that
> can't isn't nearly as clear as it is in V8. If we adopted handles
> without doing anything else, we would have to add handles to a lot
> more places than V8 does, and our performance would certainly suffer.
>
> I did some benchmarking of our current rooters. As far as I can tell,
> almost all the cost of them is on stubs paths like GetElem and
> GetProperty. V8 doesn't use handles at all in these places. Even with
> the limited set of rooters we have now, some benchmarks get 5-10%
> slower with rooting turned on. I looked more closely at two
> benchmarks, and in neither case was the slowdown due to duplicate
> rooters.
>
> So although we could adopt V8's handle system, it would not be a
> matter of just letting the compiler find places where handles are
> needed. We would need to add a lot of HandleScopes, and we would have
> to split apart some of our property access paths between "code that
> won't GC" and "code that might GC", with the idea that the latter code
> should be cold.
>
> ---- Comparison ----
>
> In theory, our system doesn't differ from V8's very much. Our handles
> are exactly like theirs. The systems differ in that V8 has
> HandleScope and we have Rooted<T>. Our Rooted<T> is basically just a
> HandleScope that always contains a single handle. V8's system is a
> little more powerful because it allows multiple handles to be created
> within a single scope.
>
> The only place where this difference seems to matter in when returning
> from functions. As long as you're not leaving a HandleScope, V8 can
> return a handle. We can never return handles, except in cases like
> JSContext::global() where the handle's storage is not on the stack.
>
> If there is a concern about how to root values returned from the API,
> then it would be easy to add a HandleScope notion to our
> engine. Functions in jsapi.cpp would return results by finding the
> nearest enclosing HandleScope, stashing the return value there, and
> returning a handle to it. Alternatively, Brian proposed that we could
> use outparams for everything at the API level. That would work too.
>
> However, I don't think that the return value issue is that big a
> deal. The more important thing is to have a simple rule, like "don't
> ever use bare JSObject* for stack variables". If we do that, and if we
> manage to eliminate most bare JSObject* uses, then the remaining ones
> will start to stick out very clearly and we can fix them easily.
>
> ---- Moving forward ----
>
> I think it's really important that we keep making progress. There
> seems to be some fear that we're heading in the wrong direction. In
> fact, I think that there's really only one direction we can go:
> towards more handles and more rooters.
>
> As Terrence mentioned earlier, I think a more pertinent question is
> safety versus performance. If we choose to use rooters for all stack
> variables and we continue to run the dynamic rooting analysis, then I
> think we'll have a bulletproof system that's safer and less buggy than
> V8 or the SpiderMonkey conservative scanning. However, that would
> almost certainly be too costly for performance.
>
> One thing I noticed about V8 that's nice is that all functions I saw
> either root everything, or else they're not allowed to GC. And when
> they're not allowed to GC, they assert that no GC happens. I think it
> would be great if SpiderMonkey could move to that model. That probably
> means we'll have to refactor some of our property access paths, sort
> of like V8, so that we have fast paths that don't GC and slow paths
> that root everything.
>
> However, I think those decisions should be guided by profiling. For
> now, I think our goal should be to convert as much stuff to handles
> and rooters as possible inside the JS engine. It's great that we have
> enough rooters to pass the dynamic rooting analysis, but I don't think
> we've passed the threshold for handle density where people feel like
> the compiler will catch their mistakes. I think that's where we need
> to be.
>
> -Bill
>
> On 06/27/2012 07:16 PM, Dave Mandelin wrote:
>> I still have some concerns about our approach for adapting to moving
>> GC. From the meeting notes, I see that there are some open concerns
>> about maintaining proper handle usage outside the JS engine (and I'm
>> not sure how easy it's going to be inside the engine for that matter)
>> and more worrying even, about knowing if we've actually gotten it
>> right, given all the cold paths.
>>
>> Today I was also thinking about compartments, which is a much simpler
>> API, but where we are still chasing out compartment bugs, and where
>> JSAPI users are still often doing it wrong.
>>
>> The last consideration is that I always like doing things the "dumb
>> way" if possible, and I still don't think I've gotten an answer as to
>> whether or not we can do this the dumb way. In my mind, the dumb way
>> goes something like this:
>>
>> - Take the V8 handle API. We know how it works, and we know it's been
>> used successfully. (Plus bonus points for aiming at API compat
>> someday, but that's not a primary consideration for what I'm talking
>> about now.)
>>
>> - Find the pinch points in the engine where we create objects, and
>> make them create handles and return the handles instead. Steve
>> pointed out to me that it's not quite as easy as that, because we
>> need to cover places where we just read gcthing pointers out of the
>> heap. Maybe we can set up all the heap pointers to be stored by
>> classes that don't implicitly convert to a raw pointer to cover that
>> part.
>>
>> - Fix all the compiler errors from the previous step by sticking
>> handle types where needed. Inspect functions as this is done and add
>> internal handle scopes as needed.
>>
>> - Add a few handle scopes to the shell.
>>
>> At this point we should have a JS shell that works with moving GC,
>> no? It's possible this is all that's required. The biggest risk I see
>> is that perf would be bad because of too many handles created
>> internally where not needed. I'm not exactly sure how that would play
>> out but I would imagine we'd try to find some simple patterns and
>> refactor the code a bit to make it fairly clear and maintainable.
>>
>> Maybe I'm missing something, but it seems like this would work, that
>> using the types and compiler checks would obviate most bugs, and it
>> would be a large bulk of work but not require anything heroically
>> complex.
>>
>> Would this work?
>>
>> Dave
>


_______________________________________________
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals

Re: [JS-internals] Approach for adapting SpiderMonkey to moving GC

Reply via email to