Re: [JS-internals] Approach for adapting SpiderMonkey to moving GC

Dave Mandelin Fri, 13 Jul 2012 14:50:15 -0700

On Friday, July 13, 2012 1:41:00 PM UTC-7, Dave Mandelin wrote:
> Bill sent this out, but it seems not to have reached the newsgroup, so 
> pasting it here. (By the way, to avoid such things, I just post through 
> groups.google.com.)
> 
> So Bill said:
> 
> I didn&#39;t respond to Dave&#39;s email right away because I wasn&#39;t sure 
> how
> I felt, and I had a lot of questions about how rooting works in
> V8. After looking at it more closely, here&#39;s what I think:


Thanks for the detailed analysis. This is exactly the kind of discussion I was 
hoping to provoke.

> 1. I don&#39;t think that adopting V8&#39;s handle system is a silver bullet.

But right here I have to say, hmph--no one ever said it was, so I really don't 
see what point you're trying to make with that statement.

> 2. I think there are multiple dimensions that we can use to compare
> V8&#39;s way of rooting with SpiderMonkey&#39;s way of rooting. I don&#39;t 
> think
> either way is strictly better.

I think that is also widely agreed on.

> 3. I think we&#39;re generally moving in the right direction, and the most
> important thing is to move forward. I think a lot of the current fear
> about the rooters stems from the fact that they&#39;re incomplete, and
> that makes the whole system look more error-prone than it is.

That doesn't really sound like what I think are my major concerns, although 
'error-prone' does get at maintainability, which is the biggest one.

> I&#39;ll address these points in turn.
> 
> ---- V8&#39;s handle system ----
> 
> If we wanted to use V8-style handles in SpiderMonkey, we would have to
> add a lot of HandleScopes throughout the JS engine. When you create a
> handle, it takes space in the nearest enclosing HandleScope until that
> HandleScope is destroyed. So any place we have an unbounded loop in
> the JS engine, we would have to make a new HandleScope to ensure that
> we didn&#39;t use an unbounded amount of memory storing handles. This
> seems annoying and error-prone. Especially since each time you add a
> HandleScope you have to reason about what handles might escape it.

Interesting.

> The other issue is performance. As far as I can tell, V8 doesn&#39;t use
> handles on any of their property access paths. They use handles during
> code generation, and in some runtime functions. Everywhere else they
> use bare pointers. In some places they assert that no allocation
> happens. In other places, they use a complicated scheme of GCing and
> then retrying an operation if an allocation fails.

Last year Erik Corry told me that they used bare pointers up front because they 
thought they needed to for perf, but that he later concluded they probably 
could have used handles more.

> In SpiderMonkey, the separation between code that can GC and code that
> can&#39;t isn&#39;t nearly as clear as it is in V8. If we adopted handles
> without doing anything else, we would have to add handles to a lot
> more places than V8 does, and our performance would certainly suffer.
> 
> I did some benchmarking of our current rooters. As far as I can tell,
> almost all the cost of them is on stubs paths like GetElem and
> GetProperty. V8 doesn&#39;t use handles at all in these places. Even with
> the limited set of rooters we have now, some benchmarks get 5-10%
> slower with rooting turned on. I looked more closely at two
> benchmarks, and in neither case was the slowdown due to duplicate
> rooters.

This is a good measurement--(assuming it's accurate) it shows that in some 
places at least you need to avoid creating handles. Stub calls do seem to be a 
key place we really wouldn't want to use them.

> So although we could adopt V8&#39;s handle system, it would not be a
> matter of just letting the compiler find places where handles are
> needed. We would need to add a lot of HandleScopes, and we would have
> to split apart some of our property access paths between &quot;code that
> won&#39;t GC&quot; and &quot;code that might GC&quot;, with the idea that the 
> latter code
> should be cold.

Got it. Thanks. This is a good explanation of how "the dumb way" is not as dumb 
as it might seem.

> ---- Comparison ----
> 
> In theory, our system doesn&#39;t differ from V8&#39;s very much. Our handles
> are exactly like theirs. The systems differ in that V8 has
> HandleScope and we have Rooted&lt;T&gt;. Our Rooted&lt;T&gt; is basically 
> just a
> HandleScope that always contains a single handle. V8&#39;s system is a
> little more powerful because it allows multiple handles to be created
> within a single scope.
> 
> The only place where this difference seems to matter in when returning
> from functions. As long as you&#39;re not leaving a HandleScope, V8 can
> return a handle. We can never return handles, except in cases like
> JSContext::global() where the handle&#39;s storage is not on the stack.

I think "in theory" is the key word in the discussion above. The questions at 
the API level are all about usability--how easy is it to understand the API and 
use it correctly, how much help do tools give you with that, and how convenient 
is it to use the system in general.

> If there is a concern about how to root values returned from the API,
> then it would be easy to add a HandleScope notion to our
> engine. Functions in jsapi.cpp would return results by finding the
> nearest enclosing HandleScope, stashing the return value there, and
> returning a handle to it. 

IIUC, you are talking about using the V8 handle API for the public API, but 
doing completely different things inside the engine. I like that idea. From 
your discussion above, I take it that getting rooting right inside the engine 
just isn't that easy and requires thought, and that no simple model makes it 
easy. The same would be true for some API users, but in some cases things would 
be very simple and well served by an API that doesn't return unsafe pointers at 
all.

It also seems worth considering just adding new APIs for things that work that 
way, taking small steps toward a C++ API.

> Alternatively, Brian proposed that we could
> use outparams for everything at the API level. That would work too.

For the current API, that actually seems pretty good, since most things are 
already outparams anyway. Outparams are kind of yucky for usability, though, so 
I wouldn't want to go that way for future APIs.

> However, I don&#39;t think that the return value issue is that big a
> deal. The more important thing is to have a simple rule, like &quot;don&#39;t
> ever use bare JSObject* for stack variables&quot;. 

You can't have a rule that simple, though. The real rules have to teach people 
about roots and handles and the safe relations between their lifetimes. And, we 
all know that no matter how simple the rule, it is easy to accidentally break 
it, so safeguards are still useful.

> If we do that, and if we
> manage to eliminate most bare JSObject* uses, then the remaining ones
> will start to stick out very clearly and we can fix them easily.
> 
> ---- Moving forward ----
> 
> I think it&#39;s really important that we keep making progress. There
> seems to be some fear that we&#39;re heading in the wrong direction. In
> fact, I think that there&#39;s really only one direction we can go:
> towards more handles and more rooters.

That's too strong a statement. 

> As Terrence mentioned earlier, I think a more pertinent question is
> safety versus performance. If we choose to use rooters for all stack
> variables and we continue to run the dynamic rooting analysis, then I
> think we&#39;ll have a bulletproof system that&#39;s safer and less buggy than
> V8 or the SpiderMonkey conservative scanning. However, that would
> almost certainly be too costly for performance.

IIUC, it would not be bulletproof unless you ran the dynamic analysis on a test 
suite with total coverage.

> One thing I noticed about V8 that&#39;s nice is that all functions I saw
> either root everything, or else they&#39;re not allowed to GC. And when
> they&#39;re not allowed to GC, they assert that no GC happens. I think it
> would be great if SpiderMonkey could move to that model. That probably
> means we&#39;ll have to refactor some of our property access paths, sort
> of like V8, so that we have fast paths that don&#39;t GC and slow paths
> that root everything.
> 
> However, I think those decisions should be guided by profiling. For
> now, I think our goal should be to convert as much stuff to handles
> and rooters as possible inside the JS engine. It&#39;s great that we have
> enough rooters to pass the dynamic rooting analysis, but I don&#39;t think
> we&#39;ve passed the threshold for handle density where people feel like
> the compiler will catch their mistakes. I think that&#39;s where we need
> to be.

I don't understand the point about handle density.

On the get-the-engine working side of things, I think you provided some good 
data, and some good reasons why just slapping handles everywhere doesn't finish 
the job, and "slap handles everywhere + fix" is not likely to go better than 
continuing as you are.

(Still inside the engine), I don't think the maintainability issues have been 
fully worked out yet. Testing is of course good, but it's nicer to have things 
easier to get right in the first place vs. debugging lots of test failures all 
the time. I don't see this as a huge problem, but something to pay attention to 
over time.

At the API boundary, I don't know. I still really don't like the idea of an API 
that returns unsafe pointers with only a note in some documentation somewhere 
that warns you not to use them. About roots vs. scopes, I guess it's hard to 
say without having experience writing client code using the different APIs. I 
do know that the V8 API is successful with embedders, although I imagine that 
some aspects of handle scopes are a common source of problems with the API. I 
also really like the idea of moving toward API compatibility in the long term; 
I don't expect the world will really want two mutually incompatible JS 
embedding APIs.

Using outparams to protect the user from raw pointers in the current JSAPI 
seems like a reasonable step for the short term.

If you want to put the current roots and handles in the API, even for the short 
term, I do really want solid documentation, with easy to understand rules and a 
tutorial (like V8 has; as I've said before, if this can be produced, it would 
be evidence that the API is usable)

And I don't want to bikeshed on stuff if it is not important, but 'Rooted<T>' 
still sounds weird to me. It sounds like a name that would be used in a 
parameter to indicate that something is rooted (but we call that Handle, which 
is fine, and better). 'Root<T>' is better, but something that is more 
suggestive of how to use the thing than a GC implementation term would be 
better yet. Maybe GCPtr<T>? MovablePtr<T>?

Dave
_______________________________________________
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals

Re: [JS-internals] Approach for adapting SpiderMonkey to moving GC

Reply via email to