On Friday, July 13, 2012 1:41:00 PM UTC-7, Dave Mandelin wrote: > Bill sent this out, but it seems not to have reached the newsgroup, so > pasting it here. (By the way, to avoid such things, I just post through > groups.google.com.) > > So Bill said: > > I didn't respond to Dave's email right away because I wasn't sure > how > I felt, and I had a lot of questions about how rooting works in > V8. After looking at it more closely, here's what I think:
Thanks for the detailed analysis. This is exactly the kind of discussion I was hoping to provoke. > 1. I don't think that adopting V8's handle system is a silver bullet. But right here I have to say, hmph--no one ever said it was, so I really don't see what point you're trying to make with that statement. > 2. I think there are multiple dimensions that we can use to compare > V8's way of rooting with SpiderMonkey's way of rooting. I don't > think > either way is strictly better. I think that is also widely agreed on. > 3. I think we're generally moving in the right direction, and the most > important thing is to move forward. I think a lot of the current fear > about the rooters stems from the fact that they're incomplete, and > that makes the whole system look more error-prone than it is. That doesn't really sound like what I think are my major concerns, although 'error-prone' does get at maintainability, which is the biggest one. > I'll address these points in turn. > > ---- V8's handle system ---- > > If we wanted to use V8-style handles in SpiderMonkey, we would have to > add a lot of HandleScopes throughout the JS engine. When you create a > handle, it takes space in the nearest enclosing HandleScope until that > HandleScope is destroyed. So any place we have an unbounded loop in > the JS engine, we would have to make a new HandleScope to ensure that > we didn't use an unbounded amount of memory storing handles. This > seems annoying and error-prone. Especially since each time you add a > HandleScope you have to reason about what handles might escape it. Interesting. > The other issue is performance. As far as I can tell, V8 doesn't use > handles on any of their property access paths. They use handles during > code generation, and in some runtime functions. Everywhere else they > use bare pointers. In some places they assert that no allocation > happens. In other places, they use a complicated scheme of GCing and > then retrying an operation if an allocation fails. Last year Erik Corry told me that they used bare pointers up front because they thought they needed to for perf, but that he later concluded they probably could have used handles more. > In SpiderMonkey, the separation between code that can GC and code that > can't isn't nearly as clear as it is in V8. If we adopted handles > without doing anything else, we would have to add handles to a lot > more places than V8 does, and our performance would certainly suffer. > > I did some benchmarking of our current rooters. As far as I can tell, > almost all the cost of them is on stubs paths like GetElem and > GetProperty. V8 doesn't use handles at all in these places. Even with > the limited set of rooters we have now, some benchmarks get 5-10% > slower with rooting turned on. I looked more closely at two > benchmarks, and in neither case was the slowdown due to duplicate > rooters. This is a good measurement--(assuming it's accurate) it shows that in some places at least you need to avoid creating handles. Stub calls do seem to be a key place we really wouldn't want to use them. > So although we could adopt V8's handle system, it would not be a > matter of just letting the compiler find places where handles are > needed. We would need to add a lot of HandleScopes, and we would have > to split apart some of our property access paths between "code that > won't GC" and "code that might GC", with the idea that the > latter code > should be cold. Got it. Thanks. This is a good explanation of how "the dumb way" is not as dumb as it might seem. > ---- Comparison ---- > > In theory, our system doesn't differ from V8's very much. Our handles > are exactly like theirs. The systems differ in that V8 has > HandleScope and we have Rooted<T>. Our Rooted<T> is basically > just a > HandleScope that always contains a single handle. V8's system is a > little more powerful because it allows multiple handles to be created > within a single scope. > > The only place where this difference seems to matter in when returning > from functions. As long as you're not leaving a HandleScope, V8 can > return a handle. We can never return handles, except in cases like > JSContext::global() where the handle's storage is not on the stack. I think "in theory" is the key word in the discussion above. The questions at the API level are all about usability--how easy is it to understand the API and use it correctly, how much help do tools give you with that, and how convenient is it to use the system in general. > If there is a concern about how to root values returned from the API, > then it would be easy to add a HandleScope notion to our > engine. Functions in jsapi.cpp would return results by finding the > nearest enclosing HandleScope, stashing the return value there, and > returning a handle to it. IIUC, you are talking about using the V8 handle API for the public API, but doing completely different things inside the engine. I like that idea. From your discussion above, I take it that getting rooting right inside the engine just isn't that easy and requires thought, and that no simple model makes it easy. The same would be true for some API users, but in some cases things would be very simple and well served by an API that doesn't return unsafe pointers at all. It also seems worth considering just adding new APIs for things that work that way, taking small steps toward a C++ API. > Alternatively, Brian proposed that we could > use outparams for everything at the API level. That would work too. For the current API, that actually seems pretty good, since most things are already outparams anyway. Outparams are kind of yucky for usability, though, so I wouldn't want to go that way for future APIs. > However, I don't think that the return value issue is that big a > deal. The more important thing is to have a simple rule, like "don't > ever use bare JSObject* for stack variables". You can't have a rule that simple, though. The real rules have to teach people about roots and handles and the safe relations between their lifetimes. And, we all know that no matter how simple the rule, it is easy to accidentally break it, so safeguards are still useful. > If we do that, and if we > manage to eliminate most bare JSObject* uses, then the remaining ones > will start to stick out very clearly and we can fix them easily. > > ---- Moving forward ---- > > I think it's really important that we keep making progress. There > seems to be some fear that we're heading in the wrong direction. In > fact, I think that there's really only one direction we can go: > towards more handles and more rooters. That's too strong a statement. > As Terrence mentioned earlier, I think a more pertinent question is > safety versus performance. If we choose to use rooters for all stack > variables and we continue to run the dynamic rooting analysis, then I > think we'll have a bulletproof system that's safer and less buggy than > V8 or the SpiderMonkey conservative scanning. However, that would > almost certainly be too costly for performance. IIUC, it would not be bulletproof unless you ran the dynamic analysis on a test suite with total coverage. > One thing I noticed about V8 that's nice is that all functions I saw > either root everything, or else they're not allowed to GC. And when > they're not allowed to GC, they assert that no GC happens. I think it > would be great if SpiderMonkey could move to that model. That probably > means we'll have to refactor some of our property access paths, sort > of like V8, so that we have fast paths that don't GC and slow paths > that root everything. > > However, I think those decisions should be guided by profiling. For > now, I think our goal should be to convert as much stuff to handles > and rooters as possible inside the JS engine. It's great that we have > enough rooters to pass the dynamic rooting analysis, but I don't think > we've passed the threshold for handle density where people feel like > the compiler will catch their mistakes. I think that's where we need > to be. I don't understand the point about handle density. On the get-the-engine working side of things, I think you provided some good data, and some good reasons why just slapping handles everywhere doesn't finish the job, and "slap handles everywhere + fix" is not likely to go better than continuing as you are. (Still inside the engine), I don't think the maintainability issues have been fully worked out yet. Testing is of course good, but it's nicer to have things easier to get right in the first place vs. debugging lots of test failures all the time. I don't see this as a huge problem, but something to pay attention to over time. At the API boundary, I don't know. I still really don't like the idea of an API that returns unsafe pointers with only a note in some documentation somewhere that warns you not to use them. About roots vs. scopes, I guess it's hard to say without having experience writing client code using the different APIs. I do know that the V8 API is successful with embedders, although I imagine that some aspects of handle scopes are a common source of problems with the API. I also really like the idea of moving toward API compatibility in the long term; I don't expect the world will really want two mutually incompatible JS embedding APIs. Using outparams to protect the user from raw pointers in the current JSAPI seems like a reasonable step for the short term. If you want to put the current roots and handles in the API, even for the short term, I do really want solid documentation, with easy to understand rules and a tutorial (like V8 has; as I've said before, if this can be produced, it would be evidence that the API is usable) And I don't want to bikeshed on stuff if it is not important, but 'Rooted<T>' still sounds weird to me. It sounds like a name that would be used in a parameter to indicate that something is rooted (but we call that Handle, which is fine, and better). 'Root<T>' is better, but something that is more suggestive of how to use the thing than a GC implementation term would be better yet. Maybe GCPtr<T>? MovablePtr<T>? Dave _______________________________________________ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals