On Thu, Aug 31, 2017 at 5:52 PM, Adam Klein <[email protected]> wrote:
> Hi Caitlin,
>
> Jakob and I just spent some time digging into this, comments inline
> (though we don't have answers to everything).
>
> On Thu, Aug 31, 2017 at 9:01 AM, <[email protected]> wrote:
>
>> Hi,
>>
>> Recently I've been trying out some things to get more out of Promises
>> and async functions.
>>
>> Different people in/around the Node.js project have been writing
>> various benchmarks which show cases where `await` seems to slow things
>> down significantly. One simple example is
>> https://github.com/tc39/proposal-async-iteration/issues/112#issuecomment-324885954.
>> While it's invalid to compare the simple synchronous loop to the one
>> with `await`, it does highlight that in situations like that, the v8
>> implementation can seem very slow, when really it should behave more
>> like the sync loop (~20 times slower is a steeper price than
>> necessary).
>>
>> I drafted an informal document with some ideas for speeding up Await
>> in v8. In general, the solutions fell into 2 categories:
>>
>> 1) Reduce heap use and GC overhead (allocate fewer objects for Await).
>> 2) Avoid JS->C and C->JS transitions where possible (mainly by
>>    translating Isolate::RunMicrotasksInternal() and
>>    Isolate::EnqueueMicrotask() into code stubs). This generally makes
>>    JS-defined microtasks (for Promises and Await) much faster, but may
>>    cause DOM-defined microtasks to slow down a bit (unclear at this
>>    time). I expect Promises and Await to be used more frequently in
>>    tight loops, and DOM microtasks don't affect Node.js at all, so
>>    this may be worth going after.
>>
>> The first approach did not make much of a dent in any benchmarks; more
>> usefully, profiles of actual applications did not show `await` to be a
>> bottleneck at all. Reducing overall memory use seems like a good thing
>> in general, however.
>>
>> The second approach yielded a significant improvement (~60% over 10
>> runs) on the simple benchmark (in a very simple prototype
>> implementation with some limitations discussed below).
>>
>> There are some constraints WRT implementing RunMicrotasks in JIT'd
>> code. In particular, it must be possible to perform RunMicrotasks()
>> when no context has been entered. I've tried a few things to work
>> around this:
>>
>> Initially, I wrote the stub with JS linkage and used the typical
>> JSEntryStub to invoke it. This is partly wasteful, and partly
>> problematic. There need not be a separate JSFunction for RunMicrotasks
>> in each context. More importantly, the function ought not to be
>> associated with a context at all, given the constraint that it must be
>> possible to invoke it without a context having been entered.
>
> From looking at the JSEntryStub codepath for JSFunctions, it appears to
> us that for a function marked as native, strict, or both (which seems
> appropriate in this case) there shouldn't be any need for a Context. So
> it seems like you could unblock your prototype by creating a single
> JSFunction (as a strong root on the Heap) which wraps the builtin, and
> call through that from the C++ API. If you already tried something like
> this and ran into trouble it'd be interesting to hear what went wrong.
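To make that concrete, the shape we have in mind is roughly the
following (a sketch only: the RunMicrotasks builtin is exactly what
your prototype adds, the strong-root slot and its accessor are
hypothetical, and the factory signatures are approximate):

    // One-time setup, e.g. during heap initialization: wrap the
    // (prototype) RunMicrotasks builtin in a single JSFunction marked
    // native + strict, so the JSEntryStub path shouldn't need a
    // Context, and keep it alive via a new strong root slot.
    Handle<Code> code(
        isolate->builtins()->builtin(Builtins::kRunMicrotasks), isolate);
    Handle<SharedFunctionInfo> shared =
        isolate->factory()->NewSharedFunctionInfo(
            isolate->factory()->empty_string(), code,
            false /* is_constructor */);
    shared->set_native(true);
    shared->set_language_mode(STRICT);
    // ... create a JSFunction from |shared| without tying it to a
    // particular native context, and store it in the (hypothetical)
    // new root slot ...

    // Later, from the C++ API, with no context having been entered:
    Execution::Call(isolate, heap->run_microtasks_function(),
                    isolate->factory()->undefined_value(), 0, nullptr);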
>
>> A second approach involved creating new TF operators to initialize
>> the roots register (the main manifestation of problems when not using
>> the JSEntryStub was that the roots register was not initialized,
>> leading to access violations when using heap constants). I didn't
>> spend much time with this, because I felt it was more important to
>> make sure callee-saved registers were restored properly, even though
>> there wasn't much going on in the sole caller of the function. I
>> thought it might be interesting to produce more general operators
>> which would handle entry and exit for stubs that need to be invoked
>> from C, but that seemed like a lot of work and I haven't gotten
>> around to it yet.
>>
>> Finally, I tried adding a new variant of JSEntryStub which calls the
>> RunMicrotasks stub rather than the various entry trampolines. At the
>> moment it's mostly in working order, but it's possible there are
>> still problems with StackFrameIteration and exception handling.
>
> These approaches seem too involved just for this one case; I'd prefer
> the JSFunction approach above if it works.
>
>> Another limitation: previously, SaveContexts (which seem to matter to
>> the debugger and API in some way, though I haven't really looked at
>> why yet) were not set up when calling API-defined microtask
>> callbacks. In my prototype, I always set up the SaveContext before
>> entering the RunMicrotasks stub. It's still unclear whether this
>> breaks anything, or whether it would be possible (or even a good
>> idea) to mimic the old behaviour in the stub rather than always
>> pushing the SaveContext. This is a subtle difference, but as noted it
>> could have some bad effects.
>
> Still digging into this. It appears I may have inadvertently removed
> the regression test for this code in
> https://codereview.chromium.org/1909433003 when I removed support for
> Object.observe, but it should be possible to adapt the regression test
> for Promises (see https://codereview.chromium.org/332923003 for the
> test). I'm going to try restoring the test and playing around with
> this code in the current C++ version to see if I can get a better
> handle on it. But from an initial reading, it really shouldn't make a
> difference for the C++ callback case anyway.

Looking more deeply at this (and at
https://bugs.chromium.org/p/chromium/issues/detail?id=385349, the bug
for that regression test), I'm not convinced this is actually necessary
anymore. +yangguo in case he has a less fuzzy memory than me (this was
all >3 years ago).

>> Finally, a somewhat strange behaviour of the stub is that it enters
>> contexts by itself when it needs to, inlining
>> HandleScopeImplementer::EnterMicrotaskContext and
>> LeaveMicrotaskContext(), and overwriting Isolate::context(). I
>> believe this is done in a valid way in the prototype, but it's not
>> something that comes up in other stubs, so there isn't really any
>> other code to model it on.
>
> This seems fine to me; it's rather special behavior even in its C++
> form.
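For concreteness, the C++ shape the stub is inlining for JS-defined
microtasks is roughly the following (simplified from today's
Isolate::RunMicrotasksInternal(); not a verbatim excerpt):

    // Per JS-defined microtask, the current C++ loop does roughly:
    SaveContext save(this);  // restores the previous context on exit
    Handle<JSFunction> fn = Handle<JSFunction>::cast(microtask);
    Handle<Context> context(fn->context(), this);
    set_context(context->native_context());
    handle_scope_implementer()->EnterMicrotaskContext(context);
    MaybeHandle<Object> maybe_exception;
    MaybeHandle<Object> result =
        Execution::TryCall(this, fn, factory()->undefined_value(), 0,
                           nullptr, &maybe_exception);
    handle_scope_implementer()->LeaveMicrotaskContext();
    // API-defined (CallHandlerInfo) callbacks, by contrast, are
    // invoked directly, without the SaveContext -- which is the subtle
    // difference the prototype changes by always pushing one.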
>
>> ---
>>
>> I was wondering if anyone thought reducing the C++->JS->C++ overhead
>> in RunMicrotasks for that 60% boost in certain very simple and
>> unrepresentative-of-real-code benchmarks might be worth doing
>> properly and upstreaming? While it's unclear what the impact would be
>> on real-world code, it seems like a reasonable expectation that you'd
>> see some kind of significant benefit (though perhaps not on the order
>> of 60% as in the very simple benchmark mentioned above).
>>
>> If (in the opinion of the v8 team) it might be worth my time to try
>> to upstream this, I'd love some feedback on the approaches taken to
>> address the problems listed above, and to get an idea of what sort of
>> approach you'd all be happiest with.
>
> If we can pin down the answers to all the stuff above to our
> satisfaction, then yes, my inclination is that this is a worthwhile
> thing to do: the code may be a bit verbose (what with having to deal
> with the different sorts of things stored in the queue), but it's at
> least relatively straightforward.
>
> - Adam
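One more note on the verbosity point: the queue mixes several kinds of
entries, so the stub ends up with a per-entry type dispatch. Roughly
(pseudocode for the shape of the stub's loop, not the actual
prototype):

    // Shape of the dispatch the stub needs for each queue entry
    // (instance-type checks and calling conventions elided):
    for (int i = 0; i < num_tasks; i++) {
      Handle<Object> microtask(queue->get(i), isolate);
      if (microtask->IsJSFunction()) {
        // Enter the function's context, call it, leave the context.
      } else if (microtask->IsCallHandlerInfo()) {
        // API-defined callback: bounce out to C++ to invoke it.
      } else {
        // Promise job info objects (reaction and resolve-thenable
        // jobs): enter the recorded context and run the job.
      }
    }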
