That's an excellent answer, Seth.

On Fri, May 20, 2022 at 10:07 PM '[email protected]' via v8-dev <
[email protected]> wrote:

> Hi Conrad,
>
> I'll make an attempt at answering, though I'm not an expert on OSR, so
> others like Jakob or Mythri may have more precise answers.
>
> 1. Why would this property access be polymorphic?
>
> If you're talking about polymorphic access, you're probably familiar with
> hidden classes (also called "object shapes" or "Maps but not *those* Maps
> <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map>").
> Regardless, I'll include a link: the most complete and accurate description
> I've found of how hidden classes work in V8 is Fast properties in V8
> <https://v8.dev/blog/fast-properties>.
>
> In this example, the object {a: 0} has a hidden class which says it has
> one property named "a". The object {a: 0, b:1} has a different hidden class
> which says it has two properties, named "a" and "b". After the function
> getA has performed a lookup to get property "a" from {a: 0}, it keeps a
> pointer to that object's hidden class and another value indicating where
> the property can be found (for this case, the first in-object property
> slot). When running getA({a:0, b:1}), V8 checks whether the object {a:0,
> b:1} has the same hidden class as {a: 0}, which it does not. So V8
> remembers a second pair of values: the hidden class for {a:0, b:1} and
> where its "a" property can be found (also the first in-object property
> slot). Now the feedback state for that load operation is polymorphic. The
> "mono" in monomorphic refers to a single hidden class, not to a single
> result of where the property can be found.
>
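To make the bookkeeping described above concrete, here is a toy model in plain JavaScript. It is only a sketch: `LoadFeedback` and `shapeOf` are made-up names, and the "shape" is approximated by the ordered list of own keys, which is far simpler than a real V8 hidden class. It shows that the state depends on how many shapes were seen, not on how many distinct slots the property ended up in.

```javascript
// Toy stand-in for a hidden class: the ordered list of own property names.
// Real V8 hidden classes (Maps) track more than this; this is an assumption
// made purely for illustration.
function shapeOf(obj) {
  return Object.keys(obj).join(",");
}

// Hypothetical model of the feedback collected by one property-load site.
class LoadFeedback {
  constructor(name) {
    this.name = name;
    this.entries = new Map(); // shape -> slot index of `name` in that shape
  }

  load(obj) {
    const shape = shapeOf(obj);
    if (!this.entries.has(shape)) {
      // Cache miss: remember the new shape and where the property lives.
      this.entries.set(shape, Object.keys(obj).indexOf(this.name));
    }
    return obj[this.name];
  }

  get state() {
    const seen = this.entries.size;
    if (seen === 0) return "uninitialized";
    return seen === 1 ? "monomorphic" : "polymorphic";
  }
}

const feedback = new LoadFeedback("a");
feedback.load({ a: 0 });
const afterFirst = feedback.state;
feedback.load({ a: 0, b: 1 });
const afterSecond = feedback.state;
const slots = [...feedback.entries.values()];

console.log(afterFirst, afterSecond, slots);
```

Both entries record slot 0 for "a", yet the site is no longer monomorphic, because two different shapes had to be remembered.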
> This behavior tends to be particularly problematic in codebases with a lot
> of class inheritance. Instances of each subclass get their own hidden
> class, so loading a field defined by a base class is often a megamorphic
> operation, even if that base class's constructor always sets the same
> properties in the same order.
>
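A small sketch of the inheritance case, with invented class names. Every instance gets `id` assigned first by the base constructor, but each subclass has its own prototype and adds its own fields, so the single load in `getId` sees a different hidden class per subclass. As far as I know, V8 keeps at most four hidden classes per load site before treating it as megamorphic, so five subclasses are used here; treat that number as an assumption rather than a guarantee.

```javascript
// Hypothetical class hierarchy, for illustration only.
class Base {
  constructor(id) {
    this.id = id; // always the first property assigned
  }
}
class Circle extends Base { constructor(id) { super(id); this.radius = 1; } }
class Square extends Base { constructor(id) { super(id); this.side = 1; } }
class Label extends Base { constructor(id) { super(id); this.text = ""; } }
class Line extends Base { constructor(id) { super(id); this.length = 1; } }
class Group extends Base { constructor(id) { super(id); this.children = []; } }

// One load site shared by every subclass.
function getId(node) {
  return node.id;
}

const nodes = [new Circle(1), new Square(2), new Label(3), new Line(4), new Group(5)];
const ids = nodes.map(getId);
console.log(ids);
```

The results are identical to what a monomorphic version would return; only the feedback that V8 records for `node.id` differs.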
> 2. Why would polymorphic code optimized by TurboFan be a full 3x slower
> than unoptimized bytecode?
>
> It seems that you may be misunderstanding the somewhat cryptic output from
> --trace-opt. In particular, OSR means on-stack replacement
> <https://v8.dev/blog/v8-release-79#osr-caching>. Copying some text from
> that link:
>
> *When V8 identifies that certain functions are hot it marks them for
> optimization on the next call. When the function executes again, V8
> compiles the function using the optimizing compiler and starts using the
> optimized code from the subsequent call. However, for functions with long
> running loops this is not sufficient. V8 uses a technique called on-stack
> replacement (OSR) to install optimized code for the currently executing
> function. This allows us to start using the optimized code during the first
> execution of the function, while it is stuck in a hot loop.*
>
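As a minimal illustration of the situation the quoted passage describes, the sketch below calls a function exactly once, so there is no "next call" on which optimized code could be picked up. `sumTo` and the loop bound are invented for this example. Running it with `--trace-opt` should, I would expect, report an OSR compilation, though the exact output depends on the V8 version.

```javascript
// A single call containing a long-running loop: the only way for optimized
// code to help here is to replace the running frame (OSR).
function sumTo(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}

console.time("single call");
const result = sumTo(1e7);
console.timeEnd("single call");
console.log(result);
```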
> Iterating through an array of 100 million items certainly counts as a "hot
> loop", so the vast majority of the time in *all* of your measurements is
> spent in optimized code produced by Turbofan, not in the interpreter. You
> can try running unoptimized code by passing the command-line flag --no-opt,
> which I expect will go much more slowly than what you've measured thus far.
> I've added some possibly more human-readable annotations to the output you
> provided:
>
> # This call started in the interpreter, but was replaced by optimized code
> # while running (this process is referred to as OSR).
> [compiling method 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) using TurboFan OSR]
> [optimizing 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) - took 0.000, 0.541, 0.000 ms]
> array1: 115.701ms
>
> # This call reused the OSR code from the first call.
> [found optimized code for 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) at OSR bytecode offset 35]
> array1: 113.721ms
>
> # At this point, the function got compiled normally (not using OSR), so
> # future calls will use this optimized code.
> [compiling method 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) using TurboFan]
> [optimizing 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) - took 0.000, 0.500, 0.041 ms]
>
> # These three calls used fully optimized code.
> array1: 80.069ms
> array1: 79.72ms
> array1: 79.245ms
>
> # This call mostly used optimized code, until it bailed out to the
> # interpreter for the last four items.
> array2: 78.906ms
>
> # This call reused the OSR code from the very first call. This is
> # surprising to me; I didn't realize that the OSR code was still available
> # at this point, after the non-OSR version of the function has bailed out.
> # However, it seems to work nicely in this case. Once again, it bailed out
> # to the interpreter for the last four items.
> [found optimized code for 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) at OSR bytecode offset 35]
> array2: 112.758ms
>
> # At this point, the function got compiled normally again, so future calls
> # will use this optimized code.
> [compiling method 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) using TurboFan]
> [optimizing 0x3b21df5b9ad9 <JSFunction sum (sfi = 0x10c4d4712831)> (target TURBOFAN) - took 0.000, 0.500, 0.042 ms]
>
> # These three calls used that newly compiled version of the code, which
> # uses a megamorphic load.
> array2: 350.273ms
> array2: 351.822ms
> array2: 357.311ms
>
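For anyone who wants to reproduce a trace like the one above, here is a rough sketch of a benchmark with the same outline. It is not the code from the gist: the array contents are guessed from the annotations (one shape throughout `array1`, plus four differently shaped items at the end of `array2`), the helper `timeRuns` is invented, and the array is much smaller than the 100 million items discussed in the thread so that it runs quickly. Timings will differ by machine and V8 version.

```javascript
// The function under test: one property load per element.
function sum(items) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i].a;
  }
  return total;
}

const N = 1e6; // assumption: scaled down from the thread's 100 million

// Every element has the same shape.
const array1 = Array.from({ length: N }, () => ({ a: 1 }));

// Same elements plus four trailing items, each with a different shape
// (an assumption based on the "last four items" remark in the annotations).
const array2 = array1.concat([
  { a: 1, b: 1 },
  { a: 1, c: 1 },
  { a: 1, d: 1 },
  { a: 1, e: 1 },
]);

// Hypothetical helper: time several runs under the same label.
function timeRuns(label, items, runs = 5) {
  let result = 0;
  for (let r = 0; r < runs; r++) {
    console.time(label);
    result = sum(items);
    console.timeEnd(label);
  }
  return result;
}

const total1 = timeRuns("array1", array1);
const total2 = timeRuns("array2", array2);
console.log(total1, total2);
```

Running the file with `--trace-opt` prints the compilation events next to the timings, and `--no-opt` gives the unoptimized baseline mentioned earlier in this message.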
> In closing, I'll just echo Ryan: "JS perf is extremely hard to reason
> about".
>
> Best,
> Seth
> On Friday, May 20, 2022 at 10:11:37 AM UTC-7 [email protected] wrote:
>
>> I'm looking at a perf example shared by Ryan Cavanaugh of TypeScript, and
>> I'm very much failing to understand what is happening and why. The
>> particular contradictions upset my entire mental model of how to write
>> performant JavaScript. What's going on internally?
>>
>> Here is the example:
>> https://gist.github.com/conartist6/642dcfbd6fa444da92f211bcb405692b
>>
>> The two specific things I don't understand are:
>>
>> 1. If I have this code:
>>
>> ```js
>> function getA(o) {
>>   // Why would this property access be polymorphic?
>>   // Isn't the offset for the `a` property always the same?
>>   return o.a;
>> }
>> getA({ a: 0 })
>> getA({ a: 0, b: 1 })
>> ```
>>
>> 2. Why would polymorphic code optimized by TurboFan be a full 3x slower
>> than unoptimized bytecode?
>>
> --
> v8-dev mailing list
> [email protected]
> http://groups.google.com/group/v8-dev
> ---
> You received this message because you are subscribed to the Google Groups
> "v8-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/v8-dev/6857a76a-8c97-445d-9f92-413cc9f99c67n%40googlegroups.com
> <https://groups.google.com/d/msgid/v8-dev/6857a76a-8c97-445d-9f92-413cc9f99c67n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
