Re: [v8-users] Re: De-optimization of hot function when constructor adds methods directly to object

Vyacheslav Egorov Fri, 20 Jul 2012 04:05:26 -0700

> If there are only 25-35 allowable properties in a klass, you can
> potentially make a really fast check for this.


Yep, I know. You are describing exactly what I described above, just
in different words :-) It's an old and well-known way to implement
inheritance checks in single-inheritance languages (Oberon compilers,
at least, used it back in the 80s).
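
A minimal sketch of that check, as a toy model in plain JavaScript rather
than V8's actual data structures. ToyMap, path, depth, extend and
isCompatible are hypothetical names used only for illustration. Each map
records its transition path, so compatibility with an expected map comes
down to one bounds check plus one comparison at a fixed index:

```javascript
// Toy model of hidden classes ("maps"); not V8's real implementation.
class ToyMap {
  constructor(parent, property) {
    this.property = property;
    // Transition path: every map from the root down to this one.
    this.path = parent ? parent.path.concat(this) : [this];
    // Index of this map inside its own transition path.
    this.depth = this.path.length - 1;
  }

  // Map reached by adding one more property to this map.
  extend(property) {
    return new ToyMap(this, property);
  }
}

// True if `map` is `expected` itself or was reached from it by adding
// properties: a bounds check, then a pointer comparison at a fixed index.
function isCompatible(map, expected) {
  return expected.depth < map.path.length &&
         map.path[expected.depth] === expected;
}

const root = new ToyMap(null, null);
const mapA = root.extend('a');
const mapAB = mapA.extend('b');   // what the hot loop was optimized for
const mapABC = mapAB.extend('c'); // same object after adding c
const mapAX = mapA.extend('x');   // unrelated branch of the transition tree

console.log(isCompatible(mapABC, mapAB)); // true
console.log(isCompatible(mapAX, mapAB));  // false
console.log(isCompatible(mapA, mapAB));   // false
```

The cost shows up in the model too: every map carries a copy of its path,
which is the memory overhead mentioned in the quoted message below.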

--
Vyacheslav Egorov


On Fri, Jul 20, 2012 at 12:41 AM, jMerliN <[email protected]> wrote:
>> This will be great but there is no easy way to check that two hidden
>> classes are compatible. Hidden classes are currently compared by
>> pointer equivalence, which boils down to two instructions (compare and
>> jump). Checking for inheritance would lead to pretty complicated
>> code. The most efficient way, it seems, to implement such a check is
>> to record the transition path in every map and then check whether a fixed
>> position in the transition path is equal to a fixed map. This is much more
>> complex and I am not sure it benefits any real world code.
>
> I'll try to find a good real-world example of where this causes
> violent deopts from common practices.  I've seen it done quite a few
> times.
>
> If there are only 25-35 allowable properties in a klass, you can
> potentially make a really fast check for this.  If you store pointers
> to the klasses in a contiguous array such that higher indices are
> always superklass pointers of lower indices (regardless of
> transition), you can determine compatibility with 2 cmps (one compat,
> one bounds checking).  You could still do the normal cmp/jmp into
> optimized code, but if the cmp fails (not equal), you can do 2 more
> cmps (if > optimized-for-klass and < end of block) to determine if
> this is a parent klass, and if so you can jmp to the optimized code
> and only if those cmps fail do you deoptimize.
>
> The downside is that the generated optimized code would need to
> dereference once just to get the klass pointer, adding an extra few
> cycles to each optimized IC.  Though I suppose you could move
> that code out: do the actual klass pointer equivalence cmp first, and if
> that fails, go back to this block and do a bounds check; if it's a parent,
> jmp into the optimized code keeping the klass pointer.  That
> pushes the extra work into the case where the klass pointers aren't
> equivalent but are compatible (which should be rare).  Storing those
> compat blocks would add a memory overhead and the non-monomorphic
> check can potentially prevent a deoptimization with a few more
> instructions.  It shouldn't reduce performance, though.
>
> You could also potentially partition such a compat block structure as
> to minimize the number of pointers needed to do a reasonable job at
> guarding against deoptimization from extended objects.
>
> On Jul 19, 1:00 pm, Vyacheslav Egorov <[email protected]> wrote:
>> Knowing that you are running it in node.js, I can confirm that there is
>> indeed a difference between the test and test2 properties. The reason is
>> that we don't convert test to a CONSTANT_FUNCTION if the object literal is
>> not in global scope. This heuristic is based on the assumption
>> that top-level code is executed once and non-top-level code many times
>> (so the object literal would get a different map every time):
>> https://github.com/v8/v8/blob/master/src/parser.cc#L4272-4279. In the
>> past we would not make test2 a CONSTANT_FUNCTION either, because we
>> required the function to be in old space. I think we might want to change
>> this to make it consistent, and I've filed a bug
>> (https://code.google.com/p/v8/issues/detail?id=2246). node.js wraps module
>> bodies in an anonymous function --- that is why the slowdown is not
>> reproducible in Chrome or the d8 shell:
>>
>> (function () {
>> var z = {test: function () {}};
>> z.test2 = function () {};
>> function foo(z) {
>>   var i;
>>   console.time('test speed');
>>   for (i = 0; i < 10000000; i++) z.test();
>>   console.timeEnd('test speed');
>>   console.time('test2 speed');
>>   for (i = 0; i < 10000000; i++) z.test2();
>>   console.timeEnd('test2 speed');
>>
>> }
>>
>> foo(z);
>> foo(z);
>>
>> })();
>> > The real issue in my example is that test is per-
>> > object and runTest is static, if runTest was assigned via this., it
>> > should only ever see one hidden class, unless you do something evil
>> > like .apply.
>>
>> This will not help because type-feedback is currently shared between
>> all instances of the same function literal: V8 mostly gets
>> type-feedback from IC-stubs that are referenced by inline-caches in
>> unoptimized code, and the unoptimized code object is the same for any
>> closure created from the same function literal.
>>
>> > On a related note, has there been any consideration for making v8 not
>> > de-optimize when a hidden class is ancestral to another (and therefore
>> > compatible)?
>>
>> This will be great but there is no easy way to check that two hidden
>> classes are compatible. Hidden classes are currently compared by
>> pointer equivalence, which boils down to two instructions (compare and
>> jump). Checking for inheritance would lead to pretty complicated
>> code. The most efficient way, it seems, to implement such a check is
>> to record the transition path in every map and then check whether a fixed
>> position in the transition path is equal to a fixed map. This is much more
>> complex and I am not sure it benefits any real world code.
>>
>> --
>> Vyacheslav Egorov
>>
>> On Jul 19, 8:03 pm, jMerliN <[email protected]> wrote:
>>
>> > Vyacheslav,
>>
>> > When I run the code you posted, I see a much bigger discrepancy
>> > between test/test2 in the first pass and a slight reduction in test's
>> > time but still a large discrepancy the second pass (indicating OSR
>> > happened during the first loop the first time around), similar to what
>> > I was seeing yesterday.  But that's running on Node.js, and I haven't
>> > re-built Node.js against the latest stable v8 code, but that issue is
>> > completely gone in the current nightly Canary build.
>>
>> > I think I better understand the method issue now.  V8 actually treats
>> > methods set on this. differently than other properties; the assembly
>> > generated looks aggressively inlined.  If you cheat and set this.test
>> > to a number then to the method, it effectively disables those
>> > optimizations in V8 and you end up treating the object as a normal
>> > object, and even though it doesn't cause deoptimizations (all objects
>> > have the same hidden class), it's significantly slower than the
>> > inlined method call.  The real issue in my example is that test is per-
>> > object and runTest is static, if runTest was assigned via this., it
>> > should only ever see one hidden class, unless you do something evil
>> > like .apply.
>>
>> > Though this test seems to indicate that this only occurs when building
>> > the hidden class:  http://pastebin.com/JbuLaEUt
>>
>> > Even though it never deoptimizes, I'd expect each of those to have
>> > similar performance, but only the first Foobar created is performant.
>>
>> > On a related note, has there been any consideration for making v8 not
>> > de-optimize when a hidden class is ancestral to another (and therefore
>> > compatible)?  I mean if you have {a: 7, b: 7} and you have a really
>> > hot loop that only touches a and b, then you add a c property, because
>> > it was transitioned from the proper hidden class for that hot loop to
>> > a superclass of it (with the same indices in its property access
>> > table), that hot function can assume it's the {a, b} hidden class.
>> > This is similar to how classical inheritance works (Foo extends Bar,
>> > functions that operate on Bar can also operate on Foo), but in this
>> > case a hidden class transition is a strict superset, which lets you
>> > make really nice assumptions.
>>
>> > On Jul 19, 2:27 am, Vyacheslav Egorov <[email protected]> wrote:
>>
>> > > Hi Justin,
>>
>> > > V8's hidden classes are not limited to tracking fields you assign to
>> > > an object, V8 also tries to capture methods you assign (just like in
>> > > any object-oriented language classes capture both data and behavior).
>>
>> > > That is why the first and second objects produced by Foobar will have
>> > > different hidden classes --- they have different methods.
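
To illustrate the point about methods at the language level, here is a
small sketch. PerInstance and Shared are hypothetical constructors, not the
code from the pastebin. A function expression inside a constructor creates
a fresh closure on every call, so two instances hold different method
objects, while a prototype method is a single shared object:

```javascript
// Method assigned inside the constructor: a new closure per instance.
function PerInstance() {
  this.test = function () {};
}

// Method assigned on the prototype: one function shared by all instances.
function Shared() {}
Shared.prototype.test = function () {};

var a = new PerInstance(), b = new PerInstance();
var c = new Shared(), d = new Shared();

console.log(a.test === b.test); // false
console.log(c.test === d.test); // true
```

How an engine folds this into hidden classes is an implementation detail;
the sketch only shows that the two instances really do carry distinct
methods, which is what the explanation above relies on.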
>>
>> > > As to your second question: they are not treated differently. If you
>> > > rewrite your test like this:
>>
>> > > var z = {test: function () {}};
>> > > z.test2 = function () {};
>>
>> > > function foo(z) {
>> > >   var i;
>> > >   console.time('test speed');
>> > >   for (i = 0; i < 10000000; i++) z.test();
>> > >   console.timeEnd('test speed');
>> > >   console.time('test2 speed');
>> > >   for (i = 0; i < 10000000; i++) z.test2();
>> > >   console.timeEnd('test2 speed');
>>
>> > > }
>>
>> > > foo(z);
>> > > foo(z);
>>
>> > > You will see something like:
>>
>> > > test speed: 38ms
>> > > test2 speed: 12ms
>> > > test speed: 11ms
>> > > test2 speed: 11ms
>>
>> > > The truth is that V8 optimizes the code while the first loop is still
>> > > _running_ (this is called On-Stack Replacement, aka OSR). So the first
>> > > "test speed" measurement contains the sum of time spent in unoptimized
>> > > code, in the compiler, and in optimized code, while the first "test2
>> > > speed" measurement is purely time spent in optimized code. If you call
>> > > the same code a second time, you see timing results purely for
>> > > optimized code. This is why benchmarks should always contain a warm-up
>> > > phase to let the optimizing JIT kick in.
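
A rough sketch of a benchmark with such a warm-up phase. timeAfterWarmup
is a hypothetical helper, and the run counts are arbitrary assumptions
rather than values tuned to any V8 threshold:

```javascript
// Hypothetical helper: run fn a few times untimed, then time it.
function timeAfterWarmup(label, fn, warmupRuns, timedRuns) {
  var i;
  // Warm-up phase: gives the optimizing compiler a chance to run
  // before any measurement starts.
  for (i = 0; i < warmupRuns; i++) fn();

  var start = Date.now();
  for (i = 0; i < timedRuns; i++) fn();
  var elapsed = Date.now() - start;

  console.log(label + ': ' + elapsed + 'ms');
  return elapsed;
}

var z = {test: function () {}};
z.test2 = function () {};

function callTest() { for (var i = 0; i < 1000000; i++) z.test(); }
function callTest2() { for (var i = 0; i < 1000000; i++) z.test2(); }

var testMs = timeAfterWarmup('test speed', callTest, 5, 5);
var test2Ms = timeAfterWarmup('test2 speed', callTest2, 5, 5);
```

Both measurements are taken after each loop has already run several
times, so compilation time should no longer be mixed into either number.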
>>
>> > > Hope this explains it.
>>
>> > > --
>> > > Vyacheslav Egorov
>>
>> > > On Thu, Jul 19, 2012 at 3:04 AM, jMerliN <[email protected]> wrote:
>> > > > So I can't get my head around why this happens (I haven't dug through
>> > > > v8's code to try to figure it out either), but this is really
>> > > > inconsistent to me with how v8 constructs hidden classes in general.
>> > > > The following is running in Node.js v0.8.2 (V8 v3.11.10.12).
>>
>> > > > Here's the code:
>> > > > http://pastebin.com/2gKWrfHp
>>
>> > > > Here's the output, and the deopt trace:
>> > > > http://pastebin.com/WerQuGLZ
>>
>> > > > Calling Foo.prototype.runTest with any Foo object results in similar
>> > > > performance (unless you change the hidden class, as expected).  Bar
>> > > > expectedly deoptimizes because abc is stored on the proto and isn't
>> > > > actually on the constructed object until the first call, causing the
>> > > > optimized function (once it gets hot, which is after the object has
>> > > > changed hidden class) to bailout on the next attempt with a new Bar
>> > > > object.
>>
>> > > > It gets weird with Foobar.  test is added directly to the object, the
>> > > > only difference is that this is a function, not a primitive, but it
>> > > > seems like the hidden classes of objects from Foobar's constructor
>> > > > should be the same.  The first run is performant, equivalent to Foo
>> > > > (expected).  Though running the test again with a new Foobar
>> > > > deoptimizes it.  I can't at all understand why.
>>
>> > > > Thanks,
>> > > > Justin
>>
>> > > > --
>> > > > v8-users mailing list
>> > > > [email protected]
>> > > > http://groups.google.com/group/v8-users
