> If there are only 25-35 allowable properties in a klass, you can
> potentially make a really fast check for this.
Yep, I know. You are describing exactly what I described above, just
in different words :-) It's an old and well-known way to implement
inheritance checks in single-inheritance languages (at least Oberon
compilers used it back in the 80s).

--
Vyacheslav Egorov

On Fri, Jul 20, 2012 at 12:41 AM, jMerliN <[email protected]> wrote:
>> This will be great but there is no easy way to check that two hidden
>> classes are compatible. Hidden classes are currently compared by
>> pointer equivalence, which boils down to two instructions (compare and
>> jump). Checking for inheritance would lead to pretty complicated
>> code. The most efficient way, it seems, to implement such a check is
>> to record transition path in every map and then check if a fixed
>> position in transition path is equal to a fixed map. This is much more
>> complex and I am not sure it benefits any real world code.
>
> I'll try to find a good real-world example of where this causes
> violent deopts from common practices. I've seen it done quite a few
> times.
>
> If there are only 25-35 allowable properties in a klass, you can
> potentially make a really fast check for this. If you store pointers
> to the klasses in a contiguous array such that higher indices are
> always superklass pointers of lower indices (regardless of
> transition), you can determine compatibility with 2 cmps (one compat,
> one bounds checking). You could still do the normal cmp/jmp into
> optimized code, but if the cmp fails (not equal), you can do 2 more
> cmps (if > optimized-for-klass and < end of block) to determine if
> this is a parent klass, and if so you can jmp to the optimized code
> and only if those cmps fail do you deoptimize.
>
> The downside is that the generated optimized code would need to
> dereference once just to get the klass pointer, adding an extra few
> cycles to each optimized IC.
> Though I suppose you could move
> that code out and do the actual klass pointer equiv cmp, if that fails
> then go back to this block and do a bounds check, and if it's a parent
> then jmp into the optimized code keeping the klass pointer, which
> pushes the extra work into the case that the klass pointers aren't
> equivalent but are compatible (which should be rare). Storing those
> compat blocks would add a memory overhead and the non-monomorphic
> check can potentially prevent a deoptimization with a few more
> instructions. It shouldn't reduce performance, though.
>
> You could also potentially partition such a compat block structure so
> as to minimize the number of pointers needed to do a reasonable job at
> guarding against deoptimization from extended objects.
>
> On Jul 19, 1:00 pm, Vyacheslav Egorov <[email protected]> wrote:
>> Knowing that you are running it in node.js I can confirm that there is
>> indeed a difference between test/test2 properties. The reason is we
>> don't convert test to a CONSTANT_FUNCTION if object literal is not in
>> global scope. This is a heuristic that was based on the assumption
>> that top level code is executed once and non-top-level many times
>> (thus every time object literal will have a different
>> map): https://github.com/v8/v8/blob/master/src/parser.cc#L4272-4279. In the
>> past we would not make test2 a CONSTANT_FUNCTION either because we
>> required function to be in old space. I think we might want to change
>> this to make it consistent and I've filed a bug
>> (https://code.google.com/p/v8/issues/detail?id=2246).
>> node.js wraps module bodies in an anonymous function --- that is why
>> the slowdown is not reproducible in Chrome or the d8 shell:
>>
>> (function () {
>>   var z = {test: function () {}};
>>   z.test2 = function () {};
>>   function foo(z) {
>>     var i;
>>     console.time('test speed');
>>     for (i = 0; i < 10000000; i++) z.test();
>>     console.timeEnd('test speed');
>>     console.time('test2 speed');
>>     for (i = 0; i < 10000000; i++) z.test2();
>>     console.timeEnd('test2 speed');
>>   }
>>
>>   foo(z);
>>   foo(z);
>> })();
>>
>> > The real issue in my example is that test is per-object
>> > and runTest is static, if runTest was assigned via this., it
>> > should only ever see one hidden class, unless you do something evil
>> > like .apply.
>>
>> This will not help because type-feedback is currently shared between
>> all instances of the same function literal: V8 mostly gets
>> type-feedback from IC-stubs that are referenced by inline-caches in
>> unoptimized code and unoptimized code object is the same for any
>> closure created from the same function literal.
>>
>> > On a related note, has there been any consideration for making v8 not
>> > de-optimize when a hidden class is ancestral to another (and therefore
>> > compatible)?
>>
>> This will be great but there is no easy way to check that two hidden
>> classes are compatible. Hidden classes are currently compared by
>> pointer equivalence, which boils down to two instructions (compare and
>> jump). Checking for inheritance would lead to pretty complicated
>> code. The most efficient way, it seems, to implement such a check is
>> to record transition path in every map and then check if a fixed
>> position in transition path is equal to a fixed map. This is much more
>> complex and I am not sure it benefits any real world code.
>>
>> --
>> Vyacheslav Egorov
>>
>> On Jul 19, 8:03 pm, jMerliN <[email protected]> wrote:
>>
>> > Vyacheslav,
>>
>> > When I run the code you posted, I see a much bigger discrepancy
>> > between test/test2 in the first pass and a slight reduction in test's
>> > time but still a large discrepancy the second pass (indicating OSR
>> > happened during the first loop the first time around), similar to what
>> > I was seeing yesterday. But that's running on Node.js, and I haven't
>> > re-built Node.js against the latest stable v8 code, but that issue is
>> > completely gone in the current nightly Canary build.
>>
>> > I think I better understand the method issue now. V8 actually treats
>> > methods set on this. differently than other properties, the assembly
>> > generated looks aggressively inlined. If you cheat and set this.test
>> > to a number then to the method, it effectively disables those
>> > optimizations in V8 and you end up treating the object as a normal
>> > object, and even though it doesn't cause deoptimizations (all objects
>> > have the same hidden class), it's significantly slower than the
>> > inlined method call. The real issue in my example is that test is
>> > per-object and runTest is static, if runTest was assigned via this., it
>> > should only ever see one hidden class, unless you do something evil
>> > like .apply.
>>
>> > Though this test seems to indicate that this only occurs when building
>> > the hidden class: http://pastebin.com/JbuLaEUt
>>
>> > Even though it never deoptimizes, I'd expect each of those to have
>> > similar performance, but only the first Foobar created is performant.
>>
>> > On a related note, has there been any consideration for making v8 not
>> > de-optimize when a hidden class is ancestral to another (and therefore
>> > compatible)?
>> > I mean if you have {a: 7, b: 7} and you have a really
>> > hot loop that only touches a and b, then you add a c property, because
>> > it was transitioned from the proper hidden class for that hot loop to
>> > a superclass of it (with the same indices in its property access
>> > table), that hot function can assume it's the {a, b} hidden class.
>> > This is similar to how classical inheritance works (Foo extends Bar,
>> > functions that operate on Bar can also operate on Foo), but in this
>> > case a hidden class transition is a strict superset, which lets you
>> > make really nice assumptions.
>>
>> > On Jul 19, 2:27 am, Vyacheslav Egorov <[email protected]> wrote:
>>
>> > > Hi Justin,
>>
>> > > V8's hidden classes are not limited to tracking fields you assign to
>> > > an object, V8 also tries to capture methods you assign (just like in
>> > > any object-oriented language classes capture both data and behavior).
>>
>> > > That is why first and second objects produced by Foobar will have
>> > > different hidden classes --- they have different methods.
>>
>> > > As to your second question: they are not treated differently. If you
>> > > rewrite your test like this:
>>
>> > > var z = {test: function () {}};
>> > > z.test2 = function () {};
>>
>> > > function foo(z) {
>> > >   var i;
>> > >   console.time('test speed');
>> > >   for (i = 0; i < 10000000; i++) z.test();
>> > >   console.timeEnd('test speed');
>> > >   console.time('test2 speed');
>> > >   for (i = 0; i < 10000000; i++) z.test2();
>> > >   console.timeEnd('test2 speed');
>> > > }
>>
>> > > foo(z);
>> > > foo(z);
>>
>> > > You will see something like:
>>
>> > > test speed: 38ms
>> > > test2 speed: 12ms
>> > > test speed: 11ms
>> > > test2 speed: 11ms
>>
>> > > Truth is V8 optimizes the code while the first loop is still _running_
>> > > (this is called On Stack Replacement aka OSR).
>> > > So the first "test speed"
>> > > measurement contains a sum of time spent in unoptimized code, the
>> > > compiler and optimized code, and the first "test2 speed" measurement
>> > > is purely time spent in optimized code. If you call the same code a
>> > > second time you see purely timing results for optimized code. This is
>> > > why benchmarks should always contain a warm-up phase to let the
>> > > optimizing JIT kick in.
>>
>> > > Hope this explains it.
>>
>> > > --
>> > > Vyacheslav Egorov
>>
>> > > On Thu, Jul 19, 2012 at 3:04 AM, jMerliN <[email protected]> wrote:
>> > > > So I can't get my head around why this happens (I haven't dug through
>> > > > v8's code to try to figure it out either), but this is really
>> > > > inconsistent to me with how v8 constructs hidden classes in general.
>> > > > The following is running in Node.js v0.8.2 (V8 v3.11.10.12).
>>
>> > > > Here's the code:
>> > > > http://pastebin.com/2gKWrfHp
>>
>> > > > Here's the output, and the deopt trace:
>> > > > http://pastebin.com/WerQuGLZ
>>
>> > > > Calling Foo.prototype.runTest with any Foo object results in similar
>> > > > performance (unless you change the hidden class, as expected). Bar
>> > > > expectedly deoptimizes because abc is stored on the proto and isn't
>> > > > actually on the constructed object until the first call, causing the
>> > > > optimized function (once it gets hot, which is after the object has
>> > > > changed hidden class) to bail out on the next attempt with a new Bar
>> > > > object.
>>
>> > > > It gets weird with Foobar. test is added directly to the object, the
>> > > > only difference is that this is a function, not a primitive, but it
>> > > > seems like the hidden classes of objects from Foobar's constructor
>> > > > should be the same. The first run is performant, equivalent to Foo
>> > > > (expected). Though running the test again with a new Foobar
>> > > > deoptimizes it. I can't at all understand why.
>>
>> > > > Thanks,
>> > > > Justin
>>
>> > > > --
>> > > > v8-users mailing list
>> > > > [email protected]
>> > > > http://groups.google.com/group/v8-users
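
For readers following the thread, here is a minimal sketch of the "record the transition path in every map and compare a fixed position" check discussed above. It is a toy model in plain JavaScript, not V8 code: the HiddenClass class, its path and depth fields, and the isCompatible function are all hypothetical names made up for illustration.

```javascript
// Toy model only: HiddenClass, path, depth and isCompatible are
// hypothetical names for illustration, not V8 internals.
class HiddenClass {
  constructor(parent, property) {
    this.property = property;
    // path[i] is the ancestor at depth i; the last entry is this class.
    this.path = parent ? parent.path.concat(this) : [this];
    this.depth = this.path.length - 1;
    this.transitions = new Map();
  }

  // Follow (or create) the transition taken when property `name` is added.
  transition(name) {
    let next = this.transitions.get(name);
    if (!next) {
      next = new HiddenClass(this, name);
      this.transitions.set(name, next);
    }
    return next;
  }
}

// Fast path: pointer equality, as in the existing check.
// Slow path: one bounds check plus one compare at a fixed path position.
function isCompatible(actual, expected) {
  if (actual === expected) return true;
  return actual.depth > expected.depth &&
         actual.path[expected.depth] === expected;
}

const root = new HiddenClass(null, null);
const ab = root.transition('a').transition('b');
const abc = ab.transition('c');                  // {a, b} extended with c
const ax = root.transition('a').transition('x'); // unrelated branch

console.log(isCompatible(ab, ab));   // true
console.log(isCompatible(abc, ab));  // true
console.log(isCompatible(ax, ab));   // false
console.log(isCompatible(ab, abc));  // false
```

In this model the extra cost lands only on the case where the pointers differ, which is the trade-off both posters describe; the memory cost is the per-class path array.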

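
The warm-up advice from earlier in the thread can be sketched roughly as below. The helper name timeWithWarmup and its parameters are assumptions introduced for this example, and the run counts are arbitrary; the point is only that the measured runs happen after the function has already been called a few times, so OSR and compilation time are less likely to end up inside the numbers.

```javascript
// Hypothetical helper, not part of any library: run `fn` a few times
// untimed, then collect wall-clock samples for the measured runs.
function timeWithWarmup(fn, warmupRuns, measuredRuns) {
  for (let i = 0; i < warmupRuns; i++) fn(); // lets the optimizing JIT kick in

  const samples = [];
  for (let i = 0; i < measuredRuns; i++) {
    const start = Date.now();
    fn();
    samples.push(Date.now() - start); // elapsed milliseconds for this run
  }
  return samples;
}

// Same object shape as in the thread: one method in the literal,
// one added afterwards.
const z = {test: function () {}};
z.test2 = function () {};

function callTest() {
  for (let i = 0; i < 1000000; i++) z.test();
}

function callTest2() {
  for (let i = 0; i < 1000000; i++) z.test2();
}

console.log('test  (ms):', timeWithWarmup(callTest, 3, 3));
console.log('test2 (ms):', timeWithWarmup(callTest2, 3, 3));
```

Absolute timings will vary by engine version and machine, so treat the samples as relative indicators only.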