"extends" is not an anti-pattern.

Of course monomorphic code is fastest. Polymorphic and megamorphic loads
and stores have to do more work (specifically, dynamic dispatch), which
takes a bit more time. Class hierarchies are one way developers *can*
create polymorphic code, and what you have here is an example of that.
But that doesn't mean that inheritance itself is a problem, or that
polymorphism is a problem.

This differs from C++ vtables because JavaScript isn't C++. JavaScript is a
dynamic language, and engines must strike a balance between achieving fast
execution when the object hierarchies are not modified much, and still
having acceptable performance when developers make use of JavaScript's
dynamic capabilities and do modify the object hierarchy at runtime. So
engines should not have internal mechanics that are so rigid that they cause
huge performance drops when code does something valid but "unexpected". It's
easy to point at one pattern and say "clearly this should be handled
better", but when you look at a variety of large codebases, they use so many
different patterns that it becomes very unclear what internal object
representation model would work best on average.
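
To make "dynamic capabilities" concrete, here is a small sketch of the kinds
of runtime changes an engine has to tolerate, and which a fixed C++-style
vtable could not absorb without being rebuilt. The names Robot and speak()
are made up for illustration and are not part of the original example.

```javascript
// Hypothetical classes, for illustration only.
class Animal {
  speak() { return "generic noise"; }
}
class Cat extends Animal {}
class Dog extends Animal {}
class Robot {
  speak() { return "beep"; }
}

const cat = new Cat();
const dog = new Dog();
const before = cat.speak();            // found on Animal.prototype

// 1. Replace a base-class method after instances already exist.
Animal.prototype.speak = function () { return "patched noise"; };
const afterPatch = cat.speak();        // the existing instance sees the new method

// 2. Shadow the method on one single instance.
cat.speak = function () { return "meow"; };
const afterShadow = cat.speak();       // own property wins over the prototype

// 3. Re-parent an existing object onto an unrelated hierarchy.
Object.setPrototypeOf(dog, Robot.prototype);
const afterReparent = dog.speak();     // now resolved via Robot.prototype
const stillAnimal = dog instanceof Animal;

console.log(before, afterPatch, afterShadow, afterReparent, stillAnimal);
```

All three changes are valid JavaScript, so whatever lookup structure an
engine uses internally has to stay correct after each of them.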

FWIW, the original example is essentially equivalent to simple
"traditional" polymorphic patterns like the following, which uses no
inheritance and no "extends":

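function DoIt(obj) { obj.done++; }

// Each constructor initializes the counter so that `done++` is a number
// increment rather than `undefined++` (which would yield NaN).
function Cat() { this.done = 0; }
function Dog() { this.done = 0; }
// etc.

var c = new Cat();
DoIt(c);  // Only one type so far -> monomorphic
var d = new Dog();
DoIt(d);  // Two types -> polymorphic
// etc.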
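
As a rough illustration of the state transitions discussed in this thread,
here is a toy model of a single property-load site. It is not V8's
implementation: the constructor stands in for the hidden class (map), and
the limit of 4 cached shapes is an assumption picked to mirror what the
quoted disassembly shows (four map checks, then the generic path).
makeLoadSite, load and state are hypothetical names.

```javascript
// Toy model only: NOT V8's implementation.
// Assumption: the constructor stands in for the object's hidden class (map).
const MAX_POLYMORPHIC = 4; // hypothetical limit, picked to mirror the thread

function makeLoadSite(propertyName) {
  const seenShapes = [];     // shapes this load site has cached so far
  let megamorphic = false;

  function load(obj) {
    const shape = obj.constructor;
    if (!megamorphic && !seenShapes.includes(shape)) {
      if (seenShapes.length < MAX_POLYMORPHIC) {
        seenShapes.push(shape);   // cache miss: remember the new shape
      } else {
        megamorphic = true;       // too many shapes: stop specializing
      }
    }
    return obj[propertyName];     // the load itself always succeeds
  }

  function state() {
    if (megamorphic) return "megamorphic";
    if (seenShapes.length === 0) return "uninitialized";
    return seenShapes.length === 1 ? "monomorphic" : "polymorphic";
  }

  return { load, state };
}

function Cat() { this.done = 0; }
function Dog() { this.done = 0; }
function Dog1() { this.done = 0; }
function Dog2() { this.done = 0; }
function Dog3() { this.done = 0; }

const site = makeLoadSite("done");
const states = [];
for (const Ctor of [Cat, Dog, Dog1, Dog2, Dog3]) {
  site.load(new Ctor());
  states.push(site.state());
}
console.log(states.join(" -> "));
// monomorphic -> polymorphic -> polymorphic -> polymorphic -> megamorphic
```

Note that the model never fails a load; it only changes how specialized the
site is, which is the point: more receiver types cost speed, not correctness.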

On Sun, Sep 24, 2017 at 3:51 PM, Caitlin Potter <ca...@igalia.com> wrote:

> So yes, the load/store for `this.done++` can't be reduced to simple
> machine ops when `doIt` is megamorphic, as far as I can tell. (5+ receiver
> maps should make the load/stores megamorphic, or at least that's what
> https://github.com/v8/v8/blob/9a0d5d9700ee269cbe7abee6d62733163dadac14/src/ic/ic.h#L33-L35
> seems to indicate).
>
> That doesn't necessarily mean LoadIC is going to be slow; I guess it
> depends on how often the receiver map is missing from the cache. Slower
> than the inlined monomorphic in-object field load/store, but probably not
> "anti-pattern" slow.
>
> I could be wrong about this, but I think the CallIC for individual calls
> to `doIt` can be inlined with different type feedback information for the
> load/store, so if that shows up in hot code which is monomorphic or
> polymorphic, it might produce better code than your example there.
>
> That said, I'm half guessing about all of this, so I'll leave the rest of
> this thread for more knowledgeable people.
>
> On Sunday, September 24, 2017 at 12:07:58 PM UTC-4, Bogdan Orlov wrote:
>>
>> After removing all intermediate %OptimizeFunctionOnNextCall() calls and
>> putting a single one at the end (after all Animal subclasses have been
>> used, for enough feedback), like this:
>>
>> class Animal {
>>   constructor(){
>>     this.done = 0
>>   }
>>   doIt(){
>>     this.done++
>>   }
>> }
>> class Cat extends Animal {}
>> class Dog extends Animal {}
>> class Dog1 extends Animal {}
>> class Dog2 extends Animal {}
>> class Dog3 extends Animal {}
>>
>> function test(){
>>   var cat = new Cat();
>>   cat.doIt();
>>   cat.doIt();
>>   cat.doIt();
>>   var dog = new Dog();
>>   dog.doIt();
>>   dog.doIt();
>>   dog.doIt();
>>   var dog1 = new Dog1();
>>   dog1.doIt();
>>   dog1.doIt();
>>   dog1.doIt();
>>   var dog2 = new Dog2();
>>   dog2.doIt();
>>   dog2.doIt();
>>   dog2.doIt();
>>   var dog3 = new Dog3();
>>   dog3.doIt();
>>   dog3.doIt();
>>   dog3.doIt();
>>   %OptimizeFunctionOnNextCall(dog3.doIt)
>>   dog3.doIt()
>> }
>> test()
>>
>> I got this output:
>>
>> 0x2df967b05180     0  55             push rbp
>> 0x2df967b05181     1  4889e5         REX.W movq rbp,rsp
>> 0x2df967b05184     4  56             push rsi
>> 0x2df967b05185     5  57             push rdi
>> 0x2df967b05186     6  493ba5480c0000 REX.W cmpq rsp,[r13+0xc48]
>> 0x2df967b0518d     d  0f8669000000   jna 0x2df967b051fc  <+0x7c>
>> 0x2df967b05193    13  488b75f8       REX.W movq rsi,[rbp-0x8]
>> 0x2df967b05197    17  48b80000000003000000 REX.W movq rax,0x300000000
>> 0x2df967b051a1    21  488b5510       REX.W movq rdx,[rbp+0x10]
>> 0x2df967b051a5    25  498b8dd0050000 REX.W movq rcx,[r13+0x5d0]
>> 0x2df967b051ac    2c  488bd9         REX.W movq rbx,rcx
>> 0x2df967b051af    2f  e8ccd7f4ff     call 0x2df967a52980  (LoadICTrampoline)    ;; code: LOAD_IC
>> 0x2df967b051b4    34  a801           test al,0x1
>> 0x2df967b051b6    36  0f8557000000   jnz 0x2df967b05213  <+0x93>
>> 0x2df967b051bc    3c  488bd8         REX.W movq rbx,rax
>> 0x2df967b051bf    3f  48c1eb20       REX.W shrq rbx, 32
>> 0x2df967b051c3    43  83ebff         subl rbx,0xff
>> 0x2df967b051c6    46  0f804c000000   jo 0x2df967b05218  <+0x98>
>> 0x2df967b051cc    4c  48c1e320       REX.W shlq rbx, 32
>> 0x2df967b051d0    50  48bf0000000005000000 REX.W movq rdi,0x500000000
>> 0x2df967b051da    5a  488b5510       REX.W movq rdx,[rbp+0x10]
>> 0x2df967b051de    5e  498b8dd0050000 REX.W movq rcx,[r13+0x5d0]
>> 0x2df967b051e5    65  488bc3         REX.W movq rax,rbx
>> 0x2df967b051e8    68  488b75f8       REX.W movq rsi,[rbp-0x8]
>> 0x2df967b051ec    6c  e8cf1ff5ff     call 0x2df967a571c0  (StoreICStrictTrampoline)    ;; code: STORE_IC
>> 0x2df967b051f1    71  498b45a0       REX.W movq rax,[r13-0x60]
>> 0x2df967b051f5    75  488be5         REX.W movq rsp,rbp
>> 0x2df967b051f8    78  5d             pop rbp
>> 0x2df967b051f9    79  c20800         ret 0x8
>> 0x2df967b051fc    7c  48bbe07c890001000000 REX.W movq rbx,0x100897ce0
>> 0x2df967b05206    86  33c0           xorl rax,rax
>> 0x2df967b05208    88  488b75f8       REX.W movq rsi,[rbp-0x8]
>> 0x2df967b0520c    8c  e88ff4e7ff     call 0x2df9679846a0     ;; code: STUB, CEntryStub, minor: 8
>> 0x2df967b05211    91  eb80           jmp 0x2df967b05193  <+0x13>
>> 0x2df967b05213    93  e8fcedcfff     call 0x2df967804014     ;; debug: deopt position, script offset '130'
>>                                                              ;; debug: deopt position, inlining id '-1'
>>                                                              ;; debug: deopt reason 'not a Smi'
>>                                                              ;; debug: deopt index 2
>>                                                              ;; deoptimization bailout 2
>> 0x2df967b05218    98  e801eecfff     call 0x2df96780401e     ;; debug: deopt position, script offset '130'
>>                                                              ;; debug: deopt position, inlining id '-1'
>>                                                              ;; debug: deopt reason 'overflow'
>>                                                              ;; debug: deopt index 3
>>                                                              ;; deoptimization bailout 3
>> 0x2df967b0521d    9d  90             nop
>> 0x2df967b0521e    9e  90             nop
>> 0x2df967b0521f    9f  90             nop
>> 0x2df967b05220    a0  90             nop
>> 0x2df967b05221    a1  90             nop
>> 0x2df967b05222    a2  90             nop
>> 0x2df967b05223    a3  90             nop
>> 0x2df967b05224    a4  90             nop
>> 0x2df967b05225    a5  90             nop
>> 0x2df967b05226    a6  90             nop
>> 0x2df967b05227    a7  90             nop
>> 0x2df967b05228    a8  90             nop
>> 0x2df967b05229    a9  90             nop
>> 0x2df967b0522a    aa  6690           nop
>>
>> So the result is the same (but without intermediate deopts): V8 compiles
>> 'this.done++' to slow polymorphic runtime calls, LoadICTrampoline and
>> StoreICStrictTrampoline. If this is not a bug and V8 just works this
>> way, then every base-class method in this commonly used inheritance and
>> polymorphism pattern gets slow polymorphic access to the 'this' object,
>> and the 'extends' keyword comes out as a performance antipattern.
>>
>>
>> On Sunday, September 24, 2017 at 4:31:44 PM UTC+3, Caitlin Potter wrote:
>>>
>>> I think %OptimizeFunctionOnNextCall() is particularly bad for
>>> benchmarks, because you train the function to think the method load is
>>> monomorphic.
>>>
>>> With more type feedback, this should do a bit better, I think we can
>>> inline a finite amount of polymorphic loads.
>>>
>>> At least, I _think_ so.
>>>
>>> On Sep 24, 2017, at 3:12 AM, B. Orlov <bgno...@gmail.com> wrote:
>>>
>>> OK, here are more details. This is a simplified example without loops:
>>>
>>> class Animal {
>>>   constructor(){
>>>     this.done = 0
>>>   }
>>>   doIt(){
>>>     this.done++
>>>   }
>>> }
>>>
>>> class Cat extends Animal {}
>>>
>>> class Dog extends Animal {}
>>> class Dog1 extends Animal {}
>>> class Dog2 extends Animal {}
>>> class Dog3 extends Animal {}
>>>
>>> class AnimalCat {
>>>   constructor(){
>>>     this.done = 0
>>>   }
>>>   doIt(){
>>>     this.done++
>>>   }
>>> }
>>>
>>> function test(){
>>>   var cat = new Cat();
>>>   cat.doIt();
>>>   cat.doIt(); //warm
>>>   %OptimizeFunctionOnNextCall(cat.doIt)
>>>   cat.doIt(); //fast REX.W asm instruction for this.done read and write
>>>
>>>   var dog = new Dog();
>>>   dog.doIt(); //deopt - wrong map
>>>   %OptimizeFunctionOnNextCall(dog.doIt)
>>>   dog.doIt(); //the same but now added check for new hidden map - Dog
>>>
>>>   var dog1 = new Dog1();
>>>   dog1.doIt(); //deopt - wrong map
>>>   %OptimizeFunctionOnNextCall(dog1.doIt)
>>>   dog1.doIt() //the same but now added one more check for new hidden
>>>               //map - Dog1
>>>
>>>   var dog2 = new Dog2();
>>>   dog2.doIt(); //deopt - wrong map
>>>   %OptimizeFunctionOnNextCall(dog2.doIt)
>>>   dog2.doIt() //the same but now added one more check for new hidden
>>>               //map - Dog2
>>>
>>>   var dog3 = new Dog3();
>>>   dog3.doIt(); //deopt - wrong map
>>>   %OptimizeFunctionOnNextCall(dog3.doIt)
>>>   dog3.doIt() //v8 gives up and compiles this.done to slow calls to c++
>>>               //runtime functions - LoadICTrampoline and
>>>               //StoreICStrictTrampoline
>>> }
>>>
>>> test()
>>>
>>> Running the latest node with "node --trace-deopt --print-opt-code
>>> --allow-natives-syntax test.js" prints the asm instructions for each
>>> optimization of the doIt() method. After the first optimization (with
>>> only the Cat class invoked) I got
>>>
>>> 0x3e9869385180     0  55             push rbp
>>> 0x3e9869385181     1  4889e5         REX.W movq rbp,rsp
>>> 0x3e9869385184     4  56             push rsi
>>> 0x3e9869385185     5  57             push rdi
>>> 0x3e9869385186     6  493ba5480c0000 REX.W cmpq rsp,[r13+0xc48]
>>> 0x3e986938518d     d  0f863f000000   jna 0x3e98693851d2  <+0x52>
>>> 0x3e9869385193    13  488b4510       REX.W movq rax,[rbp+0x10]
>>> 0x3e9869385197    17  a801           test al,0x1
>>> 0x3e9869385199    19  0f844a000000   jz 0x3e98693851e9  <+0x69>
>>> 0x3e986938519f    1f  48bb9112642f9b320000 REX.W movq rbx,0x329b2f641291    ;; object: 0x329b2f641291 <Map(FAST_HOLEY_ELEMENTS)>
>>> 0x3e98693851a9    29  483958ff       REX.W cmpq [rax-0x1],rbx
>>> 0x3e98693851ad    2d  0f853b000000   jnz 0x3e98693851ee  <+0x6e>
>>> 0x3e98693851b3    33  8b581b         movl rbx,[rax+0x1b]
>>> 0x3e98693851b6    36  83ebff         subl rbx,0xff
>>> 0x3e98693851b9    39  0f8034000000   jo 0x3e98693851f3  <+0x73>
>>> 0x3e98693851bf    3f  48c1e320       REX.W shlq rbx, 32
>>> 0x3e98693851c3    43  48895817       REX.W movq [rax+0x17],rbx
>>> 0x3e98693851c7    47  498b45a0       REX.W movq rax,[r13-0x60]
>>> 0x3e98693851cb    4b  488be5         REX.W movq rsp,rbp
>>> 0x3e98693851ce    4e  5d             pop rbp
>>> 0x3e98693851cf    4f  c20800         ret 0x8
>>>
>>> Then, after a few deoptimizations, for the Dog2 class V8 compiles the
>>> doIt() method to this
>>>
>>> 0x3e9869385540     0  55             push rbp
>>> 0x3e9869385541     1  4889e5         REX.W movq rbp,rsp
>>> 0x3e9869385544     4  56             push rsi
>>> 0x3e9869385545     5  57             push rdi
>>> 0x3e9869385546     6  493ba5480c0000 REX.W cmpq rsp,[r13+0xc48]
>>> 0x3e986938554d     d  0f867b000000   jna 0x3e98693855ce  <+0x8e>
>>> 0x3e9869385553    13  488b4510       REX.W movq rax,[rbp+0x10]
>>> 0x3e9869385557    17  a801           test al,0x1
>>> 0x3e9869385559    19  0f8489000000   jz 0x3e98693855e8  <+0xa8>
>>> 0x3e986938555f    1f  488b58ff       REX.W movq rbx,[rax-0x1]
>>> 0x3e9869385563    23  48ba9112642f9b320000 REX.W movq rdx,0x329b2f641291    ;; object: 0x329b2f641291 <Map(FAST_HOLEY_ELEMENTS)>
>>> 0x3e986938556d    2d  483bd3         REX.W cmpq rdx,rbx
>>> 0x3e9869385570    30  0f8439000000   jz 0x3e98693855af  <+0x6f>
>>> 0x3e9869385576    36  48ba4914642f9b320000 REX.W movq rdx,0x329b2f641449    ;; object: 0x329b2f641449 <Map(FAST_HOLEY_ELEMENTS)>
>>> 0x3e9869385580    40  483bd3         REX.W cmpq rdx,rbx
>>> 0x3e9869385583    43  0f8426000000   jz 0x3e98693855af  <+0x6f>
>>> 0x3e9869385589    49  48ba5115642f9b320000 REX.W movq rdx,0x329b2f641551    ;; object: 0x329b2f641551 <Map(FAST_HOLEY_ELEMENTS)>
>>> 0x3e9869385593    53  483bd3         REX.W cmpq rdx,rbx
>>> 0x3e9869385596    56  0f8413000000   jz 0x3e98693855af  <+0x6f>
>>> 0x3e986938559c    5c  48ba5916642f9b320000 REX.W movq rdx,0x329b2f641659    ;; object: 0x329b2f641659 <Map(FAST_HOLEY_ELEMENTS)>
>>> 0x3e98693855a6    66  483bd3         REX.W cmpq rdx,rbx
>>> 0x3e98693855a9    69  0f853e000000   jnz 0x3e98693855ed  <+0xad>
>>> 0x3e98693855af    6f  8b581b         movl rbx,[rax+0x1b]
>>> 0x3e98693855b2    72  83ebff         subl rbx,0xff
>>> 0x3e98693855b5    75  0f8037000000   jo 0x3e98693855f2  <+0xb2>
>>> 0x3e98693855bb    7b  48c1e320       REX.W shlq rbx, 32
>>> 0x3e98693855bf    7f  48895817       REX.W movq [rax+0x17],rbx
>>> 0x3e98693855c3    83  498b45a0       REX.W movq rax,[r13-0x60]
>>> 0x3e98693855c7    87  488be5         REX.W movq rsp,rbp
>>> 0x3e98693855ca    8a  5d             pop rbp
>>> 0x3e98693855cb    8b  c20800         ret 0x8
>>>
>>> and on the Dog3 class V8 gives up and compiles to this (with slow runtime
>>> calls to LoadICTrampoline and StoreICStrictTrampoline)
>>>
>>> 0x3e98693856a0     0  55             push rbp
>>> 0x3e98693856a1     1  4889e5         REX.W movq rbp,rsp
>>> 0x3e98693856a4     4  56             push rsi
>>> 0x3e98693856a5     5  57             push rdi
>>> 0x3e98693856a6     6  493ba5480c0000 REX.W cmpq rsp,[r13+0xc48]
>>> 0x3e98693856ad     d  0f8669000000   jna 0x3e986938571c  <+0x7c>
>>> 0x3e98693856b3    13  488b75f8       REX.W movq rsi,[rbp-0x8]
>>> 0x3e98693856b7    17  48b80000000003000000 REX.W movq rax,0x300000000
>>> 0x3e98693856c1    21  488b5510       REX.W movq rdx,[rbp+0x10]
>>> 0x3e98693856c5    25  498b8dd0050000 REX.W movq rcx,[r13+0x5d0]
>>> 0x3e98693856cc    2c  488bd9         REX.W movq rbx,rcx
>>> 0x3e98693856cf    2f  e8acd2f4ff     call 0x3e98692d2980  (LoadICTrampoline)    ;; code: LOAD_IC
>>> 0x3e98693856d4    34  a801           test al,0x1
>>> 0x3e98693856d6    36  0f8557000000   jnz 0x3e9869385733  <+0x93>
>>> 0x3e98693856dc    3c  488bd8         REX.W movq rbx,rax
>>> 0x3e98693856df    3f  48c1eb20       REX.W shrq rbx, 32
>>> 0x3e98693856e3    43  83ebff         subl rbx,0xff
>>> 0x3e98693856e6    46  0f804c000000   jo 0x3e9869385738  <+0x98>
>>> 0x3e98693856ec    4c  48c1e320       REX.W shlq rbx, 32
>>> 0x3e98693856f0    50  48bf0000000005000000 REX.W movq rdi,0x500000000
>>> 0x3e98693856fa    5a  488b5510       REX.W movq rdx,[rbp+0x10]
>>> 0x3e98693856fe    5e  498b8dd0050000 REX.W movq rcx,[r13+0x5d0]
>>> 0x3e9869385705    65  488bc3         REX.W movq rax,rbx
>>> 0x3e9869385708    68  488b75f8       REX.W movq rsi,[rbp-0x8]
>>> 0x3e986938570c    6c  e8af1af5ff     call 0x3e98692d71c0  (StoreICStrictTrampoline)    ;; code: STORE_IC
>>> 0x3e9869385711    71  498b45a0       REX.W movq rax,[r13-0x60]
>>> 0x3e9869385715    75  488be5         REX.W movq rsp,rbp
>>> 0x3e9869385718    78  5d             pop rbp
>>> 0x3e9869385719    79  c20800         ret 0x8
>>>
>>> So I suspect that either there is a bug in V8, or V8 generally can't
>>> handle inheritance and subclassing and will always compile ancestor
>>> methods to slow polymorphic access to the this object, so the "extends"
>>> keyword turns out to be a performance antipattern.
>>>
>>> On Sunday, September 24, 2017 at 7:33:35 AM UTC+3, Zac Hansen wrote:
>>>>
>>>> Microbenchmarks are infamously difficult to get right as often you're
>>>> not testing what you think you're testing.
>>>>
>>>> Are you sure the optimizer isn't just throwing away code in some cases,
>>>> since you're not actually doing any work with the `done` property?
>>>> There's no reason that your code even has to run unless I'm reading it
>>>> wrong. And it's not like C++ where you can look at the generated
>>>> instructions to see what the optimizer is doing.
>>>>
>>>> On Friday, September 22, 2017 at 9:24:00 PM UTC-7, B. Orlov wrote:
>>>>>
>>>>> Take a look at this inheritance pattern, commonly used in OOP for
>>>>> extending a base class:
>>>>>
>>>>> class Animal {
>>>>>   constructor(){
>>>>>     this.done = 0
>>>>>   }
>>>>>   doIt(){
>>>>>     this.done++
>>>>>   }
>>>>> }
>>>>>
>>>>> class Cat extends Animal {}
>>>>> class Dog extends Animal {}
>>>>> class Dog1 extends Animal {}
>>>>> class Dog2 extends Animal {}
>>>>> class Dog3 extends Animal {}
>>>>>
>>>>> function testAnimal(animal){
>>>>>   for(var i = 0; i < 100000; i++){
>>>>>     animal.doIt();
>>>>>   }
>>>>> }
>>>>>
>>>>>
>>>>> function test(){
>>>>>   var cat = new Cat();
>>>>>   testAnimal(cat)
>>>>>   var dog = new Dog();
>>>>>   testAnimal(dog)
>>>>>   var dog1 = new Dog1();
>>>>>   testAnimal(dog1)
>>>>>   var dog2 = new Dog2();
>>>>>   testAnimal(dog2)
>>>>>   var dog3 = new Dog3();
>>>>>   testAnimal(dog3)
>>>>> }
>>>>>
>>>>> test()
>>>>>
>>>>> Running in the latest node (V8 version 6.0.287.53) I get the following
>>>>> results:
>>>>>
>>>>> On invoking testAnimal with the first descendant class, Cat, V8 does a
>>>>> great job: it compiles the doIt() method and the "this.done++"
>>>>> increment to REX.W asm instructions, without any calls to slow C++
>>>>> runtime functions for generic field access.
>>>>>
>>>>> Then, on invoking doIt() on the second descendant class, Dog, V8
>>>>> deoptimizes doIt() (and all outer functions into which it was inlined)
>>>>> and adds a few REX.W asm instructions and one jz instruction to check
>>>>> the hidden map for the Dog class.
>>>>>
>>>>> Then, on invoking doIt() on each new subclass, V8 again deoptimizes and
>>>>> adds new checks for the new hidden map. Finally, on the fifth
>>>>> descendant class, Dog3, V8 gives up, and for "this.done++" it calls the
>>>>> LoadICTrampoline and StoreICStrictTrampoline C++ runtime functions for
>>>>> generic access.
>>>>>
>>>>> I hope this is only a bug and V8 can deal efficiently with inheritance
>>>>> and field access without slow generic runtime field access. Otherwise
>>>>> the big question arises: is the "extends" keyword a performance
>>>>> antipattern, and why can't V8 implement something like the C++ v-table
>>>>> mechanism?
>>>>>
>>>> --
>>> --
>>> v8-users mailing list
>>> v8-u...@googlegroups.com
>>> http://groups.google.com/group/v8-users
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "v8-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to v8-users+u...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>> --