On Wed, Aug 30, 2017 at 2:04 PM, <[email protected]> wrote:

> valgrind might be more accurate, as it measures the actual number of executed
> instructions, whereas perf counts samples
>

Well, that means it measures something different. Ultimately the number of
executed instructions doesn't matter; what matters is how long it took.
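A concrete way to see that distinction is `perf stat`, which reports elapsed time right next to the hardware counters. (The event names below are standard perf events; `./d8` and `run.js` are placeholders for your V8 shell and benchmark driver.)

```shell
# Sketch: one run, wall-clock time plus retired-instruction counts.
# "./d8" and "run.js" are placeholders for your binary and workload.
perf stat -e instructions,cycles,branches,branch-misses ./d8 run.js
```

perf prints the elapsed time at the end of its summary, so you can check whether a difference in instruction counts actually shows up as a difference in time.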


> Are you aware of any --prof options that will give me more detail about
> the parsing in V8?
>

No. What kind of detail are you looking for? --prof gives you per-function
tick counts, is that not detailed enough?
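For reference, the basic --prof workflow is just two commands, run from a V8 checkout. (`run.js` is a placeholder for whatever you want to profile; the d8 path below is a typical build output location and may differ for your build; `v8.log` is --prof's default output file.)

```shell
# Sketch: collect V8's built-in profiler output and aggregate it.
# "run.js" is a placeholder; v8.log is the default --prof log name.
./out/x64.release/d8 --prof run.js
tools/linux-tick-processor v8.log
```

The tick processor breaks ticks down into JavaScript, C++, and GC sections, so parser functions such as Scanner::Scan() show up under the C++ section.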


>
> Thanks.
>
> Sirish
>
>
> On Monday, August 14, 2017 at 1:29:20 PM UTC-5, Jakob Kummerow wrote:
>>
>> Measuring and investigating performance is indeed difficult, and there is
>> no single answer to how best to do it. I haven't heard of valgrind being
>> used for this purpose, and don't know how to make sense of its output. I
>> mostly use V8's built-in --prof and tools/linux-tick-processor, or when
>> that's too coarse, the Linux perf tool (for the latter, see the
>> instructions in V8's wiki). --runtime-call-stats can also be highly useful
>> for investigating certain situations.
>>
>> On Mon, Aug 14, 2017 at 9:16 AM, <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am trying to look at the performance of V8. At this time, this is all on
>>> x86. To build V8, I am using the NDK compiler that comes when I download V8.
>>> That is a pretty straightforward process, as documented. Once I built the
>>> "release" or "debug" V8, I ran the Mandreel and CodeLoad Octane benchmarks
>>> under valgrind.
>>>
>>
>> Be aware that Release and Debug mode have vastly different performance
>> profiles. Only Release mode is representative of real-world performance.
>>
>>
>>> Then I use cg_annotate to look at instruction counts, data reads/writes,
>>> branches, etc. I am just curious where the bottlenecks are. I get the
>>> following numbers for Mandreel.
>>>
>>
>> Also, note that "the bottlenecks" can be *very* different depending on
>> what test/benchmark you run.
>>
>>
>>>
>>> --------------------------------------------------------------------------------
>>>            Ir            Dr            Dw         Bi          Bc
>>> --------------------------------------------------------------------------------
>>> 6,737,556,200 1,975,238,610 1,015,483,012 65,020,470 941,081,112  PROGRAM TOTALS
>>>
>>> --------------------------------------------------------------------------------
>>>            Ir          Dr          Dw         Bi          Bc  file:function
>>> --------------------------------------------------------------------------------
>>> 2,750,360,898 769,151,058 269,297,660 45,068,095 410,353,595  ???:???
>>>   165,387,529  59,780,937  25,440,838      9,340  27,302,136  ???:v8::internal::Scanner::ScanIdentifierOrKeyword()
>>>   156,426,896  43,240,550  25,588,801          0  17,909,648  ???:v8::internal::ExpressionClassifier<v8::internal::ParserTypes<v8::internal::Parser> >::Accumulate(v8::internal::ExpressionClassifier<v8::internal::ParserTypes<v8::internal::Parser> >*, unsigned int, bool)
>>>   150,476,281  44,181,365  28,594,154  3,145,871   9,250,952  ???:v8::internal::Scanner::Scan()
>>> ...
>>>
>>>
>>> It shows that there are 2.7 billion instructions for ???:???. I am
>>> guessing that these are the instructions responsible for hidden classes:
>>> loading/creating/deleting/accessing, etc. If my guess is not correct,
>>> please let me know. And then there are the usual 165 million for
>>> v8::internal::Scanner::ScanIdentifierOrKeyword(), 150 million for Scan(),
>>> etc.
>>>
>>> However, what is interesting is this: on a subsequent run on the same
>>> machine, I see the following numbers:
>>> --------------------------------------------------------------------------------
>>>             Ir            Dr            Dw          Bi            Bc
>>> --------------------------------------------------------------------------------
>>> 12,369,840,202 3,844,013,654 1,701,470,472 248,336,756 1,605,561,615  PROGRAM TOTALS
>>>
>>> --------------------------------------------------------------------------------
>>>            Ir            Dr          Dw          Bi          Bc  file:function
>>> --------------------------------------------------------------------------------
>>> 6,361,147,238 2,029,054,548 684,679,708 228,306,958 762,170,252  ???:???
>>>   690,157,426   260,436,764 130,218,388           0  91,152,866  ???:v8::internal::Runtime_TryInstallOptimizedCode(int, v8::internal::Object**, v8::internal::Isolate*)
>>>   470,287,848   104,517,164  26,109,566           0  91,422,886  /build/glibc-bfm8X4/glibc-2.23/nptl/../nptl/pthread_mutex_lock.c:pthread_mutex_lock
>>>   365,612,760   104,464,484  13,067,997           0 104,438,144  /build/glibc-bfm8X4/glibc-2.23/nptl/pthread_mutex_unlock.c:pthread_mutex_unlock
>>>   364,611,492    91,152,878 104,174,716           0  13,021,838  ???:v8::internal::StackGuard::CheckAndClearInterrupt(v8::internal::StackGuard::InterruptFlag)
>>>   165,387,529    59,780,937  25,440,838       9,340  27,302,136  ???:v8::internal::Scanner::ScanIdentifierOrKeyword()
>>>   156,426,896    43,240,550  25,588,801           0  17,909,648  ???:v8::internal::ExpressionClassifier<v8::internal::ParserTypes<v8::internal::Parser> >::Accumulate(v8::internal::ExpressionClassifier<v8::internal::ParserTypes<v8::internal::Parser> >*, unsigned int, bool)
>>>   150,476,281    44,181,365  28,594,154   3,145,871   9,250,952  ???:v8::internal::Scanner::Scan()
>>>
>>> All the numbers for the V8 functions remained the same; however, all the
>>> hidden-class numbers and the calls into the library went off the chart. For
>>> example, the instructions I attribute to hidden classes (???:???) went from
>>> 2.7 billion to 6.3 billion. I don't see how running the same benchmark
>>> would make such a huge difference in the numbers. Can anyone please explain?
>>>
>>> Also, how does the V8 community look at performance numbers, if not with
>>> standard performance-monitoring tools like valgrind? Is there any other way
>>> to look at performance numbers?
>>>
>>> Thanks.
>>> Sirish
>>>
>>> --
>>> --
>>> v8-dev mailing list
>>> [email protected]
>>> http://groups.google.com/group/v8-dev
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "v8-dev" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
