I've been trying to profile a CLI tool I use. I used both node's
--prof/--prof-process and using v8-profiler + chromium to inspect the
results. Both techniques give me the same result, which I can't fully
understand. Because the latter technique is GUI only (!!!), I will just
paste (part of) the result from the first one:
[Summary]:
ticks total nonlib name
11097 60.1% 65.2% C++
5911 32.0% 34.7% JavaScript
1453 7.9% Shared libraries
163 0.9% 1.0% GC
7 0.0% Unaccounted
[C++ entry points]:
ticks cpp total name
3493 28.5% 18.9% v8::internal::
Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8
::internal::Isolate*)
2054 16.7% 11.1% v8::internal::Runtime_Apply(int, v8::internal::
Object**, v8::internal::Isolate*)
1900 15.5% 10.3% v8::internal::Runtime_KeyedGetProperty(int, v8::
internal::Object**, v8::internal::Isolate*)
892 7.3% 4.8% v8::internal::Runtime_BoundFunctionGetBindings(int,
v8::internal::Object**, v8::internal::Isolate*)
[JavaScript]:
ticks total nonlib name
484 2.6% 2.8% LazyCompile: *replace native string.js:146:23
399 2.2% 2.3% Stub: CEntryStub
295 1.6% 1.7% Stub: StringAddStub_CheckNone_NotTenured
269 1.5% 1.6% LazyCompile: ~InnerArrayFilter native array.js:891:
26
209 1.1% 1.2% LazyCompile: ~<anonymous> native v8natives.js:1199:
16
205 1.1% 1.2% Builtin: ArgumentsAdaptorTrampoline
163 0.9% 1.0% LazyCompile: ~tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
[Shared libraries]:
ticks total nonlib name
1241 6.7% /usr/bin/nodejs
205 1.1% /lib/x86_64-linux-gnu/libc-2.24.so
3 0.0% /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.24
2 0.0% [vdso]
2 0.0% /lib/x86_64-linux-gnu/libm-2.24.so
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
1241 6.7% /usr/bin/nodejs
537 43.3% v8::internal::Runtime_Apply(int, v8::internal::Object**,
v8::internal::Isolate*)
534 99.4% LazyCompile: ~<anonymous> native v8natives.js:1199:16
534 100.0% LazyCompile: ~InnerArrayFilter native array.js:891:26
534 100.0% LazyCompile: *filter native array.js:915:21
464 86.9% LazyCompile: *tree.Variable.ev /var/lib/data/
mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
55 10.3% LazyCompile: *tree.Operation.ev /var/lib/data/
mdione/src/system/osm/carto/lib/carto/tree/operation.js:16:39
15 2.8% LazyCompile: ~tree.Variable.ev /var/lib/data/
mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
355 28.6% v8::internal::Runtime_StringReplaceGlobalRegExpWithString
(int, v8::internal::Object**, v8::internal::Isolate*)
355 100.0% LazyCompile: *replace native string.js:146:23
55 15.5% LazyCompile: *tree.Filterset.addable /var/lib/data/
mdione/src/system/osm/carto/lib/carto/tree/filterset.js:105:44
55 100.0% LazyCompile: *tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
41 74.5% LazyCompile: *addRules /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:473:18
12 21.8% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
2 3.6% LazyCompile: ~foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
36 10.1% Handler: 67092480#__default__#line-width {5}
31 86.1% LazyCompile: ~tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
22 71.0% LazyCompile: ~addRules /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:473:18
9 29.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
5 13.9% LazyCompile: *tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
3 60.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
2 40.0% LazyCompile: ~addRules /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:473:18
36 10.1% Handler: 67043328#__default__#line-color
35 97.2% LazyCompile: ~tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
24 68.6% LazyCompile: ~addRules /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:473:18
11 31.4% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
1 2.8% LazyCompile: *tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
1 100.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
[...]
You can find a complete tree here:
https://github.com/mapbox/carto/issues/483#issuecomment-317668431
The first thing that strikes me is that the tree is reversed to what
one would expect, but I can live with that. I fact, with the second method
and chromium I can revert the tree.
The second thing is that the line numbers (the first ones just after
the filename; what does the second one mean?) point to the line where the
definition of the function in that output line begins, and not the source
code line where the parent in this tree (not in the call stack) was called.
So, for
1 2.8% LazyCompile: *tree.Filterset.cloneWith /var/lib/
data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
1 100.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/
system/osm/carto/lib/carto/renderer.js:611:19
tree.Filterset.cloneWith() has been called from foldStyle(), but
renderer.js:611 is the line where foldStyle() is defined, not where
cloneWith() was called. This means that I can find the function where the
calls happens (I already could, since I have its name), but not the call
itself. That means that, if there were no JIT, and I had the classic,
naïve, recursive implementation of Fibonacci(), and then called it twice
from the same main() function, once for 5 and once for 100, I wouldn't know
which *call* took more time[1].
But what completely baffles me is the fact that I don't know who
Runtime_Apply() is or what it does; and more important, that there is no
call to anything resembling a replace in tree.Filterset.addable(). Here,
take a look by yourself:
https://github.com/mapbox/carto/blob/master/lib/carto/tree/filterset.js#L105
More questions: why does the tree stop there? Why doesn't it keep
climbing up, showing who called tree.Variable.ev() or addRules()? And what
are those Handlers? Why do tick counts not match between the Bottom Up tree
and the C++ / Javascript summaries? The first one shows 355 ticks
for Runtime_StringReplaceGlobalRegExpWithString and the latter ~3500!
And just to make this post complete, is there any line based profiler,
even one that had a huge impact on the excution time, but that would
provide me with more actionable data? Can I turn off the JIT to
profile the original code? I noticed that if I added some memoizing code in
several of the methods (toString() and toObject() for many of the classes
in lib/carto/tree) I get even slightly *worse* runtimes, so i guess the JIT
is very clever with these methods. Maybe it's implementing memoizing
itself, more efficiently that I could ever do.
I hope you can help me to get at least some of these questions answered
--
[1] To be fair, most probably in the implementation of such function there
would be a line that reads:
return fib(n-1) + fib(n-2);
and the profiler would not be able to distinguish one call from the other.
--
Job board: http://jobs.nodejs.org/
New group rules:
https://gist.github.com/othiym23/9886289#file-moderation-policy-md
Old group rules:
https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
---
You received this message because you are subscribed to the Google Groups
"nodejs" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/nodejs/db2a7f24-55e9-4a07-96ac-6fdb8746934f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.