Hi folks,

I'm Attila, and I'm an engineer at Datadog working on our Node.js
profiler[0]. (As a side note, I'd be happy to sometimes give a larger
overview of how we do V8 profiling at Datadog.)

TL;DR: if we interrupt V8 with a signal while it's executing
HandleScope::DeleteExtensions(), a call to Isolate::GetCurrentContext()
will crash even though Isolate::InContext() returns true.

The whole story:

We mostly rely on V8 built-in CpuProfiler APIs, except we also try to
gather some more data on each sample, so we intercept the PROF signal
handler and do some work before and after delegating to the V8's original
PROF handler. E.g. we also gather elapsed thread CPU time and recently we
started to gather the value of
node::AsyncHooksGetExecutionAsyncId(Isolate*) (I get this is a V8 and not a
Node list – my question ultimately has nothing to do with Node.js per se.)

Now I know you'd say that invoking random methods from a signal handler is
obviously a recipe for disaster, but in this case this API is just a few
dependent loads through a chain of pointers that doesn't change much if at
all. If I pseudo-unrolled what this Node.js method does, then it'd be
something like this (omitting a bunch of defensive checks for isEmpty etc.)

if (isolate->InContext()) {
  auto ctx = isolate->GetCurrentContext();
  auto env = ctx ->GetAlignedPointerFromEmbedderData(NodeEnvironmentIndex);
  return env->async_hooks_->async_id_fields[AsyncHooks::kAsyncIdCounter];
}

The thing is, it does work – a very obvious case when it wouldn't work is
if the PROF signal handler interrupted V8 while it's collecting garbage,
but we do guard against it by installing GC prologue and epilogue callbacks
and not touching the engine when we know it's GCing.

However, we see one case when it doesn't work, and I can't figure it out on
my own so I'm kindly asking for some eyeballs here. Namely, we're seeing a
crash with a stack trace of:

Isolate::GetCurrentContext()
node::AsyncHooksGetExecutionAsyncId(Isolate*)
dd::SignalHandler::HandleProfilerSignal(…)
v8::internal::HandleScope::DeleteExtensions(Isolate*)
node::Environment::RunTimers(…)
uv__run_timers
uv_run

So node::Environment::RunTimers has a HandleScope instance[1]; this
HandleScope is being destroyed when it's going out of C++ scope, which
should be just before the method returns. As part of the destruction
process, DeleteExtensions is invoked. Our PROF handler interrupts here, and
calls node::AsyncHooksGetExecutionAsyncId() which will invoke
Isolate::InContext() and receive true, but then it'll crash in
Isolate::GetCurrentContext().

We see repeated crashes with this exact stack trace. There's something in
HandleScope::DeleteExtensions() that causes Isolate::GetCurrentContext() to
crash even though Isolate::InContext() returned true. If anyone knows off
the top of their head what causes this, it'd be appreciated. If we can fix
it, even better. I'd settle for being able to observe this state and bail
out without crashing.

Again, I do understand that signal handlers can't expect to observe data in
an internally consistent state. FWIW, I have a hunch that
Isolate::InContext should really be returning false in this situation since
at the time the handle scope is destroyed, there's no JS code executing at
all, it's about to return to libuv's run_timers. There are a bunch of
Call()s[2] in JS functions in a loop in that method, but by the time the
handle scope is destroyed, execution is outside of 'em all. The code of
RunTimers seems sufficiently complex that this shouldn't be explainable by
compiler inlining and reordering.

Again, any insight appreciated, and I'm happy to work more on this with
anyone.

Thanks,
  Attila.

--
[0]: https://github.com/DataDog/pprof-nodejs
[1]:
https://github.com/nodejs/node/blob/9bbbe60f6b0001964cb62182ce17e7f3980ccf06/src/env.cc#L1490
[2]:
https://github.com/nodejs/node/blob/9bbbe60f6b0001964cb62182ce17e7f3980ccf06/src/env.cc#L1505

-- 
-- 
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
--- 
You received this message because you are subscribed to the Google Groups 
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/v8-dev/CADatEgZeYsjmuBpyGjgSo6%3DC%3DKK2ZsFrSk-sb1S1fBDzYi%2Br%3DQ%40mail.gmail.com.

Reply via email to