Your problem seems to originate from node-weak. With "normal" node modules, the code that gets executed during one turn of the event loop is 100% related to the JS calls that you are making in the response to the event. Nothing interferes. If other I/O operations are pending they won't interfere with your event; their callbacks will be called on another turn of the event loop. This is completely non-preemptive.
But node-weak is a very odd beast. It introduces a callback which is called by the garbage collector and the garbage collector may run at any time, and will preempt what your JS code is doing. As a result, you have some kind of odd stack that pops up at random places in your code. You should be very cautious about what you do in a node-weak callback. Handling such a callback is very much like handling an interrupt in driver code: do as little as possible: incrementing a counter or setting a flag is fine but you should not be calling complex code that's going to interact with the event loop. Bruno On Friday, October 10, 2014 11:30:23 AM UTC+2, Matt Ginzton wrote: > > Thanks for engaging on this. > > That does make sense, and yeah, my model for when JS code runs in Node is > basically "in response to some asynchronous event": which could be I/O, or > also things like setTimeout or nextTick (or the timers module). The node > process will be asleep in the libuv event loop, wake up in order to run > some callback, and the code in that callback will get to run until it's > done (including anything it calls synchronously, which doesn't require I/O > or get deferred via nextTick or timers); it shouldn't get preempted; > eventually this sequence of synchronous execution will be done and Node > will find something else to do (another timer or event handler) or go back > to sleep. > > Upon more experimentation, I don't think this has anything to do with > node-fibers (more on this below); I just mentioned that because we are > using it and it might affect things, but I no longer think it's relevant > here. I should also have mentioned we're using node-weak > <https://github.com/TooTallNate/node-weak>, which is going to be relevant > here (it's one way but not the only way of getting the behavior in > question). What this boils down to is, I have a fundamental question about > the design and intended use of MakeCallback, whether I'm right about the > behavior I'm going to describe, and whether this is intentional. > > I think I can further boil down my understanding of Node's asynchronous > concurrency model and consistency guarantees and need for locking (or not) > to: "no preemption". At least, that's my read on what people want from > Node, and what I want from Node. (Following on what I said before about > every synchronous invocation being a critical section until it chooses to > yield: State can't change out from under you in the middle of one > synchronous invocation. If you yield and then resume execution, "on the > other side of an asynchronous I/O operation" as you said, state can and > does change out from under you, because if code A schedules callback B, it > can't know what will run in between A and B. And that's fine, because we > all know it works that way, and B can't make any assumptions about the > state of the world being exactly as A left it. But each statement in A > should be able to assume that the state of the world is exactly as the > previous statement left it.) > > This "no preemption" property would require that if I'm running some > synchronous code, it gets to decide what other code runs, until it decides > to yield execution by returning to the event dispatcher. In the middle of > what I'm calling a flow of synchronous execution, some other piece of code > it doesn't know about can't jump on the CPU, change some data structures, > and return control to the first code in an inconsistent state. For this to > be true, we should only do event dispatch (calling callbacks) from the > toplevel event loop; there shouldn't be contexts nested within arbitrary > application code that can dispatch into other arbitrary code. > > MakeCallback is where Node invokes nextTick callbacks, and it is called > from the toplevel event loop (well, right before returning to the libuv > event loop), but if my understanding is correct, it's also called on every > transition from JS code back to C++ code -- which is a superset of > "returning to the libuv event loop". That is, if "outer" JS code calls into > a native-code Node extension, and that native-code extension calls back > into "inner" JS via MakeCallback, when that inner JS returns to C++, > MakeCallback invokes process._tickCallback before returning, and the outer > JS has been preempted by any nextTick callbacks that get invoked here, and > it can't really reason about the state of the world (specifically, any > consistency guarantees on shared data structures) after that. (Side > question here: is MakeCallback *a* way for extension/C++ code to invoke JS > code, or *the* way for extension/C++ code to invoke JS code? That is, from > a native-code extension that wants to invoke JS code, is all JS code > "callbacks" or only some of it?) > > This, to me, seems to break the entire Node concurrency model, and I'm > surprised more people haven't been bitten by it, but maybe I'm > misunderstanding things, maybe there's a good reason for it, maybe > MakeCallback isn't supposed to get used this way, or maybe it's just a bug > that's enough of a corner case that it hasn't been noticed. > > Let me restate the problematic flow, and then I'll get into examples. A > normal Node app might start off as pure JS; outside Node's own codebase all > the app code is JS running as event handlers, and the flow is (1) event > loop -> (2) handlers in JS -> return to (1). Then you add some native-code > Node extensions, and JS code can call out to C++ code in these extensions, > and now the flow is (1) event loop -> (2) handlers in JS -> (3) callouts in > C++ extension -> return to (2) -> return to (1). And say some of these > native-code Node extensions want to execute JS and use MakeCallback to do > it (again, I don't know that this is the only way native code gets to > execute JS code, but I do know it's one way); and now the flow is (1) event > loop -> (2) handlers in JS -> (3) callouts in C++ extension -> (4) more JS > code -> return to (3) -> return to (2) -> return to (1). And obviously more > complex flows are possible; there could be even deeper nested transitions > between JS and native code, or bounce between (2)<>(3) multiple times > before returning to (1), but the 1234321 case is already enough to > illustrate the preemption problem. Every time we invoke MakeCallback, on > the way out it executes process._tickCallback. This happens at the boundary > from (2) back to (1), which is not a problem, but also at the boundary from > (4) back to (3), which is a problem. > > Here's a demo of the problem > <https://gist.github.com/metamatt/99ef2a5e12de2acd5e39>. (Note that this > doesn't use node-fibers. It does use node-weak, but just as a way of > forcing invocation of MakeCallback. Any other extension that calls > MakeCallback could exhibit the same behavior.) > > Here are examples of node-fibers usage, including examples of what the > call stacks look like > <https://gist.github.com/metamatt/2465dfd76213a98e22e0>. (Again, not > relevant to the preemption issue I'm harping on; just following up on what > would otherwise be a loose end from my original post, if anyone's curious. > The point is that the stack trace inside a fiber doesn't see outside the > fiber boundary, unless you write your own stack-stitching code, so you > don't see _tickCallback in the call stack even if it was used to invoke the > fiber. But you still wouldn't expect to see _tickCallback higher in the > call stack.) Using fibers, one certainly ends up with a different > concurrency model than the normal Node one, and you can bring all sorts of > problems on yourself this way if you do it wrong, but I don't consider it > to break my "no preemption" rule because it's entirely under your control; > you can get descheduled and then find the world has changed out from under > you when you get rescheduled, but you actually have to ask for this to > happen by calling Fiber.yield or Fiber.run. > > Boiling all this back down to one question, it's this: is it intentional > that inner invocations of MakeCallback do dispatch to _tickCallback (thus > preempting any code that was running higher on the call stack), or can this > be changed? IMHO it would be preferable, and would solve this problem, if > we called process._tickCallback only from the very outermost transition > from JS back to C++, not on every such transition. But maybe this causes > other problems I'm not seeing. Or, should extensions that are invoked as > callouts from JS code (as opposed to registering event handlers, which > seems to be how the Node source uses MakeCallback) not be using > MakeCallback in the first place? > > As things stand, it seems like any extension that uses MakeCallback should > be labeled "somewhat dangerous", because it can preempt the code that calls > it. (But if this is documented and advertised properly, I as a client of > that extension can code around it; I just have to realize that certain > entry points into the extension are a yield/redispatch point from the point > of view of the calling code, and if I know where those are, I can code > around it.) Meanwhile node-weak (and I don't mean to malign it, it's really > cool and useful functionality) should be labeled "quite a lot more > dangerous", because not only can it cause preemption for those same > reasons, but you can't document any specific entry points or surface area > that can cause that preemption: that surface area is GC, which can happen > at any time. > > thanks, > > Matt > > -- Job board: http://jobs.nodejs.org/ New group rules: https://gist.github.com/othiym23/9886289#file-moderation-policy-md Old group rules: https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines --- You received this message because you are subscribed to the Google Groups "nodejs" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/nodejs/3bb0d836-2c8a-43da-b7b3-9e560fa3eb90%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
