After quite a bit of digging, I found "mtrace", which is part of glibc (the GNU C library) on Linux and provides a memory trace log that can be used to list outstanding allocations. I whipped up a native module exposing access to this, and am now able to get a reasonable heap dump (or, rather, a heap delta between starting and stopping tracing, which is perhaps more useful, since I can start tracing after the server has finished all of its initial loading).
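To illustrate what "listing outstanding allocations" from a trace log involves, here is a minimal Python sketch. It is not the parser from the module mentioned in this post. It assumes the log lines look like what glibc's mtrace writes ("@ caller + addr size" for an allocation, "@ caller - addr" for a free, and "<" / ">" pairs for realloc). The function name outstanding_by_site and the sample log are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shapes (based on glibc's mtrace output; treat as an assumption):
#   @ <caller> + <addr> <size>   allocation
#   @ <caller> - <addr>          free
#   @ <caller> < <addr>          realloc: old block released
#   @ <caller> > <addr> <size>   realloc: new block
LINE_RE = re.compile(
    r"^@ (?P<site>.+?) (?P<op>[-+<>]) "
    r"(?P<addr>0x[0-9a-fA-F]+)(?: (?P<size>0x[0-9a-fA-F]+))?"
)


def outstanding_by_site(log_text):
    """Return [(call_site, live_count, live_bytes)], largest byte total first."""
    live = {}  # address -> (call site, size in bytes)
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skips "= Start", "= End", and anything unrecognized
        addr = int(m.group("addr"), 16)
        if m.group("op") in "+>":
            if m.group("size") is None:
                continue  # malformed allocation record
            live[addr] = (m.group("site"), int(m.group("size"), 16))
        else:
            live.pop(addr, None)  # a free of a block allocated before tracing

    totals = defaultdict(lambda: [0, 0])
    for site, size in live.values():
        totals[site][0] += 1
        totals[site][1] += size
    return sorted(
        ((site, count, size) for site, (count, size) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical log: three allocations, one of which is freed.
SAMPLE_LOG = """= Start
@ ./server:[0x400624] + 0x1d0f460 0x10
@ ./server:[0x400634] + 0x1d0f480 0x20
@ ./server:[0x400624] + 0x1d0f4b0 0x10
@ ./server:[0x400650] - 0x1d0f480
= End
"""

if __name__ == "__main__":
    for site, count, size in outstanding_by_site(SAMPLE_LOG):
        print(f"{site}: {count} live blocks, {size} bytes")
```

Because only blocks allocated while tracing is on appear in the log, the result is a delta rather than a full heap dump, and frees of blocks allocated before tracing started are simply ignored.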
When linking against C++ code, it's not particularly useful, because the primary call site recorded for all allocations is "operator new", but for C-like code it works great. I replaced a bunch of unnecessary "new" statements with "malloc" calls in our offending module, and this identified the particular leak immediately.

The module isn't NPMified, but I threw it up on GitHub for anyone who wants to play with it. It also includes an mtrace log parser written in node to generate high-level summary information on outstanding allocs.

https://github.com/Jimbly/node-mtrace

On Mar 16, 1:46 pm, Jimb Esser <[email protected]> wrote:
> The server in question is not an http server, but a back-end
> simulation server running physics simulation for an online game using
> a Bullet native library. Yeah, I know, not exactly a typical (or
> perhaps wise...) use of node.js. We're 95% certain the Bullet module
> is the culprit, but that is hundreds of thousands of lines of 3rd
> party C++ code, not something feasible to poke in, and, like most
> physics simulations, not particularly deterministic when combined with
> the randomness of network latency and real user actions. Standalone
> stress tests we've tried never exhibit the problem, and since it takes
> a fully loaded server a day to exhibit it, it's not likely to
> reproduce in a development environment. That being said, it
> consistently does reproduce on the production servers, so that is,
> theoretically, an easy way to debug it with post-mortem debugging
> (albeit with a day-long turn-around to test fixes). Heap dumps are a
> much more reliable way of tracking down heap issues in a large system
> than any "poke at different parts of the system at random" method, I
> was just hoping there was an easy way to get them reliably...
>
> I'll try poking around in a gdb dump, although I'm guessing the
> default heap isn't going to have any allocation site information on
> the heap entries, but it might show some useful information; at least
> it should allow me to quickly sample the heap to determine what the
> primary content type (strings, floats, ints, etc.) is that's
> leaking, which may provide some insight.
