After quite a bit of digging, I found "mtrace", which is part of glibc
on Linux and writes a memory trace log that can be used to list
outstanding allocations.  I whipped up a native module exposing it,
and am now able to get a reasonable heap dump (or, rather, a heap
delta between starting and stopping tracing, which is perhaps more
useful, since I can start tracing after the server has finished all of
its initial loading).
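
For anyone who hasn't used it, the start/stop pattern looks roughly
like the sketch below.  This is not the module's actual code:
trace_start and trace_stop are names made up for illustration, and the
only real calls are glibc's mtrace()/muntrace() and setenv().  As far
as I know, newer glibc releases (2.34 and later) only write the log
when libc_malloc_debug.so is preloaded, so treat that as something to
check on your system.

```c
#include <assert.h>
#include <mcheck.h>   /* mtrace(), muntrace() (glibc-specific) */
#include <stdlib.h>   /* setenv() */

/* Hypothetical helper: start logging malloc/free calls to log_path.
 * mtrace() finds its output file through the MALLOC_TRACE
 * environment variable, so that has to be set before calling it.
 * Returns 0 on success, -1 if the variable could not be set. */
int trace_start(const char *log_path)
{
    assert(log_path != NULL);
    if (setenv("MALLOC_TRACE", log_path, 1) != 0)
        return -1;
    mtrace();
    return 0;
}

/* Hypothetical helper: stop logging.  Blocks that were allocated
 * after trace_start() and are still live at this point show up in
 * the log as allocations with no matching free. */
void trace_stop(void)
{
    muntrace();
}
```

Calling trace_start() once the server has finished loading and
trace_stop() some time later gives the "heap delta" described above.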

With C++ code it's not particularly useful, because mtrace records
the immediate caller of malloc, so the call site for every allocation
made through "new" is "operator new" itself, but for C-like code it
works great.  I replaced a bunch of unnecessary "new" expressions with
"malloc" calls in our offending module, and this identified the
particular leak immediately.

The module isn't NPMified, but I threw it up on GitHub for anyone who
wants to play with it.  It also includes an mtrace log parser written
in node that generates high-level summary information on outstanding
allocs.

https://github.com/Jimbly/node-mtrace
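
To give an idea of what summarizing outstanding allocs involves, here
is a minimal sketch in C.  It is not the parser from the repo (that
one is in node), and the names (live_set, live_feed, live_bytes) and
the MAX_LIVE cap are made up for illustration.  It assumes the log
format as I understand glibc writes it: "@ <where> + <addr> <size>"
for an allocation, "@ <where> - <addr>" for a free, and "<" / ">" for
the old and new block of a realloc.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_LIVE 1024  /* arbitrary cap for this sketch */

struct live_alloc { unsigned long long addr, size; };

struct live_set {
    struct live_alloc items[MAX_LIVE];
    size_t count;
};

/* Remember a block that has been allocated but not yet freed. */
static void live_add(struct live_set *s, unsigned long long addr,
                     unsigned long long size)
{
    if (s->count < MAX_LIVE) {
        s->items[s->count].addr = addr;
        s->items[s->count].size = size;
        s->count++;
    }
}

/* Forget a block once its free (or realloc "<") line is seen. */
static void live_remove(struct live_set *s, unsigned long long addr)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->items[i].addr == addr) {
            s->items[i] = s->items[--s->count];
            return;
        }
    }
}

/* Apply one log line.  Lines that do not start with "@ " (such as
 * "= Start" and "= End") fail to match and are ignored. */
void live_feed(struct live_set *s, const char *line)
{
    char where[256], op;
    unsigned long long addr = 0, size = 0;
    int n;

    assert(s != NULL && line != NULL);
    n = sscanf(line, "@ %255s %c %llx %llx", where, &op, &addr, &size);
    if (n < 3)
        return;
    if ((op == '+' || op == '>') && n == 4)
        live_add(s, addr, size);
    else if (op == '-' || op == '<')
        live_remove(s, addr);
}

/* Total size of everything still outstanding. */
unsigned long long live_bytes(const struct live_set *s)
{
    unsigned long long total = 0;
    for (size_t i = 0; i < s->count; i++)
        total += s->items[i].size;
    return total;
}
```

Feeding it every line of a log leaves only the unmatched allocations
in the set.  A real summary would also group them by the <where>
field, which this sketch reads but throws away.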

On Mar 16, 1:46 pm, Jimb Esser <[email protected]> wrote:
> The server in question is not an http server, but a back-end
> simulation server running physics simulation for an online game using
> a Bullet native library.  Yeah, I know, not exactly a typical (or
> perhaps wise...) use of node.js.  We're 95% certain the Bullet module
> is the culprit, but that is hundreds of thousands of lines of 3rd
> party C++ code, not something feasible to poke in, and, like most
> physics simulations, not particularly deterministic when combined with
> the randomness of network latency and real user actions.  Stand alone
> stress tests we've tried never exhibit the problem, and since it takes
> a fully loaded server a day to exhibit it, it's not likely to
> reproduce in a development environment.  That being said, it
> consistently does reproduce on the production servers, so that is,
> theoretically, an easy way to debug it with post-mortem debugging
> (albeit with a day-long turn-around to test fixes).  Heap dumps are a
> much more reliable way of tracking down heap issues in a large system
> than any "poke at different parts of the system at random" method, I
> was just hoping there was an easy way to get them reliably...
>
> I'll try poking around in a gdb dump, although I'm guessing the
> default heap isn't going to have any allocation site information on
> the heap entries, but it might show some useful information, at least
> it should allow me to quickly sample the heap to determine what the
> primary content type (strings, floats, ints, etc.) is that's
> leaking, which may provide some insight.
>
