On Mon, Dec 14, 2009 at 03:54:20PM +0000, Nicholas Clark wrote:
> So, at work we have some server processes which are gluttons for memory.
> We'd like to work out why, which is tricky. At least, I think that it's 
> tricky.
> 
> So, I considered what would be the simplest thing that *might* work.
> Devel::NYTProf can measure the elapsed time at a statement granularity, so
> why not use that framework to also measure memory usage?

You've read my messages in this thread, right?...
http://groups.google.com/group/develnytprof-dev/browse_thread/thread/c711c132216a3cea

With your more detailed knowledge of the internals I'd be grateful for
any details and observations you could add to that thread. (I'm pretty
sure some of my assumptions are flawed; I didn't research it in any great
depth.)

> I've written a malloc() interception library,
> This seems to work "reasonably" well

A good solution should be able to account for 'unused' memory in places
like arenas.
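
For reference, the kind of bookkeeping an interception library performs
can be sketched as below. This is only an illustration under stated
assumptions, not the patch being discussed: the names counting_malloc()
and counting_free() are hypothetical stand-ins for the intercepted
malloc() and free(), and the signature given to get_total_allocated()
(a name taken from the quoted message) is a guess. Each block carries a
small header recording its requested size so that freeing it can
subtract the same amount again.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block so the free path can learn the
   block's size. The union pads it to the strictest alignment so the
   pointer handed to the caller stays suitably aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} alloc_header;

/* Running total of bytes requested and not yet freed. */
static long long total_allocated = 0;

/* Hypothetical stand-in for an intercepted malloc(). */
void *counting_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(alloc_header))
        return NULL;                      /* header + size would overflow */
    alloc_header *h = malloc(sizeof(alloc_header) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    total_allocated += (long long)size;
    return h + 1;                         /* caller sees the memory after the header */
}

/* Hypothetical stand-in for an intercepted free(). */
void counting_free(void *p) {
    if (p == NULL)
        return;
    alloc_header *h = (alloc_header *)p - 1;
    total_allocated -= (long long)h->size;
    free(h);
}

/* Assumed signature; the quoted message only names the function. */
long long get_total_allocated(void) {
    return total_allocated;
}
```

A counter like this only sees what was requested from the allocator. It
cannot tell how much of an arena that perl has already allocated is
actually in use, which is the gap noted above.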

> Sadly OS X doesn't have LD_PRELOAD, so you have to explicitly link with the
> interception library.

On OS X you'd use DYLD_INSERT_LIBRARIES.
http://stackoverflow.com/questions/929893/how-can-i-override-malloc-calloc-free-etc-under-os-x

> I'm not sure where to go from here, or whether it's really something that
> Devel::NYTProf would want to support as an option.

It's where I want NYTProf to go for v4, so I'm delighted you're
blazing a trail in that direction.

> Right now the
> implementation (appended) is definitely a "prototype":
> 
> * whilst both memory delta and times are written to/read from the profile file
>   the report generation is unchanged - right now the memory delta is bodged
>   in, in place of the time, so the line-by-line columns in "seconds" are
>   actually bytes.
>   (and have large integers, which the formatting code doesn't expect,
>    and can be negative, which the colouring code doesn't expect)
> 
>   I'm not sure how to proceed here. 1 or 2 more columns in the HTML reports?
>   And hence modification to all the layers between the stream reader and HTML?
> 
> * callbacks don't change their tag, but have 1 extra parameter with the memory
>   size delta or undef if there isn't a delta present.
>   Likewise the block_line_num and sub_line_num are undef if they're not
>   present, to keep the memory delta at the same position.
> 
>   This is completely untested and feels like a bit of a bodge.
> 
> * hardcoded prototype for get_total_allocated();
> * no good handling of output format overflows
> 
> The appended patch *will* fail tests, because the tests expect (positive)
> seconds, not signed memory deltas. :-)

:)

> I'm still digesting whether it actually produces useful-enough results to be
> good at pointing out where memory is going. I'm confident that the
> interception library is recording exactly what *is* allocated. However, I'm
> not sure if this is really that useful a metric for solving actual problems.

I'd be interested in your post-digested thoughts on this. Any chance you
could post a link to a report? (If your app isn't open source then
perhaps profile some utility that is, like perldoc (small) or perlcritic
(big)).

The addition of your thoughts to the earlier thread would be good, and
help pin down the issues of what's possible and expand the use-cases.
(Perhaps you could, for example, list the kinds of memory allocations
that don't use arenas. Ops spring to mind but I'm sure there are others.)

Is statement level detail important for memory profiling?
Statement level seems like 'too much detail' to me.

I was thinking of adding memory profiling to the subroutine profiler
rather than the statement profiler. So we'd get inclusive and exclusive
memory usage but only as an end-of-run overview. It would also be less
expensive.
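
To make the inclusive/exclusive distinction concrete, here is a minimal
sketch of how per-sub memory could be attributed at call and return
time. It is not NYTProf code: sub_enter(), sub_leave(), the fixed-size
tables and the integer sub ids are all hypothetical, and the caller is
assumed to pass in the current total allocation (for example from
something like get_total_allocated()). Inclusive memory is the change in
the total between entry and exit; exclusive memory is that change minus
whatever was attributed to callees.

```c
#define MAX_DEPTH 128   /* arbitrary limits chosen for the sketch */
#define MAX_SUBS   32

typedef struct {
    int    sub_id;
    double total_at_entry;   /* total allocated when the sub was entered */
    double child_inclusive;  /* memory already attributed to its callees */
} call_frame;

static call_frame call_stack[MAX_DEPTH];
static int        call_depth = 0;
static double     inclusive_bytes[MAX_SUBS];
static double     exclusive_bytes[MAX_SUBS];

/* Record entry to a sub. Returns 0 on success, -1 if a limit is exceeded. */
int sub_enter(int sub_id, double total_now) {
    if (call_depth >= MAX_DEPTH || sub_id < 0 || sub_id >= MAX_SUBS)
        return -1;
    call_frame *f = &call_stack[call_depth++];
    f->sub_id          = sub_id;
    f->total_at_entry  = total_now;
    f->child_inclusive = 0.0;
    return 0;
}

/* Record return from the innermost sub. Returns -1 if no sub is active. */
int sub_leave(double total_now) {
    if (call_depth == 0)
        return -1;
    call_frame *f = &call_stack[--call_depth];
    double incl = total_now - f->total_at_entry;
    inclusive_bytes[f->sub_id] += incl;
    exclusive_bytes[f->sub_id] += incl - f->child_inclusive;
    /* Tell the caller how much of its own growth belongs to this callee. */
    if (call_depth > 0)
        call_stack[call_depth - 1].child_inclusive += incl;
    return 0;
}

double sub_inclusive(int sub_id) { return inclusive_bytes[sub_id]; }
double sub_exclusive(int sub_id) { return exclusive_bytes[sub_id]; }
```

Only two samples per call are needed, which is why this would cost less
than sampling at every statement. The sketch double-counts inclusive
memory for recursive subs; a real implementation would need to handle
that.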

I'm also thinking of optionally adding sub call info into the data
stream as the calls/returns happen. That might be relevant here also.

Overall, though, I suspect we'd get best results from walking the arenas
and optrees (at finish_profile time) and dumping detailed information
about what we find. That sounds simple but to be really useful we'd need
to be able to do things like tell which SVs are in pads of which subs in
which packages.

Tim.

p.s. Is integer byte level detail important for memory profiling?
I was thinking of using doubles instead of ints (so we'd avoid
overflow problems).
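
As a rough illustration of that trade-off, the sketch below accumulates
signed byte deltas in a double. It assumes IEEE 754 doubles, where every
integer of magnitude up to 2^53 is represented exactly, so byte-level
detail survives until totals reach roughly 8 PiB. The function names and
the constant are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* 2^53: with IEEE 754 doubles (an assumption), every integer whose
   magnitude is at most this value is exactly representable. */
#define DOUBLE_EXACT_LIMIT 9007199254740992LL

/* Add a signed byte delta to a running total held as a double. */
double add_memory_delta(double running_total, int64_t delta_bytes) {
    return running_total + (double)delta_bytes;
}

/* True while the total is still in the range where doubles hold
   every integer exactly. */
int total_in_exact_range(double running_total) {
    return running_total >= -(double)DOUBLE_EXACT_LIMIT
        && running_total <=  (double)DOUBLE_EXACT_LIMIT;
}
```

Two 3 GB deltas already exceed what a signed 32-bit integer can hold,
but they sum exactly in a double. Negative totals are handled the same
way.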

-- 
You've received this message because you are subscribed to
the Devel::NYTProf Development User group.

Group hosted at:  http://groups.google.com/group/develnytprof-dev
Project hosted at:  http://perl-devel-nytprof.googlecode.com
CPAN distribution:  http://search.cpan.org/dist/Devel-NYTProf
