On 27/12/10 17:25, Nigel Kersten wrote:
> On Mon, Dec 20, 2010 at 3:26 AM, Brice Figureau
> <[email protected]> wrote:
>> On 20/12/10 05:32, [email protected] wrote:
>>> Hi Brice,
>>>
>>> This really just seems like eye candy to me for those who run top and
>>> not really useful to anyone who wants to know "Why was my puppet
>>> master slow at 3am this morning?".
>>
>> I never pretended this patch would solve this issue :)
>> This is a first step toward a broader solution that I hope can cover
>> this use case.
>>
>>> How about asynchronously &
>>> semantically logging it somewhere to a file or syslog so that those
>>> who want to know can fire up Dashboard/Splunk/perl/python/ruby/awk on
>>> the logs and find the answers.
>>
>> Compilation time and various other pieces of information are already
>> logged; if you think some information is missing, please let us know
>> and we'll add more logs.
>>
>> But I think we need something more powerful and fine-grained, that
>> could capture metrics other than just the time spent doing something.
>> Ultimately I'd like to offer a framework that can capture elapsed
>> time, count, CPU time, and memory, per probe. What is not clear at
>> this stage is how to get access to those metrics, how long to keep
>> them, how to aggregate them, where probes should be placed, and so on...
>>
>> If anyone has any good ideas about these subjects (especially which
>> scenarios should be handled), that would be terrific.
> 
> Something I've seen work particularly well is to implement something
> like Apache mod_status/mod_info, where you have an internal HTTP
> server that exposes various metrics.

That's more or less what I had in mind.
The problem we have to solve, though, is not how to expose this
information (we already have a status indirection that is unused, which
we could leverage), but rather which metrics would be interesting.
I still plan to produce a patch that gives us a way to collect the
metrics and access them, but I don't yet have a clear vision of which
metrics would be worthwhile :(
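To make the probe idea concrete, here is a minimal sketch of what such a collector could look like. All names here (Probe, Instrumentation) are hypothetical and purely illustrative; this is not an existing Puppet API, just one way to capture elapsed time and call counts per probe:

```ruby
# Hypothetical per-probe metric collector (illustrative names, not a
# real Puppet API). Each probe accumulates a call count and total
# wall-clock time; CPU time and memory could be added the same way.
class Probe
  attr_reader :name, :count, :elapsed

  def initialize(name)
    @name = name
    @count = 0
    @elapsed = 0.0
  end

  # Measure one invocation of the given block.
  def measure
    start = Time.now
    result = yield
    @elapsed += Time.now - start
    @count += 1
    result
  end
end

class Instrumentation
  @probes = {}

  class << self
    # Wrap a piece of code in a named probe.
    def probe(name, &block)
      p = (@probes[name] ||= Probe.new(name))
      p.measure(&block)
    end

    # Snapshot of all probes, e.g. to expose via a status endpoint.
    def report
      @probes.values.map do |p|
        { name: p.name, count: p.count, elapsed: p.elapsed }
      end
    end
  end
end
```

Code paths of interest would then be wrapped like `Instrumentation.probe("compile") { compile_catalog }`, and `Instrumentation.report` would feed whatever exposure mechanism we settle on.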

> I'm not entirely clear how we could do this effectively with our
> common deployment pattern of multiple daemons behind
> Apache/nginx/pound, but I've found this sort of model far more useful
> than just having debug logging in the past, and there are wider
> benefits like being able to scrape this data more easily than parsing
> syslog.

That's correct. My plan regarding multiple processes was simply not to
care: I'm sure it's easy to write a script of some kind to accumulate
the metrics from the different processes.
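As a rough illustration of that accumulation, assuming each master process dumped its metrics as a hash of probe name to count/elapsed (the format here is invented for the example), a merging script could be as small as:

```ruby
# Sketch: merge per-process metric dumps into cluster-wide totals.
# Each input hash is assumed to look like
#   { "compile" => { "count" => 2, "elapsed" => 1.5 }, ... }
# -- an illustrative format, not anything Puppet emits today.
def aggregate(metric_hashes)
  totals = Hash.new { |h, k| h[k] = { "count" => 0, "elapsed" => 0.0 } }
  metric_hashes.each do |metrics|
    metrics.each do |probe, m|
      totals[probe]["count"]   += m["count"]
      totals[probe]["elapsed"] += m["elapsed"]
    end
  end
  totals
end
```

Scraping one status endpoint per backend daemon and feeding the results through something like this would sidestep the multi-process question entirely.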

> I'm not really fond of the process name hack, but I'm not sure I have
> a better suggestion either, so I'm echoing Brice's call for more input
> from the community.

As I said earlier, this is just an experimental (but fun anyway) way to
dig inside an opaque process :)

I'm busy on another puppet experimental project right now, but I'll soon
resume this instrumentation work.
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/

