provided that I haven't had the time to look at this part of the code
yet and that I agree it would be much nicer to have a gmetric-like

> I like using gmetric to monitor... so I wrote gmetric-daemon which
> is my attempt at a forking standalone daemon
> which runs Python metric modules and calls gmetric for each metric...

In a previous email you call for the "most scalable, most correct and
most reliable/highly available design", which is certainly a valuable
goal, but I don't see it met by this proposal. As far as I understand
gmetric, a gmetric-daemon would defeat caching and directives like
threshold and timeout, which are very important at least as far as
scalability goes. Furthermore, as long as there are built-in plugins
with collection groups and so on, a third-party daemon sounds like the
wrong approach to me. So however much easier it might be at first, I
believe the "most scalable, most correct and most reliable design" is
the one Brad proposes, with the caveat that figuring it all out will
take more time.
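
To make the threshold point concrete, here is a rough sketch in plain
Python of the kind of send suppression I understand the threshold
directives to provide. This is not gmond code: the ThresholdGate class,
its parameter names and the sample data are all made up for
illustration, and the real semantics may differ in the details. The
idea is only that a collected value goes out on the wire when it has
changed enough or when too much time has passed, which a daemon that
calls gmetric for every sample would bypass.

```python
class ThresholdGate:
    """Hypothetical helper: decide whether a collected value is worth sending."""

    def __init__(self, value_threshold, time_threshold):
        self.value_threshold = value_threshold  # minimum change worth reporting
        self.time_threshold = time_threshold    # max seconds between two sends
        self.last_value = None                  # value that was last sent
        self.last_sent = None                   # timestamp of the last send

    def should_send(self, value, now):
        if self.last_sent is None:
            send = True                         # nothing sent yet
        elif now - self.last_sent >= self.time_threshold:
            send = True                         # too long since the last send
        else:
            # Otherwise send only if the value moved by more than the threshold.
            send = abs(value - self.last_value) > self.value_threshold
        if send:
            self.last_value = value
            self.last_sent = now
        return send


# Made-up samples as (timestamp in seconds, value) pairs.
gate = ThresholdGate(value_threshold=1.0, time_threshold=60)
samples = [(0, 10.0), (10, 10.2), (20, 12.0), (30, 12.5), (90, 12.5)]
decisions = [gate.should_send(value, now) for now, value in samples]
print(decisions)
```

With these samples only three of the five collections would result in
a send; an unconditional gmetric call per sample would send all five.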

> I wanted a slightly different multithreaded approach to monitoring...
> but it turns out
> that Python threads really suck.

Care to share in which way Python threads really suck?

> So I made this a forking daemon.
> One process per module. Not very memory effecient. But then I don't
> expect to need many modules...

*I* don't? What if somebody else does? What if you do tomorrow, or at
another job? I don't see how you'd fix something like that at a later
stage without having to throw everything away. And how does this meet
the "most scalable" design goal?

Don't get me wrong, I'm sure everybody agrees on the problems and
appreciates the effort. I'm merely pointing out that, from my
perspective, this proposal doesn't meet the design goals and is
unlikely to get traction upstream or in the HPC community, even though
it might be just perfect for you and other people. And just in case: I
have no affiliation with ganglia and these are my own opinions; maybe
upstream folks have completely different thoughts.

Time and skills permitting, I'd be happy to help out with improving the
Python interface, especially since it's something we'd like to leverage
heavily at work.


