My sincerest apologies for not making it over today; I will be showing my mug there tomorrow.
My vote on this:

" - Add some event notification mechanism if metrics go over a limit. But do we want to implement another Nagios?"

No, no, no, no and no. :)

All opinion here, but: I think to add event notification would be a major mistake, and would pull attention off of what makes ganglia awesome, which is the non-judgemental recording of system metrics. There already exist a lot of ways to get ganglia's metrics into Nagios, which has all of the bits that you'd want for a notification system.

There are so many more cool/good/appropriate things to get into ganglia than event notification.

--
john allspaw
flickr.com

----- Original Message ----
> From: Martin Knoblauch <[EMAIL PROTECTED]>
> To: Brad Nicholes <[EMAIL PROTECTED]>; Peter Mui <[EMAIL PROTECTED]>; [email protected]
> Sent: Thursday, February 28, 2008 4:01:45 PM
> Subject: Re: [Ganglia-developers] Ganglia 3.1 wish list...
>
> Hi folks,
>
> before I turn off the light, just one or two comments below.
>
> See/hear you tomorrow
> Martin
>
> > From: Brad Nicholes
> > To: Peter Mui ; [email protected]
> > Sent: Thursday, February 28, 2008 11:43:33 PM
> > Subject: [Ganglia-developers] Ganglia 3.1 wish list...
> >
> > Here is the latest Ganglia 3.1 wish list. We will be discussing this list during the Ganglia meeting.
> >
> > Brad
> >
> > -----Inline Attachment Follows-----
> >
> > Done
> > ------------------
> > - C module interface as DSO
> > - mod_python Python module interface
> > - Dynamically link libraries like expat, apr, libconfuse
> > - Add TITLE attribute to the XDR data to communicate a human readable name
> > - Add a GROUP attribute to the XDR data
> >   This would allow metrics to declare the category that they belong to. The category should be added at the metric definition level and not in the .conf file.
> > - Reimplement the built in metrics as C interface modules
> > - A cleaner XDR encoding:
> >   The current encoding scheme embeds too much information about which metrics gmond collects. The encoding scheme should treat all metrics the same: as just "a metric". The encoding should not care if the metric is metric_cpu_speed, metric_swap_total or a user-defined "gmetric" one.
> > - Flexible method of adding extra metric metadata.
> >   We could include extra metadata, not just "alias"/"title". For example, some metrics have a natural minimum and maximum value. Perhaps coming up with an extendable way of encoding metric metadata so future changes can be included without losing backwards compatibility.
> > - Re-organization of RPM packages (libganglia, gmond-python ?)
> >
> >
> > GMond To Do
> > ------------------------
> > - Gmond module repository
> > - Implement a perl module interface
> > - Implement a PHP module interface
> > - Implement a Ruby module interface
> > - Metric packing:
> >   Simply that a UDP packet can contain multiple metrics (using the usual XDR stream decoding) up to the size of a UDP packet. This would help reduce the overheads when sending many metric updates concurrently. It also preserves the current gmond behaviour where it sends metric updates in a single UDP packet.
> > - Support for counters (metrics with +ve slope)
> >   This shouldn't require much work (from memory, make sure the slope-type information is preserved and patch gmetad to create RRD files with the correct options). Currently Ganglia doesn't actually support custom counter metrics, which is an awkward limitation.
> > - gmond switching to a non-blocking IO model.
> >   If there's a large number of metric updates then gmond must process them "quickly" or they will be lost.
> >   If this happens whilst gmond is sending XML data to gmetad there may be a delay, increasing the risk of metric update messages being lost. Switching to a non-blocking IO model would allow gmond to respond preferentially to the incoming UDP messages.
> > -* Remove the 4T limit on ganglia metric results
> > -* Modify all byte count metrics to 8-byte ints
> >
> > GMetad To Do
> > ------------------------------
> > - Support for new RRDTool which allows graphs to have dynamic sizes
> > - Gilad's stacked graphs
> > - Changing the units of default metrics to their base
> >   For example, disk_free's base unit should be bytes, not GB, as rrdtool will automatically append G, M, K etc.
> > - Better support for bigger less frequent updates
> >   one packet every 20 seconds per host for all data?
> > - Multi PB disk limit
> > - Better on disk RRD perf (tmpfs is an OK workaround)
> > -* Name RRD directories based on UUID generated by client gmond
> >   hash of MAC address? something else? So that renaming hosts, updating DNS or hosts files don't result in history for the physical gmond client being lost.
> > - Integration of gexec/authd ?
>
> - Could be interesting as some kind of lightweight queueing system.
>
> > - Expand gstat nodelist parameter query options (i.e. return all hosts with <10% iowait, etc.)
> > - Add some event notification mechanism if metrics go over a limit. But do we want to implement another Nagios?
>
> > - Interface stats in bits? Self awareness of interface capability for % util stats for network.
>
> - Link utilization would be a great metric.
> - I am not sure about the bit-stats. For the stuff I do, throughput in bytes/sec makes more sense than a bit-rate.
> But I can see the comms people have a different view
> - the network stats should be per interface
>
> > - Something like a unique per-gmond instance identifier
> >   To help with multi-homing and DNS issues and so the IP address is no longer the index key. There was discussion of this under the subject "Overriding hostname" on the Ganglia-general list.
> > - Give some metrics priority and have them updated more frequently in their RRDs than others.
> > - Allow for some sort of in memory RRD (never written to disk) as an alternative storage for very extreme cases.
> > - Let the users manage different IO bound pools for their metrics
> >   For extreme cases one based on tmpfs. So that they can be tied correctly to the right kind of storage IO capabilities for the frequency needed.
> > - Add more memory metrics
> >   slab, buffers, dirty, writeback, cache_clean (= cached - dirty+writeback)), mapped, free
> >
> > Web interface
> > -------------------------------
> > - Numerous custom graphs enhancements (Alex Balk, Timothy Witham, others)
> > - Web frontend face lift
> > - Mouse over result graphs
> > - Default cluster view uses text-only per host squares
> >   loading 1700 little graphs chews too much browser
> > - Better icons.
> >   The current highly-compressed JPEG files for the icons look horrible! Line-art perhaps suffers worst from JPEG compression artifacts. Could we not use either PNGs or (preferably) SVG?
> >
> > - Add an option to allow switching to SVG in-line RRDTool graphs.
> >   This should be pretty easy to add as a config option. I think support for SVG in current browsers is now "good enough". A half-way modern version of RRDTool can generate SVG versions of the graphs, which should look much better.
> >
> > - Have some standard way of describing custom graphs.
> >   There currently isn't a standard way of producing custom graphs; "custom" here means adding support for host-specific and cluster-specific graphs and also some framework for describing those custom graphs. I have a solution, that (at least) has merit in both existing and working. Perhaps it isn't ideal, but the Ganglia web front-end should provide at least some standard hooks if not an actual framework.
> >
> > - Have the option to switch off displaying all the single-metric graphs.
> >   If you have ~300 metrics, the little graphs at the page bottom are all but useless. They slow down the loading of the page without adding much insight. (I have a simple patch that allows a user to choose whether they want to see these graphs.)
> >
> > - Fix the pie-chart-generating code.
> >   The current pie-chart code is a bit ugly and can plot things incorrectly under certain circumstances. There must be some nicer graph plotting packages out there...
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> >
> > _______________________________________________
> > Ganglia-developers mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/ganglia-developers
