Hi Brad, that seems to be a pretty useful move. It seems it is time that I really start looking closely at 3.1.x.
Cheers
Martin
----------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message ----
> From: Brad Nicholes <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; ganglia-general@lists.sourceforge.net
> Sent: Tuesday, December 18, 2007 11:44:45 PM
> Subject: [Ganglia-developers] Moving all built-in metrics to metric modules...
>
> I just committed a rather substantial patch to Ganglia 3.1.0 trunk
> which will affect the way that gmond 3.1.x is deployed. I am posting
> this to both the developer list and the general list so that all will
> be aware of the changes and why they are important. The primary
> purpose for the patch was to remove all of the built in metrics out of
> the gmond binary and allow them to be built as loadable modules. The
> following is a more detailed list of what has changed. Hopefully from
> a user perspective, gmond will continue to work as it has in the past.
> But going forward, it will be much more flexible with regards to the
> core set of metrics.
>
> * All built-in metrics have been removed from the gmond binary
>   - A new set of core metric modules have been created that represent
>     the same set of metrics that gmond has always gathered. These new
>     core modules are mod_cpu.so, mod_disk.so, mod_load.so, mod_mem.so,
>     mod_net.so, mod_proc.so and mod_sys.so. Each of these modules is
>     basically a wrapper around the metric functions that exist in
>     libmetrics. Being wrappers, they still make the same metric
>     function calls as have always been made. And since libmetrics
>     contains all of the platform specific metric code, the metric
>     function calls made by the core modules will continue to do the
>     right thing for all of the platforms that have been previously
>     supported.
>   - There is also an extra module called core_metrics which contains
>     the heartbeat, location and gexec metrics.
>     Even though this module could be dynamically loaded in the same
>     manner as the others, it is always statically linked simply
>     because gmond would not be able to function properly without
>     these metrics so there is no real reason to allow these metrics
>     to be dynamically loaded.
>   - Some additional configuration has been added to the gmond.conf
>     file. Because the core metrics are now implemented as modules,
>     this requires a module configuration block that instructs gmond
>     to load each module. A set of module blocks has been added to the
>     default gmond.conf file.
>
> * All metric specific metadata definitions have been removed from
>   protocol.x
>   - With the refactoring of the XDR data and removal of the builtin
>     metrics, there is no longer any need for XDR to have intimate
>     knowledge of the core metrics. Therefore the metric structure
>     array and enum have been removed and are now part of the core
>     metric modules themselves.
>
> * --enable-static-build statically links the core metric modules
>   - Building gmond statically will statically link not only APR,
>     expat and libconfuse, it will also statically link all of the
>     core metric modules into the gmond binary. The result should be a
>     gmond binary that looks and feels just like the old 3.0.x
>     statically linked gmond binary. The one exception is that a
>     module statement is still required in the gmond.conf file. The
>     difference between the module configuration block for dynamically
>     loaded modules and the module blocks for statically linked
>     modules is whether or not a path to the .so is included. The
>     configure script and makefiles have been modified to detect
>     --enable-static-build and build the default gmond.conf file
>     appropriately.
>
> * --enable-static-build + --enable-python statically links the python
>   module
>   - One of the downsides of building gmond 3.1.x statically was that
>     doing so would disable all of the dynamically loadable module
>     capability. The reason for this is the need for both gmond and
>     the pluggable modules to dynamically link with libapr1. However,
>     if both --enable-static-build and --enable-python are specified
>     during configure, a gmond binary will be built with mod_python
>     statically linked. This provides gmond with the ability to
>     continue to load and run python metric modules in the same manner
>     as the non-static build. In other words, even though statically
>     linking gmond will disable pluggable C interface modules, python
>     pluggable modules will still continue to work.
>
> * All metrics carry a group designation
>   - Now that all metrics have been implemented as loadable modules,
>     the metrics have also been assigned to groups. The XML that is
>     produced by gmond and gmetad will carry a tag that defines which
>     group each metric belongs to. This will allow the web front end
>     to be enhanced to filter metrics so that they can be displayed by
>     group rather than all metric graphs appearing on the same page.
>
> These changes should make gmond much more flexible when it comes to
> extending or replacing not only the core metrics but also new metrics.
> I have attached the wish list that was compiled a couple of months ago
> which updates the items that I consider to be done. As I mentioned at
> our meet-up a few weeks ago, we need to identify which of the
> remaining items must be addressed before shipping 3.1.0 and get those
> completed. I would like to see us ship a 3.1.0 release as soon as
> possible.
>
> Brad
>
> -----Inline Attachment Follows-----
>
> Done
> ------------------
> - C module interface as DSO
> - mod_python Python module interface
> - Dynamically link libraries like expat, apr, libconfuse
> - Add TITLE attribute to the XDR data to communicate a human readable
>   name
> - Add a GROUP attribute to the XDR data
>     This would allow metrics to declare the category that they belong
>     to. The category should be added at the metric definition level
>     and not in the .conf file.
> - Reimplement the built in metrics as C interface modules
> - A cleaner XDR encoding:
>     The current encoding scheme embeds too much information about
>     which metrics gmond collects. The encoding scheme should treat all
>     metrics the same: as just "a metric". The encoding should not care
>     if the metric is metric_cpu_speed, metric_swap_total or a
>     user-defined "gmetric" one.
> - Flexible method of adding extra metric metadata.
>     We could include extra metadata, not just "alias"/"title". For
>     example, some metrics have a natural minimum and maximum value.
>     Perhaps coming up with an extendable way of encoding metric
>     metadata so future changes can be included without losing
>     backwards compatibility.
> - Re-organization of RPM packages (libganglia, gmond-python ?)
>
> GMond To Do
> ------------------------
> - Gmond module repository
> - Implement a perl module interface
> - Implement a PHP module interface
> - Implement a Ruby module interface
> - Metric packing:
>     Simply that a UDP packet can contain multiple metrics (using the
>     usual XDR stream decoding) up to the size of a UDP packet. This
>     would help reduce the overheads when sending many metric updates
>     concurrently. It also preserves the current gmond behaviour where
>     it sends metric updates in a single UDP packet.
> - Support for counters (metrics with +ve slope)
>     This shouldn't require much work (from memory, make sure the
>     slope-type information is preserved and patch gmetad to create RRD
>     files with the correct options). Currently Ganglia doesn't
>     actually support custom counter metrics, which is an awkward
>     limitation.
> - gmond switching to a non-blocking IO model.
>     If there's a large number of metric updates then gmond must
>     process them "quickly" or they will be lost. If this happens
>     whilst gmond is sending XML data to gmetad there may be a delay,
>     increasing the risk of metric update messages being lost.
>     Switching to a non-blocking IO model would allow gmond to respond
>     preferentially to the incoming UDP messages.
> -* Remove the 4T limit on ganglia metric results
> -* Modify all byte count metrics to 8-byte ints
>
> GMetad To Do
> ------------------------------
> - Support for new RRDTool which allows graphs to have dynamic sizes
> - Gilad's stacked graphs
> - Changing the units of default metrics to their base
>     (For example disk_free's base unit should be bytes, not GB as
>     rrdtool will automatically append G,M,K etc.)
> - Better support for bigger less frequent updates
>     one packet every 20 seconds per host for all data?
> - Multi PB disk limit
> - Better on disk RRD perf (tmpfs is an OK workaround)
> -* Name RRD directories based on UUID generated by client gmond
>     hash of MAC address? something else? So that renaming hosts,
>     updating DNS or hosts files don't result in history for the
>     physical gmond client being lost.
> - Integration of gexec/authd ?
> - Expand gstat nodelist parameter query options (i.e. return all hosts
>   with <10% iowait, etc.)
> - Interface stats in bits? Self awareness of interface capability for
>   % util stats for network.
> - Something like a unique per-gmond instance identifier
>     To help with multi-homing and DNS issues and so the IP address is
>     no longer the index key.
>     There was discussion of this under the subject "Overriding
>     hostname" on the Ganglia-general list.
> - Give some metrics priority and have them updated more frequently in
>   their RRDs than others.
> - Allow for some sort of in memory RRD (never written to disk) as an
>   alternative storage for very extreme cases.
> - Let the users manage different IO bound pools for their metrics
>     For extreme cases one based on tmpfs. So that they can be tied
>     correctly to the right kind of storage IO capabilities for the
>     frequency needed.
> - Add more memory metrics
>     slab, buffers, dirty, writeback, cache_clean (= cached -
>     (dirty+writeback)), mapped, free
>
> Web interface
> -------------------------------
> - Numerous custom graphs enhancements (Alex Balk, Timothy Witham,
>   others)
> - Web frontend face lift
> - Mouse over result graphs
> - Default cluster view uses text-only per host squares
>     loading 1700 little graphs chews too much browser
> - Better icons.
>     The current highly-compressed JPEG files for the icons look
>     horrible! Line-art perhaps suffers worst from JPEG compression
>     artifacts. Could we not use either PNGs or (preferably) SVG?
>
> - Add an option to allow switching to SVG in-line RRDTool graphs.
>     This should be pretty easy to add as a config option. I think
>     support for SVG in current browsers is now "good enough". A
>     half-way modern version of RRDTool can generate SVG versions of
>     the graphs, which should look much better.
>
> - Have some standard way of describing custom graphs.
>     There currently isn't a standard way of producing custom graphs;
>     "custom" here means adding support for host-specific and
>     cluster-specific graphs and also some framework for describing
>     those custom graphs. I have a solution, that (at least) has merit
>     in both existing and working. Perhaps it isn't ideal, but the
>     Ganglia web front-end should provide at least some standard hooks
>     if not an actual framework.
>
> - Have the option to switch off displaying all the single-metric
>   graphs.
>     If you have ~300 metrics, the little graphs at the page bottom are
>     all but useless. They slow down the loading of the page without
>     adding much insight. (I have a simple patch that allows a user to
>     choose whether they want to see these graphs.)
>
> - Fix the pie-chart-generating code.
>     The current pie-chart code is a bit ugly and can plot things
>     incorrectly under certain circumstances. There must be some nicer
>     graph plotting packages out there...