>>> On 11/7/2009 at 12:06 AM, in message
<20091107070643.ga20...@porcupine.cita.utoronto.ca>, Robin Humble
<robin.humble+gang...@anu.edu.au> wrote:
> Hi,
>
> I spoof a bunch of temperature and power metrics via ILOM for a few
> hundred nodes and I noticed that gmetad wasn't making a summary table
> (.../__SummaryInfo__/*) for most of the spoof'd values.
>
> turns out that there's a SPOOF_HOST EXTRA_ELEMENT attached to each
> spoof'd metric, and when 100's of hosts (>40 or so should trigger it)
> have spoof'd entries, then those add up and then corrupt the summary
> Metric structure enough to destroy the .type and stop the rrd being
> generated.
> I'm guessing it's the same as the MAX_EXTRA_ELEMENTS problem, except
> for the summary table instead of the host table.
>
> attached is a simplistic patch that fixes the problem.
> it could probably be done better, but works for me. it's against 3.1.2,
> but should apply to 3.1.4 as well.
>
> apologies if I have some of the ganglia/gmetad terminology wrong - I've
> been using it for years, but this my first dive into the code.
>

I took a look at this patch, but since I can't reproduce the problem, it is unclear to me what is actually happening, and I can't see how the patch fixes a problem with the hash table.

According to the source code, whenever an extra element is parsed, it is inserted into a list of extra data kept on a per-metric basis. This means that only one SPOOF_HOST extra element is ever stored for a given metric. When the code then moves on to the summary data, it explicitly checks that it is not duplicating an extra element value before inserting it into the summary node (see the for loop at around line 827 in the 3.1.2 version of the source code). If it detects a duplicate value, it skips the insert and just updates the rest of the summary node in the hash table.

Since I can't reproduce the problem, could you step further through the original source code to confirm that the check for a duplicate value is actually happening, and that the code is not taking some other path that could be causing the problem? You might also want to look at the point in the source code where the summary table is actually written, to see if there is some clue there as to why your summary rrd files are not being created or updated.

Brad

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
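
For reference while stepping through the code, here is a minimal C sketch of the kind of guarded insert discussed in this thread: a duplicate check followed by a capacity check. It is not the gmetad source. `summary_extra_t`, `add_extra`, `MAX_EXTRA`, `MAX_LEN` and the capacity of 4 are hypothetical names and values chosen only for illustration. It also assumes that each spoofed host could report a distinct SPOOF_HOST value, which has not been confirmed here; if that were the case, a duplicate check alone would not bound the list, so the sketch checks the capacity as well.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EXTRA 4    /* hypothetical capacity, kept small for illustration */
#define MAX_LEN   64   /* hypothetical maximum length of a name or value */

/* Hypothetical summary record; this is NOT the gmetad Metric_t layout. */
typedef struct {
    int  count;                       /* number of slots in use */
    char names[MAX_EXTRA][MAX_LEN];   /* extra element names */
    char values[MAX_EXTRA][MAX_LEN];  /* extra element values */
} summary_extra_t;

/*
 * Insert a (name, value) pair into the summary record.
 * Returns  1 if the pair was inserted,
 *          0 if an identical pair was already present (insert skipped),
 *         -1 if the record is full (insert refused, nothing overwritten).
 */
int add_extra(summary_extra_t *s, const char *name, const char *value)
{
    int i;

    /* Duplicate check: skip the insert if this exact pair is stored. */
    for (i = 0; i < s->count; i++) {
        if (strcmp(s->names[i], name) == 0 &&
            strcmp(s->values[i], value) == 0)
            return 0;
    }

    /* Capacity check: distinct values can still fill the array. */
    if (s->count >= MAX_EXTRA)
        return -1;

    /* snprintf truncates over-long input instead of overrunning the slot. */
    snprintf(s->names[s->count], MAX_LEN, "%s", name);
    snprintf(s->values[s->count], MAX_LEN, "%s", value);
    s->count++;
    return 1;
}
```

With a zero-initialised record, adding the same SPOOF_HOST pair twice stores it once. Adding a fifth distinct value returns -1 and leaves the record unchanged, rather than writing past the end of the arrays into whatever field follows them.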