Oh -- forgot to mention the easiest way to create this problem: copy and paste a data_source line in gmetad.conf, change the data source name, and walk away without changing which gmonds it polls. Or you duplicate a line thinking you can change the cluster name in gmetad.conf (you can't; the cluster name is whatever gmond reports), but leave the original line in place.
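
To make the collision concrete, here is a minimal C sketch. It is not gmetad's actual code: the function names are hypothetical, the path layout is only inferred from the RRD_update log line quoted further down this thread, and the metric name is illustrative. It shows why two data sources whose gmonds report the same cluster name end up pointing at the same summary file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a gmetad function): build a summary RRD path
 * as <rrd root>/<cluster name>/__SummaryInfo__/<metric>.rrd, the layout
 * suggested by the log line in this thread. Note that the data source
 * name from gmetad.conf never appears in the path. */
int summary_rrd_path(char *buf, size_t len, const char *rrd_root,
                     const char *cluster_name, const char *metric)
{
    return snprintf(buf, len, "%s/%s/__SummaryInfo__/%s.rrd",
                    rrd_root, cluster_name, metric);
}

/* Hypothetical check: returns 1 if two data sources, identified only by
 * the cluster names their gmonds report, would update the same summary
 * file, and 0 otherwise. */
int summary_paths_collide(const char *cluster_a, const char *cluster_b,
                          const char *metric)
{
    char a[512], b[512];

    summary_rrd_path(a, sizeof a, "/var/lib/ganglia/rrds", cluster_a, metric);
    summary_rrd_path(b, sizeof b, "/var/lib/ganglia/rrds", cluster_b, metric);
    return strcmp(a, b) == 0;
}
```

Under these assumptions, summary_paths_collide("Uck", "Uck", "load_one") returns 1 no matter what the two data sources are called in gmetad.conf, which matches the symptom: two threads updating one file, with timestamps that can land out of order.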
-- ReC

On Fri, Apr 8, 2011 at 11:32 AM, Rick Cobb <rick_c...@ieee.org> wrote:
> You can also get this behavior by having two gmetad data sources where
> the gmond.conf's have the same cluster name.
>
> E.g., if you have data source A and data source B, but the hosts in
> those data sources think they're in cluster 'Uck', you'll get this
> every time you try to update summaries, because the filename
> organization is based on the cluster names, not the data sources.
> You'll also only see one cluster in the web view.
>
> In that case, gmetad really is creating bogus summary info rrdupdates,
> since it's acting like there's only one cluster (the file name and all
> the presentation logic are based on the cluster name reported by
> gmond), but there are two threads trying to summarize data (and
> they're not agreeing, either, but that's just another symptom of this
> misconfiguration).
>
> Note that it only takes two hosts having this wrong -- as long as
> they're the ones aggregating the data and being sampled by gmetad --
> to cause the problem.
>
> I've done this intentionally (because of multicast routing problems),
> but it was a complete cluster Uck :/
>
> Don't do it --
> -- ReC
>
> On Fri, Apr 8, 2011 at 11:19 AM, Jesse Becker <haw...@gmail.com> wrote:
>> I've seen this on my servers, and I know that the clocks are all
>> correct for them. That said, it's usually transient. It may also be
>> caused by server load. For example, if the host running
>> gmetad is very busy, it may not get around to running several updates
>> for a while, then try to do them "all at once" (at least, that's my
>> theory; I don't have proof, though).
>>
>> On Fri, Apr 8, 2011 at 14:12, Bernard Li <bern...@vanhpc.org> wrote:
>>> Hi Mason:
>>>
>>> Are all your servers' clocks synced via ntpd? Sounds more like there is
>>> a time mismatch between your servers to me.
>>>
>>> Cheers,
>>>
>>> Bernard
>>>
>>> On Fri, Apr 8, 2011 at 11:07 AM, Mason Hsiung <yangwing...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I'm using ganglia 3.1.7 and currently encountering an RRD_update error.
>>>> --
>>>> Apr 8 18:03:43 my_host /usr/sbin/gmetad[9553]: RRD_update
>>>> (/var/lib/ganglia/rrds/ABC/__SummaryInfo__/xxxxxx.rrd):
>>>> /var/lib/ganglia/rrds/ABC/__SummaryInfo__/xxxxxx.rrd: illegal attempt
>>>> to update using time 1302285820 when last update time is 1302285821
>>>> (minimum one second step)
>>>> --
>>>> So I googled for a while and found someone who made a small
>>>> modification to the source code to solve this error.
>>>>
>>>> Add a "break;" statement to "process_xml.c". It looks like this:
>>>> --
>>>> static void
>>>> end (void *data, const char *el)
>>>> {
>>>>    struct xml_tag *xt;
>>>>    int rc;
>>>>
>>>>    if(! (xt = in_xml_list((char*) el, strlen(el))) )
>>>>       return;
>>>>
>>>>    switch ( xt->tag )
>>>>       {
>>>>          case GRID_TAG:
>>>>             rc = endElement_GRID(data, el);
>>>>             break; <-- this one
>>>>             /* No break. */
>>>>
>>>>          case CLUSTER_TAG:
>>>>             rc = endElement_CLUSTER(data, el);
>>>>             break;
>>>>
>>>>          default:
>>>>             break;
>>>>       }
>>>>    return;
>>>> }
>>>> --
>>>>
>>>> I would like to know: is "/* No break */" a warning that tells
>>>> us *don't put a break here*, or just a comment to remind someone "hey, you
>>>> forgot to put a break here"?
>>>>
>>>> I would like to try this if someone could tell me that this is a correct
>>>> change and won't have any impact on functionality.
>>>>
>>>> Thanks a lot :)
>>>>
>>>> regards,
>>>> Mason Hsiung
>>>>
>>
>> --
>> Jesse Becker
>>
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general