Oh -- I forgot to mention the easiest way to create this problem:
copy and paste a line in gmetad.conf, change the data source name,
and walk away without changing the gmonds it polls. Or you duplicate a
line thinking you can change the cluster name in gmetad.conf (you
can't), and leave the original line in place.

-- ReC

On Fri, Apr 8, 2011 at 11:32 AM, Rick Cobb <rick_c...@ieee.org> wrote:
> You can also get this behavior by having two gmetad data sources where
> the gmond.conf's have the same cluster name.
>
> E.g., if you have data source A and data source B, but the hosts in
> those data sources think they're in cluster 'Uck', you'll get this
> every time you try to update summaries, because the filename
> organization is based on the cluster names, not the data sources.
> You'll also only see one cluster in the web view.
>
> In that case, gmetad really is creating bogus summary info rrdupdates,
> since it's acting like there's only one cluster (the file name and all
> the presentation logic are based on the cluster name reported by
> gmond), but there are two threads trying to summarize data (and
> they're not agreeing, either, but that's just another symptom of this
> misconfiguration).
>
> Note that it only takes two hosts having this wrong -- as long as
> they're the ones aggregating the data and being sampled by gmetad --
> to cause the problem.
>
> I've done this intentionally (because of multicast routing problems),
> but it was a complete cluster Uck :/
>
> Don't do it --
> -- ReC
>
> On Fri, Apr 8, 2011 at 11:19 AM, Jesse Becker <haw...@gmail.com> wrote:
>> I've seen this on my servers, and I know that the clocks are all
>> correct for them.  That said, it's usually transient.  It may also be
>> caused by server load.  For example, if the host running
>> gmetad is very busy, it may not get around to running several updates
>> for a while, then try to do them "all at once."  (At least, that's my
>> theory; I don't have proof.)
>>
>> On Fri, Apr 8, 2011 at 14:12, Bernard Li <bern...@vanhpc.org> wrote:
>>> Hi Mason:
>>>
>>> Are all your servers' clocks synced via ntpd?  This sounds to me more
>>> like a time mismatch between your servers.
>>>
>>> Cheers,
>>>
>>> Bernard
>>>
>>> On Fri, Apr 8, 2011 at 11:07 AM, Mason Hsiung <yangwing...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I'm using Ganglia 3.1.7 and am currently encountering an RRD_update error.
>>>> --
>>>> Apr  8 18:03:43 my_host /usr/sbin/gmetad[9553]: RRD_update
>>>> (/var/lib/ganglia/rrds/ABC/__SummaryInfo__/xxxxxx.rrd):
>>>> /var/lib/ganglia/rrds/ABC/__SummaryInfo__/xxxxxx.rrd: illegal attempt
>>>> to update using time 1302285820 when last update time is 1302285821
>>>> (minimum one second step)
>>>> --
>>>> So I googled for a while and found that someone had made a small
>>>> modification to the source code to fix this error.
>>>>
>>>> They added a "break;" statement in "process_xml.c". It looks like this:
>>>> --
>>>> static void
>>>> end (void *data, const char *el)
>>>> {
>>>>    struct xml_tag *xt;
>>>>    int rc;
>>>>
>>>>    if(! (xt = in_xml_list((char*) el, strlen(el))) )
>>>>       return;
>>>>
>>>>    switch ( xt->tag )
>>>>       {
>>>>          case GRID_TAG:
>>>>             rc = endElement_GRID(data, el);
>>>>             break;   /* <-- this one (the added line) */
>>>>             /* No break. */
>>>>
>>>>          case CLUSTER_TAG:
>>>>             rc = endElement_CLUSTER(data, el);
>>>>             break;
>>>>
>>>>          default:
>>>>             break;
>>>>       }
>>>>    return;
>>>> }
>>>> --
>>>>
>>>> I would like to know: is "/* No break. */" a warning that tells
>>>> us *don't put a break here*, or just a comment to remind someone "hey, you
>>>> forgot to put a break here"?
>>>>
>>>> I would like to try this if someone can tell me that it is a correct
>>>> change and won't affect any functionality.
>>>>
>>>> Thanks a lot :)
>>>>
>>>> regards,
>>>> Mason Hsiung
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Xperia(TM) PLAY
>>>> It's a major breakthrough. An authentic gaming
>>>> smartphone on the nation's most reliable network.
>>>> And it wants your games.
>>>> http://p.sf.net/sfu/verizon-sfdev
>>>> _______________________________________________
>>>> Ganglia-general mailing list
>>>> Ganglia-general@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>>>>
>>>
>>
>>
>>
>> --
>> Jesse Becker
>>
>

