One more time trying to get this thread back on the list instead of just
between David & me.

And I'll disclaim expertise on 3.1 here. On the other hand, I've been in
this code more times than I wanted to in 3.0, and I don't think the
fundamental design of gmetad was affected by 3.0 -> 3.1.

I'm not claiming the behavior is consistent, of course -- but really, the
fact that this looks like it kind-of-works is the core bug.

Gmetad thinks 1 datasource == 1 cluster.  In particular, it runs one thread
per datasource, and that thread maintains the cluster summary metrics. You
can construct things so the front-end & cluster directories pretend that 3
datasources == 1 cluster, but it doesn't work, and that's what you're
running into.  I've done it intentionally, actually, and simply ignored the
fact that summaries were wrong for that cluster, but I don't recommend it.
In particular, you're probably getting a ton of 'rrd_update' errors for
summary_info RRDfiles in your syslog.

To get consistent behavior, your options are:
* Treat this as one *grid* of 3 clusters.  Grid summaries will work, but the
meta view isn't nearly as functional as the cluster view, so it's not really
optimal for what you want.
* Get these to come in as one datasource.  Since you have separate multicast
domains, you may have to resort to unicast to do this.
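
A rough sketch of both options, in case it helps.  None of this is tested
against your setup: the host name collector-ge and the grid name are made up
for illustration, and the per-VLAN cluster names simply reuse your existing
data_source names.

Option 1 (one grid of 3 clusters): give each VLAN its own cluster name in
gmond.conf so that it matches its data_source.  For example, on the second
VLAN's nodes:

    cluster {
      name = "Datanodes2"
    }

with gmetad.conf along these lines:

    # gridname value is hypothetical
    gridname "DatanodeGrid"
    data_source "Datanodes"  r01n40-ge:8649 r03n40-ge:8649 r05n40-ge:8649
    data_source "Datanodes2" r11n40-ge:8649 r13n40-ge:8649 r15n40-ge:8649
    data_source "Datanodes3" r21n40-ge:8649 r23n40-ge:8649 r25n40-ge:8649

Option 2 (one datasource via unicast): every datanode's gmond sends to one
collector gmond instead of a multicast group:

    /* gmond.conf on each datanode; collector-ge is a hypothetical host */
    cluster {
      name = "Datanodes"
    }
    udp_send_channel {
      host = "collector-ge"
      port = 8649
    }

    /* gmond.conf on the collector: receive unicast, answer gmetad polls */
    udp_recv_channel {
      port = 8649
    }
    tcp_accept_channel {
      port = 8649
    }

and gmetad polls only the collector:

    data_source "Datanodes" collector-ge:8649

Note that simply listing all nine hosts on one data_source line should not
help: as far as I know, the hosts on a data_source line are redundant sources
for the same data, not sources to be merged.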

Sorry to be the bearer of bad news, but it would take a fairly nasty bit of
gmetad hacking to fix this -- and it would deeply affect scalability, since
the single-thread-per-datasource solution removes a lot of opportunities for
lock contention.

-- ReC

On Fri, Oct 29, 2010 at 9:52 AM, David B Ritch <[email protected]> wrote:

> Thanks, Rick.  Unfortunately, that doesn't seem to be the problem I'm
> running into.  I do have the cluster name set to Datanodes in all the
> clients.  Otherwise, I wouldn't expect it to show all of them when I click
> Show Hosts.
>
> dbr
>
> Rick Cobb wrote:
>
>> This is such a common misconception that the development team should
>> consider removing the name field from the data_source configuration line
>> entirely.
>>
>> Fundamentally, cluster names come from the gmond.conf files.  The names of
>> datasources exist only to confuse the hell out of you and create bugs.  You
>> need to change those gmond.conf's to match the cluster names you want.
>> IIRC, it's a good idea for the datasources lines to match those because
>> they actually are used in a few places and having them *not* match just
>> confuses the next guy who maintains your system.
>>
>> -- ReC
>>
>>
>> On Fri, Oct 29, 2010 at 6:15 AM, David B. Ritch <[email protected]> wrote:
>>
>>    I'm running Ganglia-3.1.7 under RHEL-5.5 on a cluster.  My nodes are
>>    divided into different classes for monitoring.   My largest class of
>>    nodes, datanodes, spans 3 VLANs, and I don't route multicast between
>>    those domains.  I have the following in gmetad.conf on my master node:
>>
>>    data_source "Datanodes"  r01n40-ge:8649 r03n40-ge:8649 r05n40-ge:8649
>>    data_source "Datanodes2" r11n40-ge:8649 r13n40-ge:8649 r15n40-ge:8649
>>    data_source "Datanodes3" r21n40-ge:8649 r23n40-ge:8649 r25n40-ge:8649
>>
>>    Each datanode has "Datanodes" specified as its cluster name.
>>
>>    When I look at the web interface, at the grid level, the summary of my
>>    Datanodes only shows 1/3 of my datanodes.  When I select the Datanodes
>>    cluster (Grid > Datanodes), and select Show Hosts: no, I see the same
>>    graph and the same number of nodes.  However, when I select Show Hosts:
>>    yes, the Hosts up: and CPUs Total both jump up to the proper totals.
>>
>>    Apparently, gmetad sees all the nodes and puts them in the right
>>    cluster, but doesn't calculate the summaries properly.
>>
>>    Am I doing something wrong, or is this a problem in Ganglia?
>>
>>    Thanks!
>>
>>    David
>>
>>
>>  
>> ------------------------------------------------------------------------------
>>    Nokia and AT&T present the 2010 Calling All Innovators-North
>>    America contest
>>    Create new apps & games for the Nokia N8 for consumers in  U.S.
>>    and Canada
>>    $10 million total in prizes - $4M cash, 500 devices, nearly $6M in
>>    marketing
>>    Develop with Nokia Qt SDK, Web Runtime, or Java and Publish to Ovi
>>    Store
>>    http://p.sf.net/sfu/nokia-dev2dev
>>    _______________________________________________
>>    Ganglia-general mailing list
>>    [email protected]
>>    <mailto:[email protected]>
>>
>>    https://lists.sourceforge.net/lists/listinfo/ganglia-general
>>
>>
>>
>