Hi Michael,

Thanks for the reply.

You are right, it has mostly been a disk I/O issue. On the network side, it
helped to run all the gmond daemons in deaf mode except for a couple of
nodes, which I left non-deaf and designated as the data source nodes.
I have looked at rrdcached and ran it for a while; initially it looked
promising. After adding more metrics, though, even rrdcached does not really
seem to help with the disk I/O problem.
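
For a rough sense of scale, the update load can be estimated as below. This is a back-of-the-envelope sketch, not a measurement. The host count of 500 is a hypothetical figure (the thread does not give one), 700 is the middle of the 600-800 metrics range mentioned in the original message (read here as per node), and the 300-second value stands in for whatever rrdcached write timeout (-w) is in use. It also assumes one RRD file per host metric, one update per file per poll, and one flush per file per write timeout.

```python
def updates_per_second(hosts, metrics_per_host, poll_interval_s):
    """RRD updates per second if every metric is updated once per poll."""
    return hosts * metrics_per_host / poll_interval_s


def flushes_per_second(hosts, metrics_per_host, write_timeout_s):
    """Approximate file flushes per second if each RRD file is
    written out once per rrdcached write timeout."""
    return hosts * metrics_per_host / write_timeout_s


HOSTS = 500             # hypothetical; not stated in the thread
METRICS_PER_HOST = 700  # midpoint of the 600-800 range
POLL_INTERVAL_S = 15    # gmetad poll interval from the thread
WRITE_TIMEOUT_S = 300   # assumed rrdcached -w value

direct = updates_per_second(HOSTS, METRICS_PER_HOST, POLL_INTERVAL_S)
cached = flushes_per_second(HOSTS, METRICS_PER_HOST, WRITE_TIMEOUT_S)
print(f"without caching: {direct:.0f} updates/s")
print(f"with caching:    {cached:.0f} flushes/s")
```

Under these assumptions caching cuts the write rate by the ratio of the write timeout to the poll interval, but the number of files still grows with every added metric, which would be consistent with rrdcached helping at first and less so later.
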
I could put the RRDs for different data sources on different disks (by
symlinking the RRD locations) to alleviate the disk I/O problem a little,
but rrdcached seems to work with only one disk and not with multiple
locations reached through symlinks.
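
The symlink layout described above could be set up roughly as follows. This is only a sketch under stated assumptions: it assumes one directory per data source under the RRD root, the function names `pick_disk` and `link_source` are made up for illustration, and the demo uses temporary directories and hypothetical data source names in place of real mount points. It does not address whether rrdcached will write through the links, which is the open problem here.

```python
import os
import tempfile
import zlib


def pick_disk(source, disks):
    """Map a data source name to one of the disks, the same way on every run."""
    return disks[zlib.crc32(source.encode("utf-8")) % len(disks)]


def link_source(rrd_root, source, disks):
    """Make rrd_root/<source> a symlink to <disk>/<source>; return the target."""
    target = os.path.join(pick_disk(source, disks), source)
    os.makedirs(target, exist_ok=True)
    link = os.path.join(rrd_root, source)
    if not os.path.lexists(link):  # leave an existing directory or link alone
        os.symlink(target, link)
    return target


# Demo: temporary directories stand in for the real RRD root and mount points.
with tempfile.TemporaryDirectory() as base:
    rrd_root = os.path.join(base, "rrds")
    disks = [os.path.join(base, "disk1"), os.path.join(base, "disk2")]
    for path in [rrd_root] + disks:
        os.makedirs(path)
    for source in ["cluster-a", "cluster-b"]:  # hypothetical data source names
        print(source, "->", link_source(rrd_root, source, disks))
```

Hashing the name keeps the assignment stable across runs, so a data source always lands on the same disk without a separate mapping file. Existing RRD directories would have to be moved to the target disk before linking; the sketch deliberately leaves them untouched.
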
Are there any specific rrdcached options that you would suggest to help?

Thanks,
Nikhil


On Mon, Jul 8, 2013 at 9:04 AM, Michael Shearer <[email protected]> wrote:

> Hi Nikhil,
>
> Is the machine running gmetad exhibiting high wait on I/O? I have seen
> periodic blanks in graphs on servers where the disk I/O was too high
> writing the RRDs, and so not all of them got updated, leading to missing
> data in the graphs. If this is the case, you can look at rrdcached to
> decrease the disk I/O load.
>
> If it is specifically multicast related then I cannot help; I have only
> ever used Ganglia in a unicast configuration.
>
> Cheers, Michael.
>
>
> On 7 July 2013 02:10, Nikhil <[email protected]> wrote:
>
>> Resending this after being added to the group.
>>
>> Hi,
>>>
>>> I recently enabled Ganglia using a multicast configuration for a couple
>>> of clusters. One of them is a large cluster in which a good part of the
>>> nodes have a large number of metrics, in the range of 600-800.
>>>
>>> With only the default Ganglia metrics enabled, the graphs are fine, but
>>> once the custom system metrics start flowing through the gmond multicast
>>> configuration, the graphs tend to break, in the sense that they are
>>> intermittently blank.
>>> Quite frequently, almost all of the hosts in the cluster show as down, and
>>> after a while they come up again.
>>>
>>> I am using the default poll interval of 15s in gmetad.conf for the cluster
>>> data_source, and there is more than one data source configured for the
>>> cluster, although I see that the other data sources are used only in case
>>> the first one is not reachable on the gmond port.
>>> BTW, I have also set the gmond buffer to 10M. I am not sure how to
>>> calculate the exact buffer required for gmond with respect to the total
>>> number of metrics in a cluster with lots of nodes. If there is a
>>> relation, such as a combination of the data types used by all the metrics
>>> on a node and their reporting intervals, then I will size it accordingly.
>>> Please do let me know in this regard.
>>>
>>> Earlier there was a problem with the web interface: error 400 crept in.
>>> The Apache error logs showed that it was caused by the PHP memory
>>> setting. I raised the PHP memory limit for Ganglia substantially in the
>>> Apache configuration, and that seems to be okay now. But the graphs
>>> being intermittently blank (quite frequently) and hosts showing as down in
>>> the cluster view is a little irritating.
>>>
>>> I am wondering if there are any settings that should be considered for
>>> optimal use of Ganglia in large clusters using a multicast
>>> configuration.
>>>
>>> Thanks,
>>> Nikhil
>>>
>>
>>
>>
>> _______________________________________________
>> Ganglia-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
>>
>>
>