Hi Bernard,
 
The server has been rebooted as requested. I have already tried stopping the 
gmetad server, killing off the rrdtool processes, and then restarting the 
gmetad server. The processes start reaccumulating over a period of time. 
This server is not having any hardware issues according to its log files. The 
version of rrdtool I am running is 1.2.19 and the Ganglia build is 3.0.4.
 
Leo Albee
Systems Manager
Children's Hospital - Boston
Phone Number:  857-218-4131
Email:  [EMAIL PROTECTED]

________________________________

From: Bernard Li [mailto:[EMAIL PROTECTED]
Sent: Mon 12/10/2007 1:05 PM
To: Albee, Leo
Cc: [email protected]
Subject: Re: [Ganglia-general] Ganglia rrdtool problem?



Hi Leo:

On 12/10/07, Albee, Leo <[EMAIL PROTECTED]> wrote:

> I have Ganglia (3.0.4) set up in the following configuration:
>
> 3 clusters each with a head cluster node
>     * cluster1 = 4 nodes
>     * cluster2 = 6 nodes
>     * cluster3 = 2 nodes
> 1 ganglia server/Web
>
> The configuration works fine and everything is working correctly. The problem 
> I have is that the Ganglia server seems to be CPU-bound by rrdtool processes. 
> When I check the server each day, it seems that new rrdtool processes have 
> been added to the existing long-running processes. I've searched high and low 
> and don't have anything to go on. Please see the "ps" output from the Ganglia 
> server and take notice of the long-running rrdtool processes. It seems 
> excessive for there to be that many processes for 12 nodes. One other thing to 
> note is that when I initially started the server there were only 6 rrdtool 
> processes and they were taking practically all the CPU cycles. Any help would 
> be appreciated.
>
>    nobody 22219 21854   0   Nov 09 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194644945 --end 1194648545 --width 300
>   nobody  4954 25991   0 09:32:04 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194532305 --end 1195137105 --width 300
>   nobody 24563 23152   0 12:53:21 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195059194 --end 1195062794 --width 300
>   nobody  1092 22274   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194721069 --end 1194724669 --width 300
>   nobody 24565 24563   4 12:53:21 ?         122:18 /usr/bin/rrdtool graph - 
> --start 1195059194 --end 1195062794 --width 300 --heig
>   nobody 22071 22069   4   Nov 09 ?        1895:27 /usr/bin/rrdtool graph - 
> --start 1194644923 --end 1194648523 --width 300 --heig
>   nobody 25498  1252   0 14:43:20 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1192650175 --end 1195069344 --width 300
>   nobody  1588  1203   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194639101 --end 1194725501 --width 300
>   nobody  1667  1473   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194639130 --end 1194725530 --width 300
>   nobody 22145 21855   0   Nov 09 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194644945 --end 1194648545 --width 300
>   nobody  1169   931   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194721091 --end 1194724691 --width 300
>   nobody 25421 24735   0 14:43:09 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1192650175 --end 1195069344 --width 300
>   nobody   930   929   5   Nov 10 ?        1071:04 /usr/bin/rrdtool graph - 
> --start 1194638269 --end 1194724669 --width 300 --heig
>   nobody  1247  1246   4   Nov 10 ?        1149:24 /usr/bin/rrdtool graph - 
> --start 1194119913 --end 1194724713 --width 300 --heig
>   nobody  1589  1588   4   Nov 10 ?        1065:28 /usr/bin/rrdtool graph - 
> --start 1194639101 --end 1194725501 --width 300 --heig
>   nobody  1668  1667   4   Nov 10 ?        1061:06 /usr/bin/rrdtool graph - 
> --start 1194639130 --end 1194725530 --width 300 --heig
>   nobody 24797 24796   4 13:44:36 ?         117:50 /usr/bin/rrdtool graph - 
> --start 1195062256 --end 1195065856 --width 300 --heig
>   nobody  1170  1169   4   Nov 10 ?        1075:51 /usr/bin/rrdtool graph - 
> --start 1194721091 --end 1194724691 --width 300 --heig
>   nobody 22221 22219   5   Nov 09 ?        1894:45 /usr/bin/rrdtool graph - 
> --start 1194644945 --end 1194648545 --width 300 --heig
>   nobody  1246 22163   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194119913 --end 1194724713 --width 300
>   nobody  1858  1616   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194721979 --end 1194725579 --width 300
>   nobody 24796 23091   0 13:44:36 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195062256 --end 1195065856 --width 300
>   nobody 22146 22145   4   Nov 09 ?        1912:40 /usr/bin/rrdtool graph - 
> --start 1194644945 --end 1194648545 --width 300 --heig
>   nobody 25422 25421   4 14:43:09 ?         107:17 /usr/bin/rrdtool graph - 
> --start 1192650175 --end 1195069344 --width 300 --heig
>   nobody  1453 21857   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194120678 --end 1194725478 --width 300
>   nobody  4711  3085   0 09:30:43 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195133412 --end 1195137012 --width 300
>   nobody 22069 21877   0   Nov 09 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194644923 --end 1194648523 --width 300
>   nobody  4876  4875   4 09:31:11 ?           0:13 /usr/bin/rrdtool graph - 
> --start 1194532261 --end 1195137061 --width 300 --heig
>   nobody 25875 25874   4 15:12:23 ?         111:01 /usr/bin/rrdtool graph - 
> --start 1195067510 --end 1195071110 --width 300 --heig
>   nobody 24641  1683   0 12:53:55 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195059217 --end 1195062817 --width 300
>   nobody 24642 24641   4 12:53:55 ?         120:06 /usr/bin/rrdtool graph - 
> --start 1195059217 --end 1195062817 --width 300 --heig
>   nobody  4712  4711   4 09:30:43 ?           0:15 /usr/bin/rrdtool graph - 
> --start 1195133412 --end 1195137012 --width 300 --heig
>   nobody  1454  1453   4   Nov 10 ?        1105:56 /usr/bin/rrdtool graph - 
> --start 1194120678 --end 1194725478 --width 300 --heig
>   nobody  4788  4786   4 09:30:55 ?           0:14 /usr/bin/rrdtool graph - 
> --start 1195133440 --end 1195137040 --width 300 --heig
>   nobody 23067 23066   4   Nov 13 ?         240:42 /usr/bin/rrdtool graph - 
> --start 1195009526 --end 1195013126 --width 300 --heig
>   nobody   929 21856   0   Nov 10 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194638269 --end 1194724669 --width 300
>   nobody  4875 21858   0 09:31:11 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1194532261 --end 1195137061 --width 300
>   nobody  1093  1092   4   Nov 10 ?        1138:34 /usr/bin/rrdtool graph - 
> --start 1194721069 --end 1194724669 --width 300 --heig
>   nobody 26058 24815   0 16:17:29 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195071419 --end 1195075019 --width 300
>   nobody  4786  1130   0 09:30:55 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195133440 --end 1195137040 --width 300
>   nobody 26060 26058   4 16:17:29 ?         102:33 /usr/bin/rrdtool graph - 
> --start 1195071419 --end 1195075019 --width 300 --heig
>   nobody 23066  1870   0   Nov 13 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195009526 --end 1195013126 --width 300
>   nobody 25874 24576   0 15:12:23 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195067510 --end 1195071110 --width 300
>   nobody 23145 23144   4   Nov 13 ?         250:06 /usr/bin/rrdtool graph - 
> --start 1195009552 --end 1195013152 --width 300 --heig
>   nobody 23144 21875   0   Nov 13 ?           0:00 sh -c /usr/bin/rrdtool 
> graph - --start 1195009552 --end 1195013152 --width 300
>   nobody  4955  4954   4 09:32:04 ?           0:09 /usr/bin/rrdtool graph - 
> --start 1194532305 --end 1195137105 --width 300 --heig
>   nobody  1859  1858   4   Nov 10 ?        1117:21 /usr/bin/rrdtool graph - 
> --start 1194721979 --end 1194725579 --width 300 --heig
>   nobody 25499 25498   4 14:43:20 ?         108:40 /usr/bin/rrdtool graph - 
> --start 1192650175 --end 1195069344 --width 300 --heig
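
The --start and --end arguments in the listing above are Unix timestamps, so
the width of each requested graph can be read off directly. A minimal Python
sketch (the function name graph_window_seconds is made up for illustration)
that extracts the window from one command line:

```python
import re

# Matches the "--start <epoch> --end <epoch>" pair seen in the ps listing.
RANGE = re.compile(r"--start\s+(\d+)\s+--end\s+(\d+)")


def graph_window_seconds(cmd):
    """Return end - start in seconds, or None if the arguments are missing."""
    m = RANGE.search(cmd)
    if m is None:
        return None
    return int(m.group(2)) - int(m.group(1))


# Command lines copied from the listing (rejoined, without the "> " quoting).
hour = "/usr/bin/rrdtool graph - --start 1194644945 --end 1194648545 --width 300"
day = "/usr/bin/rrdtool graph - --start 1194639101 --end 1194725501 --width 300"
week = "/usr/bin/rrdtool graph - --start 1194119913 --end 1194724713 --width 300"

for label, cmd in (("hour", hour), ("day", day), ("week", week)):
    print(label, graph_window_seconds(cmd))
```

The windows in the listing come out as 3600, 86400 and 604800 seconds, plus
one of roughly 28 days. That looks like the hour, day, week and month views of
the web frontend, although this is an inference from the numbers and not
something confirmed in this thread.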

This is way off from normal behaviour.  I suggest you stop the gmetad
daemon, kill off all your rrdtool processes and start it up again
(wouldn't hurt to reboot the machine if possible too...).  Have you
also checked to see if you are having HD problems?
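
To see which rrdtool processes are the runaway ones before killing them, the
"ps" output can be filtered by accumulated CPU time. The following is only a
sketch: it assumes the column layout shown in the listing above (UID, PID,
PPID, C, STIME, TTY, TIME, CMD, with TIME as minutes:seconds), and the function
name and the 60-minute threshold are arbitrary choices for illustration.

```python
import re

# STIME is either a clock time ("09:32:04") or a date ("Nov 09").
PS_LINE = re.compile(
    r"^\s*(?P<uid>\S+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s+(?P<c>\d+)\s+"
    r"(?P<stime>[A-Z][a-z]{2}\s+\d{1,2}|\d{2}:\d{2}:\d{2})\s+(?P<tty>\S+)\s+"
    r"(?P<min>\d+):(?P<sec>\d{2})\s+(?P<cmd>.*)$"
)


def runaway_rrdtool(ps_lines, min_cpu_minutes=60):
    """Return (pid, cpu_minutes) for rrdtool graph processes over the threshold."""
    hits = []
    for line in ps_lines:
        m = PS_LINE.match(line)
        if m is None:
            continue  # header line or unexpected format
        cmd = m.group("cmd")
        # Skip the "sh -c" wrappers; they accumulate no CPU time themselves.
        if "rrdtool graph" not in cmd or cmd.startswith("sh -c"):
            continue
        cpu_minutes = int(m.group("min")) + int(m.group("sec")) / 60
        if cpu_minutes >= min_cpu_minutes:
            hits.append((int(m.group("pid")), cpu_minutes))
    # Worst offenders first.
    return sorted(hits, key=lambda h: h[1], reverse=True)


sample = [
    "  nobody 22071 22069   4   Nov 09 ?        1895:27 /usr/bin/rrdtool graph - --start 1194644923 --end 1194648523 --width 300",
    "  nobody  4954 25991   0 09:32:04 ?           0:00 sh -c /usr/bin/rrdtool graph - --start 1194532305 --end 1195137105 --width 300",
    "  nobody  4876  4875   4 09:31:11 ?           0:13 /usr/bin/rrdtool graph - --start 1194532261 --end 1195137061 --width 300",
    "  nobody 24565 24563   4 12:53:21 ?         122:18 /usr/bin/rrdtool graph - --start 1195059194 --end 1195062794 --width 300",
]

for pid, minutes in runaway_rrdtool(sample):
    print(pid, round(minutes))
```

The lines quoted in this mail are wrapped and prefixed with "> ", so they
would need to be rejoined before being passed to the function; output taken
straight from "ps" should not need that step.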

Which version of rrdtool are you using, and what Linux
distribution/version/arch are you running gmetad and the web frontend on?

Cheers,

Bernard




