Re: [Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-11 Thread Witham, Timothy D
So I'd like to ask the Ganglia community -- do you guys find Ganglia to be a resource hog? No. But once I had a couple hundred gmetad processes on a 2GB server. When the size of active processes and RRD files in tmpfs exceeds physical memory, the server begins swapping and can't keep up with the

Re: [Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-11 Thread Brad Nicholes
On 4/11/2008 at 1:53 PM, in message [EMAIL PROTECTED], Witham, Timothy D [EMAIL PROTECTED] wrote: So I'd like to ask the Ganglia community -- do you guys find Ganglia to be a resource hog? No. But once I had a couple hundred gmetad processes on a 2GB server. When the size of active

Re: [Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-11 Thread Bernard Li
Hi Brad: On Fri, Apr 11, 2008 at 3:04 PM, Brad Nicholes [EMAIL PROTECTED] wrote: I agree that the size of the XML could be reduced in most cases, however it would be impractical to define the metrics in gmeta. The reason is the new pluggable metric modules in 3.1. Since

Re: [Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-11 Thread Brad Nicholes
On 4/11/2008 at 4:09 PM, in message [EMAIL PROTECTED], Bernard Li [EMAIL PROTECTED] wrote: Hi Brad: On Fri, Apr 11, 2008 at 3:04 PM, Brad Nicholes [EMAIL PROTECTED] wrote: I agree that the size of the XML could be reduced in most cases, however it would be impractical to define the

Re: [Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-09 Thread aurbain
I've got 900 hosts across a dozen clusters, gmetad tmpfs RRD set = 350 MB. CPU 1-minute load is rarely above 1. It would be fine for gmetad to set up tmpfs and manage dataset backups for me. Bernard Li wrote: I never found gmond to be resource intensive. Sure, there have been memory leaks (that have
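
The figures quoted in this thread (900 hosts, about 350 MB of RRD files in tmpfs, and a 2 GB server in an earlier message) allow a rough sizing check. The sketch below is only a back-of-the-envelope estimate: it assumes the RRD footprint scales linearly with host count, which the thread does not state, and the function names and the `process_mb` figure are hypothetical. The real footprint depends on how many metrics each host reports.

```python
# Figures quoted in the thread: 900 hosts, roughly 350 MB of RRD files in tmpfs.
REPORTED_HOSTS = 900
REPORTED_RRD_MB = 350.0


def rrd_mb_per_host(hosts=REPORTED_HOSTS, total_mb=REPORTED_RRD_MB):
    """Average RRD footprint per host, in MB."""
    return total_mb / hosts


def estimate_rrd_mb(hosts, mb_per_host=None):
    """Estimated tmpfs usage for `hosts` hosts.

    Assumption (not from the thread): usage grows linearly with host count.
    """
    if mb_per_host is None:
        mb_per_host = rrd_mb_per_host()
    return hosts * mb_per_host


def fits_in_ram(hosts, ram_mb, process_mb=0.0):
    """True if the estimated RRD set plus process memory stays below physical RAM."""
    return estimate_rrd_mb(hosts) + process_mb < ram_mb


if __name__ == "__main__":
    print(f"{rrd_mb_per_host():.2f} MB per host")
    # process_mb=512 is a made-up allowance for gmetad and other processes.
    print(fits_in_ram(1000, ram_mb=2048, process_mb=512))
```

If the estimate approaches physical memory, the swapping described earlier in the thread becomes a risk, so it is worth leaving headroom rather than sizing tmpfs to the limit.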

[Ganglia-general] Fwd: [Beowulf] Performance metrics reporting

2008-04-08 Thread Bernard Li
I never found gmond to be resource intensive. Sure, there have been memory leaks (that have since been fixed), but I doubt this is what Donald is referring to. Perhaps he is talking about gmetad. Most users know that if they are monitoring a lot of hosts (1000+), they need to do