Hi All!

I wrote a Perl script that dynamically updates aixdisk.conf based on the
output of lspv. Most hosts here use XIV or HTC SAN arrays, and a few use
EMC PowerPath.
On hosts that have upwards of 12, 15, or 20 hdisk/hdiskpower devices, and
thus many lines in the conf, is there a way I can add a group to this conf,
such as group = DISK_VG_cachevg, so that when I browse to the host metric
page I see these subgroups under aixdisk_metrics and my Ganglia server is
not overwhelmed displaying all these disk metrics?

metric {
        name = "hdiskpower6_size"
        title = "hdiskpower6 cachevg total disk size"
        value_threshold = 1.0
        group = DISK_VG_cachevg
}
metric {
        name = "hdiskpower6_free"
        title = "hdiskpower6 cachevg free disk size"
        value_threshold = 1.0
        group = DISK_VG_cachevg
}


# wc -l conf.d/aixdisk.conf
    4811 conf.d/aixdisk.conf
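
In case it helps frame the question, here is a stripped-down sketch of the
idea (my real script differs; the lspv field layout and the conf path here
are assumptions for the example):

#!/usr/bin/perl
# Sketch: regenerate aixdisk.conf with one group per volume group.
# Assumes lspv lines like "hdiskpower6  00c8dc11f3a6  cachevg  active".
use strict;
use warnings;

# Map each disk to its volume group, skipping unassigned disks.
my %vg_of;
for my $line (qx(lspv)) {
    my ($disk, undef, $vg) = split ' ', $line;
    next if !defined($vg) || $vg eq 'None';
    $vg_of{$disk} = $vg;
}

# Emit one size/free stanza pair per disk, grouped by volume group.
open my $conf, '>', 'conf.d/aixdisk.conf' or die "open: $!";
for my $disk (sort keys %vg_of) {
    my $vg = $vg_of{$disk};
    for my $metric (qw(size free)) {
        my $desc = $metric eq 'size' ? 'total' : 'free';
        print $conf <<"EOS";
metric {
        name = "${disk}_${metric}"
        title = "$disk $vg $desc disk size"
        value_threshold = 1.0
        group = DISK_VG_$vg
}
EOS
    }
}
close $conf;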

If anyone wants to see my Perl code, let me know and I'd be glad to show you.
Thx!

-----Original Message-----
From: ganglia-general-requ...@lists.sourceforge.net 
[mailto:ganglia-general-requ...@lists.sourceforge.net] 
Sent: Wednesday, June 12, 2013 06:26 PM
To: ganglia-general@lists.sourceforge.net
Subject: Ganglia-general Digest, Vol 85, Issue 3



Today's Topics:

   1. Re: Scalability issue (Christophe HAEN)
   2. Ganglia in a tree node architecture (Jeff Ramsey)
   3. Re: Ganglia in a tree node architecture (Michael Shearer)


----------------------------------------------------------------------

Message: 1
Date: Tue, 11 Jun 2013 09:56:00 +0200
From: Christophe HAEN <christophe.h...@cern.ch>
Subject: Re: [Ganglia-general] Scalability issue
To: Sergio Ballestrero <sergio.ballestr...@cern.ch>
Cc: ganglia-general@lists.sourceforge.net
Message-ID:
        <cam8rxmxr0izduwyak+he1bhm-9htf1a-onb3rmhto1sfc77...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dear all,

Thanks for all those answers. I think the storage latency might indeed
explain why I could not manage to run several gmetads.
I tried applying Sergio's recommendation of running multiple gmonds on the
"top" machine, and that seems to be working reasonably well.
I will, however, keep sFlow in mind just in case.

Once again, thanks for the support!
Chris


2013/6/9 Sergio Ballestrero <sergio.ballestr...@cern.ch>

> Sorry - that CPU info was from the wrong server. That was our secondary,
> a slower system that does not run the Icinga integration, and it is
> struggling (iowait) under the load of 1950 clients.
> The CPU load on the main server is ~32% (28% user, 4% system, 0.3% iowait)
>
> Cheers, Sergio
>
> On 9 Jun 2013, at 10:26, Sergio Ballestrero wrote:
>
>  Hello Christophe,
> (since we've spoken before) as you know, we're also using Ganglia.
> We've recently added all HLT nodes to the monitoring, so we're now up to
> 1949 nodes monitored by a single server - and on the same server we also
> run Icinga (although with a "light" config). We have one server
> [ 16GB RAM, 2x Xeon E5620 2.4GHz (16 HT cores), 2x 250GB SATA in RAID0,
> a single 1GbE NIC, running SLC5 ] and it's reasonably capable of handling
> the load - CPU usage is at 21% (11% sys, 7% user, 3% iowait). And at the
> moment a lot of the load is from the Icinga integration - its CPU usage
> has grown far more than linearly with the number of nodes known to
> gmetad.
> We run a single gmetad with 11 collector gmonds (listen-only) and one
> "client" gmond (send-only) to monitor the host itself, all using unicast.
> We are running Ganglia 3.2.0, not the 3.0.7 available from EPEL - if you
> want, I have SLC5 and SLC6 packages ready.
> For starters, have a look at what Ganglia says about the host itself, and
> whether it's high on CPU user or iowait.
> I am of course happy to share configs - even the whole puppet module if
> you want.
>
> Ciao,
>   Sergio
>
> On 8 Jun 2013, at 09:38, Christophe HAEN wrote:
>
> Hello everybody,
>
> I am trying to deploy Ganglia at a large scale, at the limit (and
> probably higher soon...) of the 2000 nodes claimed on the website. And I
> must admit that so far, I am failing miserably :-)
>
> To summarize, my environment is composed of 60 batches of roughly 20
> nodes each.
> In each batch, one node is chosen to run the aggregating gmond; I
> deliberately do not use multicast within a cluster.
> The problem comes with the aggregation. I tried two approaches:
> - one gmetad per group of 10 clusters, plus one "super" gmetad
> aggregating those 6 gmetads
> - a single gmetad to aggregate all the clusters, which I would prefer
>
> The first approach kind of works (not perfectly, though), but it has a
> huge drawback: I would need 7 machines in total to run the 7 gmetads
> (because, if I am not mistaken, running several gmetads on one machine is
> not a foreseen use case, and thus not possible without being sneaky).
> Virtualization comes into play, but for several reasons it was not
> possible to run 7 "gmetad virtual machines".
>
> The second approach clearly does not work at all.
>
> In fact, I think the data collection runs smoothly, because I can see
> the RRD files growing. However, that data is unusable because the web
> page hangs forever!!
>
> I tried adding rrdcached in between, but it is not sufficient (though it
> helped, and was needed, when I tried to aggregate 10 clusters per gmetad).
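> To be concrete, "adding rrdcached in between" means something like this
> (the socket path and write delay here are invented for the example):
>
> # run rrdcached with a journal and delayed writes
> rrdcached -l unix:/var/run/rrdcached.sock \
>           -j /var/lib/rrdcached/journal -w 300
> # as far as I know, gmetad picks the daemon up from the environment
> RRDCACHED_ADDRESS=unix:/var/run/rrdcached.sock gmetad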
> I also tried reducing the polling frequency to once every 300 seconds
> (!!!), but had no more luck (as I said, I think the data collection is
> fine).
>
> So my question is the following: how do I scale up to the 2000 nodes?
> :-)
>
> I've read this (old but still interesting) paper:
> http://www.ittc.ku.edu/~niehaus/classes/750-s07/documents/ganglia-parallel-computing.pdf
> and I was a bit disappointed not to get more detailed information about
> the SUNY setup.
>
> By the way, the machine on which I want to run the central gmetad should
> be good enough: 8 real CPUs, 16GB of memory, and two 10Gb Ethernet
> interfaces in a bonded configuration.
>
> So would someone be able to help me please?
>
> In advance, many thanks!
>
> Chris
>
> --
>  Sergio Ballestrero - http://physics.uj.ac.za/psiwiki/Ballestrero
>  University of Johannesburg, Physics Department
>  ATLAS TDAQ sysadmin team - Office: 75282 OnCall: 164851


--
Christophe HAEN
CERN PH-LBC 2/R022
Phone: +41 (0)2 27 67 31 25
Mobile: +41 (0)7 64 87 88 57

------------------------------

Message: 2
Date: Wed, 12 Jun 2013 17:10:37 -0400
From: Jeff Ramsey <jramsey...@gmail.com>
Subject: [Ganglia-general] Ganglia in a tree node architecture
To: ganglia-general@lists.sourceforge.net
Message-ID:
        <cagyfjswmxkoq2amdzznx4ouh8sbv-gszt2muqhwezqlrqnx...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

I've got Ganglia up and running on a cluster that has somewhat of a tree
architecture; each node contains up to eight custom PCIe adapters
accessible through the PCIe bus.  Each PCIe adapter has 4 processors, each
of which can have certain sensor measurements taken, like temperature.
I'm forwarding metrics data using a host-based daemon which invokes
gmetric for each packet of metrics data received from the adapters (each
adapter runs a sensor daemon that periodically sends all its sensor data
to the host node in a packet).  Is there a way to structure the
configuration so that Ganglia treats each adapter as an independent node?
Right now I encode the adapter in the metric name sent by gmetric (e.g.
0-temperature is the temperature for the first adapter, and 7-temperature
is the temperature for the eighth), and that is how the metrics appear in
the web app.
I'd rather have the web app differentiate between the metrics for
individual adapters, and let me customize graphs so a graph can be used
for each adapter.  Has anyone used Ganglia in this way, and what's the
best way to configure it (if that's possible)?  Does any of the PHP need
to be changed to do this?  Thanks.

Jeff

------------------------------

Message: 3
Date: Thu, 13 Jun 2013 10:26:13 +1200
From: Michael Shearer <zonked.mon...@gmail.com>
Subject: Re: [Ganglia-general] Ganglia in a tree node architecture
To: Jeff Ramsey <jramsey...@gmail.com>
Cc: "ganglia-general@lists.sourceforge.net"
        <ganglia-general@lists.sourceforge.net>
Message-ID:
        <CAJzLuxR=X_4J8pDCZQF7GbgiQ0GSa_maTY=cnbbuhfmpocu...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Jeff,

From memory, you should be able to spoof the device name using gmetric -S.
Hopefully that will solve your problem.
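
Something like this (untested, and the IP and adapter names are invented):

# send a per-adapter metric as though it came from a separate host
gmetric -n temperature -v 41.5 -t float -u Celsius \
        -S "10.0.0.101:adapter0.node1"
# a spoofed heartbeat keeps the fake host showing as up in the web UI
gmetric -S "10.0.0.101:adapter0.node1" --heartbeat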

Cheers, Michael.



------------------------------



End of Ganglia-general Digest, Vol 85, Issue 3
**********************************************

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
