[Ganglia-general] Ganglia alert
Dear All,

We have deployed Ganglia 3.7.2 and are looking for the features below. Kindly advise how to achieve them.

1. Email alerts from Ganglia when CPU, disk, or memory utilization goes above 75% on any compute node in our cluster. How do we configure this?
2. Ganglia's load averaging includes our login nodes and file system nodes, which are not part of the compute pool. How do we configure it to list the CPU utilization of the compute nodes only?
3. How do we get graphs with specific time ranges for an individual node?

--
Regards,
Anilkumar Naik
022-2278 2342
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
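Questions 1 and 2 above can be approached from outside Ganglia. What follows is only a minimal sketch, not a built-in Ganglia feature: it assumes the XML dump has already been fetched from gmond (TCP port 8649 by default), that compute nodes can be recognised by a host name prefix, and that CPU utilization is taken as 100 minus the cpu_idle metric. The host names, the prefix and the sample values are made up for illustration, and sending the actual email (for example from a cron job) is left out.

```python
import xml.etree.ElementTree as ET

# Illustrative sample in the shape of gmond's XML dump.
# Host names and values are invented for this sketch.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.7.2" SOURCE="gmond">
  <CLUSTER NAME="hpc">
    <HOST NAME="login01"><METRIC NAME="cpu_idle" VAL="10.0"/></HOST>
    <HOST NAME="compute01"><METRIC NAME="cpu_idle" VAL="20.0"/></HOST>
    <HOST NAME="compute02"><METRIC NAME="cpu_idle" VAL="90.0"/></HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def busy_hosts(xml_text, threshold=75.0, prefix="compute"):
    """Return {host: cpu_util} for hosts named prefix* above the threshold.

    The prefix filter is an assumption about the site's naming scheme; it is
    how this sketch leaves login and file system nodes out.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        name = host.get("NAME", "")
        if not name.startswith(prefix):
            continue  # not a compute node
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == "cpu_idle":
                util = 100.0 - float(metric.get("VAL"))
                if util > threshold:
                    result[name] = util
    return result


print(busy_hosts(SAMPLE_XML))
```

The same loop could check disk or memory metrics by name; which metric names are available depends on the gmond modules that are enabled.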
Re: [Ganglia-general] ganglia alert
On 13/04/2020 10:42, Valerio Bellizzomi wrote:
> My bad, correction is: PIDFile= instead of PIDfile=
>
> that's it.

We all get stuck on things like that from time to time.

Thanks for sharing your feedback about this.

Feel free to contribute your systemd unit file as a pull request.

Regards,

Daniel
Re: [Ganglia-general] ganglia alert
My bad, the correction is: PIDFile= instead of PIDfile=

That's it.

On Mon, 2020-04-13 at 10:38 +0200, Valerio Bellizzomi wrote:
> The only problem with this is that the ganglia-alert script exits
> immediately, thus systemd keeps restarting it and the PID is increasing
> continuously.
Re: [Ganglia-general] ganglia alert
The only problem with this is that the ganglia-alert script exits immediately, so systemd keeps restarting it and the PID keeps increasing.

On Mon, 2020-04-13 at 10:35 +0200, Valerio Bellizzomi wrote:
> I have adapted the ganglia-alert init script to systemd for Debian:
>
> [Unit]
> Description=Ganglia Alert Service
>
> [Service]
> Type=simple
> PIDfile=/var/run/ganglia-alert.pid
> ExecStart=/home/user/ganglia-alert -d -c /home/user/ganglia_config.txt
> Restart=always
>
> [Install]
> WantedBy=multi-user.target
> Alias=ganglia-alert.service
Re: [Ganglia-general] ganglia alert
I have adapted the ganglia-alert init script to systemd for Debian:

[Unit]
Description=Ganglia Alert Service

[Service]
Type=simple
PIDfile=/var/run/ganglia-alert.pid
ExecStart=/home/user/ganglia-alert -d -c /home/user/ganglia_config.txt
Restart=always

[Install]
WantedBy=multi-user.target
Alias=ganglia-alert.service

On Wed, 2020-03-25 at 11:58 -0400, Vladimir Vuksan wrote:
> Hi Valerio,
>
> Unfortunately, the last couple of weeks have been pretty rough for many
> people. I don't really have much input on the init script. You are
> probably best off trying to adapt the init script to systemd on your own.
>
> Sincerely,
>
> Vladimir
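The follow-up in this thread traces the restart loop to the spelling PIDfile= where systemd expects PIDFile=; directive names in unit files are case sensitive. As an illustration only, the sketch below scans the [Service] section of a unit file for keys that differ from a known name just by letter case. The helper name and the short list of known keys are assumptions made for this example and cover only the directives used in the unit above, not systemd's full set.

```python
# Small illustrative subset of [Service] directive names (an assumption for
# this sketch); systemd itself knows many more.
KNOWN_SERVICE_KEYS = {"Type", "PIDFile", "ExecStart", "Restart"}

UNIT_TEXT = """[Unit]
Description=Ganglia Alert Service

[Service]
Type=simple
PIDfile=/var/run/ganglia-alert.pid
ExecStart=/home/user/ganglia-alert -d -c /home/user/ganglia_config.txt
Restart=always
"""


def miscased_keys(unit_text, known=KNOWN_SERVICE_KEYS):
    """Return (found, expected) pairs for [Service] keys that match a known
    directive name except for letter case."""
    by_lower = {k.lower(): k for k in known}
    section = None
    problems = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Service" or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        expected = by_lower.get(key.lower())
        if expected is not None and expected != key:
            problems.append((key, expected))
    return problems


print(miscased_keys(UNIT_TEXT))
```

On a real system, `systemd-analyze verify` on the unit file is the more thorough check; the sketch only shows what kind of mistake was made here.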
Re: [Ganglia-general] ganglia alert
On 27/03/2020 09:11, Valerio Bellizzomi wrote:
> On Thu, 2020-03-26 at 18:47 +0100, Daniel Pocock wrote:
>> Instead of using ganglia_alert, I'm using ganglia-nagios-bridge
>>
>> https://github.com/ganglia/ganglia-nagios-bridge
>>
>> https://danielpocock.com/ganglia-nagios-bridge/
>
> thank you. this means however that you are running Nagios, I don't.

I believe people have used it with Icinga2 without any code changes.

As it is a Python script, it is not too hard to adapt it for any other monitoring system.

Then you can see the alert status on your preferred dashboard.

Regards,

Daniel
Re: [Ganglia-general] ganglia alert
On Thu, 2020-03-26 at 18:47 +0100, Daniel Pocock wrote:
> On 25/03/2020 17:18, Valerio Bellizzomi wrote:
> > Thanks very much.
>
> Yes, we're all here
>
> Instead of using ganglia_alert, I'm using ganglia-nagios-bridge
>
> https://github.com/ganglia/ganglia-nagios-bridge
>
> https://danielpocock.com/ganglia-nagios-bridge/
>
> I hope you all stay well during the pandemic.

Thank you. This means, however, that you are running Nagios; I don't.
Re: [Ganglia-general] ganglia alert
Hi Valerio,

Unfortunately, the last couple of weeks have been pretty rough for many people. I don't really have much input on the init script. You are probably best off trying to adapt the init script to systemd on your own.

Sincerely,

Vladimir

On 3/25/20 at 5:50 AM, Valerio Bellizzomi wrote:
> Is it normal to not get a reply for such a long time on this list?
Re: [Ganglia-general] ganglia alert
Is it normal to not get a reply for such a long time on this list?

On Fri, 2020-03-13 at 05:50 +0100, Valerio Bellizzomi wrote:
> Greetings,
> I would like to set up ganglia-alert on Debian, but the init script is
> very old and does not fit with the new systemd setup.
>
> Suggestions?
[Ganglia-general] ganglia alert
Greetings,

I would like to set up ganglia-alert on Debian, but the init script is very old and does not fit with the new systemd setup.

Suggestions?
Re: [Ganglia-general] Ganglia Alert and Tracking
Hi Alex:

How about something like:

contrib/name.tar.gz
contrib/name.README

?

If other developers are cool with this, then I can create the directory and check stuff into SVN.

Do we also want to include this in the RPMs? If so, perhaps a ganglia-contrib RPM or something else...?

Cheers,

Bernard

-----Original Message-----
From: Alex Balk [mailto:[EMAIL PROTECTED]
Sent: Monday, June 12, 2006 11:04
To: Bernard Li
Cc: Stackpole, Chris; ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia Alert and Tracking

> Bernard,
>
> There is no contrib directory in the official tarball.
> Maybe we could create a contrib directory, with a tarball for each
> addon and a README listing all provided patches/files along with a
> brief description of each one. Adding/removing addons would be rather
> simple - add/remove the tarball and update the README.
> This way, it's up to the users to decide what flavors they want added -
> we just provide the ingredients... with the regular "user contributed
> code, your mileage may vary" disclaimers and all.
>
> Cheers,
> Alex
Re: [Ganglia-general] Ganglia Alert and Tracking
Ahh yes, aggregating data in different ways after the fact. We had a need to do that, and also a need to provide more than one cluster hierarchy (e.g. clusters grouped by region, but also clusters grouped by technology owner, say).

I have written some Perl code to do this: it pulls the data out of defined clusters and manually calls rrdtool update for the different aggregate views. Doing it is a little tricky, I must say, at least for me. The other thing that is a bit disappointing is that if you extract the data for some time range and the finest-grained data does not go back that far, it will use the coarser-grained data for the full extract, even in the timeframe where there is finer data.

The other step is to get a Ganglia instance to understand enough to display this other rollup data. Your choices include faking up appropriate XML on port 8652 to convince a second gmetad instance to display the data, or hacking a second copy of the PHP tree and replacing the code that asks gmetad for data with file-based data (say).

The Perl code is attached, but only for interest. It is too horrible to be usable by others.

best regards,
richard

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Balk
Sent: 09 June 2006 20:32
To: Bernard Li
Cc: Stackpole, Chris; ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia Alert and Tracking

> Bernard Li wrote:
>> Sorry for hijacking your thread Chris but your question leads me to
>> think that there is some interesting data stored in the RRD database,
>> perhaps we could write a script to mine this data and provide some
>> interesting historical reports?
>
> Actually, my patch for custom graphs accomplishes exactly what you're
> talking about. It allows you to create a template and then load it for
> whatever view (meta, cluster, host) you desire. Couple this with
> gmetrics and you can pretty much generate a graph for anything (read -
> visually represent any aspect of your data).
> It also supports rrdtool's CDEFs, so you can do data transformations as
> well. Oh, and the rendering backend may be called from within an
> IMG SRC=... which allows creating customized dashboards. I've started
> working on one where customers can view different utilization graphs
> based on the cluster specialty (batch, interactive, infrastructure),
> NFS statistics, parallel job utilization (how much does process named X
> consume across multiple hosts), etc.
>
> What I'm really missing is a method to generate aggregate data on the
> fly. Something like "take these 3 hosts, all from different clusters,
> and show me their aggregate CPU consumption".
>
> Cheers,
> Alex

Attachment: rollup
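The aggregation asked for here ("take these 3 hosts, all from different clusters, and show me their aggregate CPU consumption") comes down to summing per-host series on matching timestamps. The following is a minimal sketch of that one step and not the attached Perl code: it assumes each host's samples have already been extracted (for instance from its RRD file) as (timestamp, value) pairs with one sample per timestamp, and the host names and numbers are invented.

```python
from collections import defaultdict


def aggregate(series_by_host):
    """Sum per-host samples by timestamp.

    Only timestamps present for every host are kept, so a host with missing
    data does not silently lower the total. Assumes each host has at most one
    sample per timestamp.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for samples in series_by_host.values():
        for ts, value in samples:
            totals[ts] += value
            counts[ts] += 1
    n_hosts = len(series_by_host)
    return [(ts, totals[ts]) for ts in sorted(totals) if counts[ts] == n_hosts]


# Hypothetical CPU utilization samples for three hosts from different clusters.
cpu = {
    "batch01": [(0, 10.0), (15, 20.0)],
    "inter07": [(0, 5.0), (15, 5.0), (30, 1.0)],
    "infra02": [(0, 1.0), (15, 2.0)],
}

print(aggregate(cpu))
```

Dropping incomplete timestamps is a design choice for the sketch; averaging instead of summing, or filling gaps, would be equally reasonable depending on the report. The result could then be written to a separate RRD with rrdtool update, as described in the message above.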
Re: [Ganglia-general] Ganglia Alert and Tracking
Bernard,

There is no contrib directory in the official tarball.
Maybe we could create a contrib directory, with a tarball for each addon and a README listing all provided patches/files along with a brief description of each one. Adding/removing addons would be rather simple - add/remove the tarball and update the README.
This way, it's up to the users to decide what flavors they want added - we just provide the ingredients... with the regular "user contributed code, your mileage may vary" disclaimers and all.

Cheers,
Alex

Bernard Li wrote:
> Hi Alex:
>
> Initially I thought that your code is standalone and could just be
> checked into ganglia/web/contrib/contrib_name. However, if it requires
> patching... I'm not sure if that's the best way to do this - do we
> normally include patches in the contrib/ directory?
>
> Regarding the gmetrics repository - I think it would be great if we can
> get that back... I don't remember exactly about the history of it but
> obviously an automated system got abused previously and that's why it
> was taken down. I guess for now I don't mind being the gate-keeper so
> people can submit them to me and I'll put them up on the website. Do
> other folks have any comments regarding this?
>
> Matt, Martin, can you give me access to the webpage?
>
> Cheers,
>
> Bernard
Re: [Ganglia-general] Ganglia Alert and Tracking
Hi Alex:

Initially I thought that your code is standalone and could just be checked into ganglia/web/contrib/contrib_name. However, if it requires patching... I'm not sure if that's the best way to do this - do we normally include patches in the contrib/ directory?

Regarding the gmetrics repository - I think it would be great if we can get that back... I don't remember exactly about the history of it but obviously an automated system got abused previously and that's why it was taken down. I guess for now I don't mind being the gate-keeper so people can submit them to me and I'll put them up on the website. Do other folks have any comments regarding this?

Matt, Martin, can you give me access to the webpage?

Cheers,

Bernard

From: Alex Balk [mailto:[EMAIL PROTECTED]
Sent: Sat 10/06/2006 01:35
To: Bernard Li
Cc: Stackpole, Chris; ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia Alert and Tracking

> Bernard,
>
> Sounds cool - maybe we could create a naming convention for web
> frontend addons? For example, all addons go under
> $WEBROOT/addons/$PATCH_NAME/. That could help minimize naming conflicts
> and let users easily determine where the relevant code is located.
> If you'd like, I could modify the patch to work this way.
>
> I'd also love to see the gmetrics repository reopen for user
> contributions. I have some GPFS and dstat related scripts others may
> find useful.
>
> Let me know how you want it and I'll open a bugzilla entry.
>
> Cheers,
> Alex
Re: [Ganglia-general] Ganglia Alert and Tracking
Bernard,

Sounds cool - maybe we could create a naming convention for web frontend addons? For example, all addons go under $WEBROOT/addons/$PATCH_NAME/. That could help minimize naming conflicts and let users easily determine where the relevant code is located.
If you'd like, I could modify the patch to work this way.

I'd also love to see the gmetrics repository reopen for user contributions. I have some GPFS and dstat related scripts others may find useful.

Let me know how you want it and I'll open a bugzilla entry.

Cheers,
Alex

Bernard Li wrote:
> Hi Alex:
>
> Actually I wonder if we should check your code into trunk as contrib
> code... what do others think?
>
> Cheers,
>
> Bernard
Re: [Ganglia-general] Ganglia Alert and Tracking
> I am trying to write a script that pulls the info from netcat and
> averages out some numbers but I believe that there is an easier way.
> Does ganglia store data in such a way that I could pull this type of
> information? This appears so useful to me that I am sure that there
> are others that have tried this, are there any ideas and suggestions?

Sorry for hijacking your thread Chris, but your question leads me to think that there is some interesting data stored in the RRD database. Perhaps we could write a script to mine this data and provide some interesting historical reports?

Anyways, just an idea...

Cheers,

Bernard
Re: [Ganglia-general] Ganglia Alert and Tracking
Sorry I haven't had time to come through with my offered scripts yet... I've got setups for exporting Ganglia XML to Nagios host/hostgroup configs, as well as doing host/service checks from the RRD data as well as directly from the XML ports. This is _very_handy_. I'm trying to get some other things out of the way and clean these for public consumption... It's actually that and the legal part (release) that's holding me up, sorry. I'll do this real soon now.

As far as pulling data out of the RRDs for reporting, I've had to whip up some things in the past for dumping to CSV for inclusion in other reports... Doing this for Cacti and Ganglia was quite the learning experience. What I mainly took out of it is a disdain for the format and tools, and new motivation to move to storing everything in an RDBMS and away from RRD entirely. :)

/eli

On 6/9/06 11:29 AM, Stackpole, Chris [EMAIL PROTECTED] wrote:
> Good day,
>
> I have two topics I would like to gather opinions on from the Ganglia
> world.
>
> The first is that I have been looking at several different aspects of
> being able to receive alerts when a node goes down. At the moment I
> have just MacGyver'd a solution, based off a script Richard gave me. It
> just sends an email when a node stops reporting. I am still working out
> issues in it and I would like something a little more detailed. Looking
> through the archive, I noticed a few discussions on nagios plug-ins but
> from what I have read I understand that it is a completely different
> ballgame. I would like to ask the ganglia group what program they use
> to send alerts and if they would share their experiences on system
> alerts.
>
> On another note. After monitoring for a while, one thing that Ganglia
> has brought to my attention is that a few of the servers were WAY too
> heavily loaded and never left the red while others really didn't seem
> to do anything and rarely left the blue. Now I am in the middle of
> offloading the work onto the least used systems and would like to
> include the data in my reports. Basically what I am after is that I
> would like to have a report at the end of the week that tells me
> ComputerA was under heavy load 90% of the time, ComputerB did jack
> squat this past week, and ComputerC maintained a 50-80% work load this
> past week. Ganglia is great to eyeball the situation and do quick
> estimates of load-balancing but I would like to view some raw data as
> well as the graphs. I am trying to write a script that pulls the info
> from netcat and averages out some numbers but I believe that there is
> an easier way. Does ganglia store data in such a way that I could pull
> this type of information? This appears so useful to me that I am sure
> that there are others that have tried this, are there any ideas and
> suggestions?
>
> Any comments are welcome.
>
> Thanks everyone!
>
> Chris Stackpole
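For the weekly "ComputerA was under heavy load 90% of the time" style of report, the history kept in the RRD files is a more natural source than the live XML. The sketch below is only an illustration under stated assumptions: it takes text in the shape that `rrdtool fetch` prints (a header row with the column names, then "timestamp: value" rows), treats nan rows as unknown, and reports the share of known samples above a threshold. The sample text, the column name and the threshold are made up; in practice the text would come from running rrdtool fetch against one host's metric file for the past week.

```python
import math

# Illustrative text in the shape printed by `rrdtool fetch`; values are invented.
FETCH_OUTPUT = """\
                            sum

1149800000: 1.0000000000e+00
1149800015: 3.5000000000e+00
1149800030: nan
1149800045: 4.0000000000e+00
1149800060: 5.0000000000e-01
"""


def parse_fetch(text):
    """Yield (timestamp, value) from the first data column, skipping unknowns."""
    for line in text.splitlines():
        stamp, sep, rest = line.partition(":")
        if not sep or not stamp.strip().isdigit():
            continue  # header or blank line
        value = float(rest.split()[0])
        if not math.isnan(value):
            yield int(stamp), value


def fraction_above(samples, threshold):
    """Share of samples whose value is strictly above the threshold."""
    values = [v for _, v in samples]
    if not values:
        return 0.0
    return sum(1 for v in values if v > threshold) / len(values)


share = fraction_above(parse_fetch(FETCH_OUTPUT), threshold=3.0)
print(f"{share:.0%} of known samples above threshold")
```

Because RRD consolidates older data into coarser averages, a report over a long window reflects averaged values rather than every original sample.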
Re: [Ganglia-general] Ganglia Alert and Tracking
Hi Eli:

> As far as pulling data out of the RRDs for reporting, I've had to whip up some things in the past for dumping to CSV for inclusion in other reports... Doing this for Cacti and Ganglia was quite the learning experience. What I mainly took out of it is a disdain for the format and tools, and new motivation to move to storing everything in an RDBMS and away from RRD entirely.

There was talk about replacing RRD with MySQL by some users previously; I wonder how that went...

Now that I think about it, it makes more sense to archive data in MySQL, and if you want historical data/graphs, then you go to MySQL (as opposed to RRD for live feeds). I suppose that's what you're doing (sort of)?

Cheers,

Bernard
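
As an illustration of the RRD-to-CSV dumping discussed above, here is a minimal sketch that parses the text printed by `rrdtool fetch <file>.rrd AVERAGE --start -1w` and works out what fraction of the known samples sat above a threshold, which is the "under heavy load 90% of the time" style of figure asked for earlier in the thread. The helper names and the threshold are assumptions for illustration; this is not the tooling Eli refers to.

```python
import csv
import io
import math


def parse_fetch(text):
    """Parse `rrdtool fetch` output into [(timestamp, [values...]), ...]."""
    rows = []
    for line in text.splitlines():
        stamp, sep, rest = line.partition(":")
        if not sep or not stamp.strip().isdigit():
            continue  # header line with data-source names, or a blank line
        rows.append((int(stamp), [float(v) for v in rest.split()]))
    return rows


def fraction_above(rows, threshold, column=0):
    """Fraction of known (non-NaN) samples above threshold, or None if none."""
    known = [vals[column] for _, vals in rows if not math.isnan(vals[column])]
    if not known:
        return None
    return sum(1 for v in known if v > threshold) / len(known)


def to_csv(rows):
    """Render the parsed rows as CSV text (timestamp first, then values)."""
    out = io.StringIO()
    writer = csv.writer(out)
    for stamp, vals in rows:
        writer.writerow([stamp] + vals)
    return out.getvalue()
```

Unknown samples (printed as nan by rrdtool) are left out of the percentage rather than counted as idle, which seems the safer default for a weekly report.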
Re: [Ganglia-general] Ganglia Alert and Tracking
On 6/9/06 11:52 AM, Bernard Li [EMAIL PROTECTED] wrote:

> Hi Eli:
>
> > As far as pulling data out of the RRDs for reporting, I've had to whip up some things in the past for dumping to CSV for inclusion in other reports... Doing this for Cacti and Ganglia was quite the learning experience. What I mainly took out of it is a disdain for the format and tools, and new motivation to move to storing everything in an RDBMS and away from RRD entirely.
>
> There was talk about replacing RRD with MySQL by some users previously; I wonder how that went...
>
> Now that I think about it, it makes more sense to archive data in MySQL, and if you want historical data/graphs, then you go to MySQL (as opposed to RRD for live feeds). I suppose that's what you're doing (sort of)?
>
> Cheers,
>
> Bernard

Yep. What with trying to integrate a bunch of open-source tools whose data storage ranges from unfriendly (RRD) to evil (Nagios), I actually think that's relatively hopeless. There are a couple of projects with the requisite backend down pat (Zabbix, MOODSS). Unfortunately, due to their relatively new nature, they're missing some of the more critical functionality that Nagios has (uber-configurable backend, web (not binary) UI), and nothing can touch Ganglia wrt efficiency and simplicity of monitoring an actual cluster of *nix hosts.

Anyhow, I'd love to see Ganglia (gmetad) with the option to pass commits to an external program in addition to the RRD handling internally... That would save the project (coders :) from having to develop an entire schema and database handling tasklist. It would also be great for (at least) me: I don't WANT a super-tool that does it all, but would rather keep using individual tools for their great parts (Cacti, Ganglia) while diverting the data they collect into a single schema, aggregating all output into one database.

I've got a (now vapor) project on my plate to do this. Currently Cacti is the easiest to handle when forking data to a database; Nagios data looks like hell, and Ganglia is not possible (at the moment). There are some patches to rrdtool that aren't mainlined that support MySQL... But it'd be far preferable to be DB-agnostic, and I don't think it would be possible to use that directly with Ganglia anyhow.

Any other thoughts, or do others want similar functionality?

/eli
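
To make the "single schema" idea a bit more concrete, here is a rough sketch of one table that samples from several collectors could be diverted into. It uses SQLite from the Python standard library purely as a stand-in for whatever RDBMS one prefers; the table layout, column names, and helper functions are all hypothetical, and nothing here hooks into gmetad, which has no such external-commit option as far as this thread says.

```python
import sqlite3

# One row per (collector, host, metric, timestamp). All names are hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS samples (
    source TEXT NOT NULL,     -- which tool collected it, e.g. 'ganglia'
    host   TEXT NOT NULL,
    metric TEXT NOT NULL,
    ts     INTEGER NOT NULL,  -- Unix timestamp of the sample
    value  REAL,
    PRIMARY KEY (source, host, metric, ts)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the sample store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store(conn, source, samples):
    """Insert (host, metric, ts, value) tuples from one collector."""
    with conn:  # commit on success, roll back on error
        conn.executemany(
            # INSERT OR REPLACE is SQLite syntax; other databases differ.
            "INSERT OR REPLACE INTO samples VALUES (?, ?, ?, ?, ?)",
            [(source, h, m, ts, v) for h, m, ts, v in samples],
        )


def average_since(conn, metric, since_ts):
    """Average of one metric per host over all samples at or after since_ts."""
    cur = conn.execute(
        "SELECT host, AVG(value) FROM samples "
        "WHERE metric = ? AND ts >= ? GROUP BY host ORDER BY host",
        (metric, since_ts),
    )
    return dict(cur.fetchall())
```

Keeping the collector name as a column is one way to let Cacti, Ganglia, and anything else share the table without agreeing on much beyond host and metric names.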