[Ganglia-general] Ganglia and Nagios: warning / critical state in check_ganglia_metric.sh
Hi,

I'm trying to integrate Nagios with Ganglia. I'm stuck in one place and can't find a solution.

Based on the Ganglia book, chapter 7, "Check a Single Metric on a Specific Host", we can set a check_command like the one below:

check_ganglia_metric!load_one!more!5

And it is said that:

The operators specified in the Nagios definitions for the Ganglia plug-ins always indicate the “critical” state. If you use a notequal operator, it means that state is critical if the value is not equal.

Now I'm trying to set a 'warning' state instead of critical, and I can't find an out-of-the-box solution, even on the author's webpage http://vuksan.com/linux/nagios_scripts.html

I assume that I should write my own hooks for this. Could you tell me how you do it?

Regards,
Maciej Lasyk
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
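One possible shape for such a hook is a small wrapper that takes the metric value and compares it against two thresholds instead of one. The sketch below is only an illustration and is not part of check_ganglia_metric: the function names and the warn/crit parameters are hypothetical, and it assumes a "more than" comparison like the one in the example above. The exit codes 0, 1, 2 and 3 are the standard Nagios plugin return codes.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def nagios_state(value, warn, crit):
    """Map a metric value to a Nagios state using 'more than' thresholds.

    warn and crit are hypothetical parameters for this sketch. The critical
    threshold is checked first, so a value above both reports CRITICAL.
    """
    if value is None:
        # No value could be fetched for the metric.
        return UNKNOWN
    if value > crit:
        return CRITICAL
    if value > warn:
        return WARNING
    return OK


def format_status(metric, value, warn, crit):
    """Return (exit_code, status_line) in the usual plugin output style."""
    state = nagios_state(value, warn, crit)
    return state, f"{STATE_NAMES[state]} - {metric} = {value}"


if __name__ == "__main__":
    import sys

    # The value would normally come from gmetad/gmond; it is hard-coded here.
    code, line = format_status("load_one", 3.2, warn=2, crit=5)
    print(line)
    sys.exit(code)
```

A script like this would exit with code 1 for a load_one value between the two thresholds, which Nagios shows as WARNING.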
Re: [Ganglia-general] Empty graphs appear for remote hosts
Ok so check your gmond.conf and make sure that send_metadata_interval is higher than 0 (60 for instance). From the manual:

send_metadata_interval (integer_value in seconds): Establishes the interval at which gmond will send or resend the metadata packets that describe each enabled metric. This directive by default is set to 0, which means that gmond will send the metadata packets only at startup and upon request from other gmond nodes running remotely. If a new machine running gmond is added to a cluster, it needs to announce itself and inform all other nodes of the metrics that it currently supports. In multicast mode, this isn’t a problem, because any node can request the metadata of all other nodes in the cluster. However, in unicast mode, a resend interval must be established. The interval value is the minimum number of seconds between resends.

regards,
Maciej Lasyk
GPG key ID: 4FED49C5
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

On Mon, Nov 11, 2013 at 8:52 PM, Stas Oskin stas.os...@gmail.com wrote:
On Mon, Nov 11, 2013 at 9:32 PM, Maciej Lasyk mac...@lasyk.info wrote:
Multicast or unicast?
Unicast.
Btw - we're not continuing this on ganglia-general? ;)
Sorry, copied.
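To check this setting across many hosts, a short script can read it out of the configuration text. The following is a minimal sketch, not a full gmond.conf parser: the function names and the sample configuration are made up for illustration, and it assumes the directive is written as `send_metadata_interval = <integer>` with only `/* ... */` and `#` comments in the file.

```python
import re


def send_metadata_interval(conf_text):
    """Return the send_metadata_interval found in gmond.conf text, or None."""
    # Strip /* ... */ and # comments so a commented-out directive is ignored.
    without_block = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.S)
    without_comments = re.sub(r"#.*", "", without_block)
    match = re.search(r"\bsend_metadata_interval\s*=\s*(\d+)", without_comments)
    return int(match.group(1)) if match else None


def resends_metadata(conf_text):
    """True only if the directive is present and greater than 0."""
    value = send_metadata_interval(conf_text)
    return value is not None and value > 0


# Hypothetical sample; a real gmond.conf has many more directives.
SAMPLE = """
globals {
  daemonize = yes
  send_metadata_interval = 60 /* seconds between metadata resends */
}
"""

if __name__ == "__main__":
    print(send_metadata_interval(SAMPLE), resends_metadata(SAMPLE))
```

In unicast mode, a result of None or 0 would point at the situation described in the manual excerpt above.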
Re: [Ganglia-general] some questions about ganglia
Hi,

I believe that I'm experiencing deja-vu... ;)

Ad 1: Users can easily find any metric in gweb. Just use the aggregate graphs or views functions. If for some reason you'd like to completely switch off some graphs, you could simply cut the unneeded metrics on the nodes by removing them from /etc/ganglia/gmond.conf. You can also group metrics so users can easily see only those they want (for example in gweb/conf(_default).php); just read this whole file and set things the way it fits your needs. Also remember that you could create your own web template.

Ad 2: RRDs are highly configurable with regard to storage needs. Just reconfigure them in /etc/ganglia/gmetad.conf (section: Round-Robin Archives).

regards,
Maciej Lasyk
GPG key ID: 4FED49C5
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

On Tue, Nov 12, 2013 at 3:02 AM, 酃點℡ lqs...@foxmail.com wrote:

Hi all,

I am a user of ganglia. When I use ganglia, most of the functions work fine, but there are still some problems that confuse me:

1. As we all know, the ganglia web view shows so many metrics. Is there a way to cut some metrics that users don't care about, so that they can easily find the metric they want?
2. Ganglia stores the data in a Round Robin Database; as time passes, the file size of the RRD storage grows. So is there a way to keep just half a year of data, or only one month?

If anyone knows anything about it, please answer me. Thanks!

Best Regards
--
Allen
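When adjusting the Round-Robin Archives, it helps to work out how much history each archive actually keeps. The sketch below does that arithmetic for definitions in the rrdtool `RRA:CF:xff:steps:rows` format: each archive covers step × steps × rows seconds. The three example definitions and the 15-second base step are assumptions chosen for illustration, not the values shipped in any particular gmetad.conf.

```python
def rra_retention_seconds(rra, step_seconds):
    """Seconds of history covered by one 'RRA:CF:xff:steps:rows' definition."""
    kind, _cf, _xff, steps, rows = rra.split(":")
    if kind != "RRA":
        raise ValueError(f"not an RRA definition: {rra!r}")
    # Each row consolidates `steps` base intervals; there are `rows` rows.
    return step_seconds * int(steps) * int(rows)


def retention_days(rras, step_seconds):
    """Map each RRA definition to the number of days it covers."""
    return {rra: rra_retention_seconds(rra, step_seconds) / 86400 for rra in rras}


# Hypothetical archives, assuming a 15-second base step.
EXAMPLE_RRAS = [
    "RRA:AVERAGE:0.5:1:5760",    # full resolution for one day
    "RRA:AVERAGE:0.5:240:720",   # hourly averages for 30 days
    "RRA:AVERAGE:0.5:5760:183",  # daily averages for about half a year
]

if __name__ == "__main__":
    for rra, days in retention_days(EXAMPLE_RRAS, 15).items():
        print(f"{rra}: {days:g} days")
```

Dropping the last definition from a list like this would limit the stored history to 30 days. Note that existing RRD files keep the layout they were created with, so a changed definition only applies to newly created files.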
Re: [Ganglia-general] gmond :: extract number of cores from xml
I believe that you're looking for 'cpu_num':

<METRIC NAME="cpu_num" VAL="2" TYPE="uint16" UNITS="CPUs" TN="910" TMAX="1200" DMAX="0" SLOPE="zero">

Regards,
Maciej Lasyk
GPG key ID: 4FED49C5
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

On Sun, Nov 24, 2013 at 5:10 PM, Adrian Sevcenco adrian.sevce...@cern.ch wrote:

Hi! Can somebody give me a hint/info about how I can extract the number of cores of hosts from the gmond XML output? (Which metrics exactly must I read?)

Thanks!
Adrian
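Reading cpu_num for every host can be done with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a small hand-written sample. The sample document, host names and function name are made up for illustration; it assumes the usual layout in which METRIC elements sit under HOST elements and carry NAME and VAL attributes.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of gmond's XML output (values are made up).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.6.0" SOURCE="gmond">
  <CLUSTER NAME="example" LOCALTIME="0" OWNER="" LATLONG="" URL="">
    <HOST NAME="node1.example.org" IP="10.0.0.1" REPORTED="0">
      <METRIC NAME="cpu_num" VAL="2" TYPE="uint16" UNITS="CPUs" TN="910" TMAX="1200" DMAX="0" SLOPE="zero"/>
      <METRIC NAME="load_one" VAL="0.10" TYPE="float" UNITS=" " TN="10" TMAX="70" DMAX="0" SLOPE="both"/>
    </HOST>
    <HOST NAME="node2.example.org" IP="10.0.0.2" REPORTED="0">
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs" TN="12" TMAX="1200" DMAX="0" SLOPE="zero"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def cores_per_host(xml_text):
    """Return {host name: cpu_num} for every HOST that reports cpu_num."""
    root = ET.fromstring(xml_text)
    cores = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == "cpu_num":
                cores[host.get("NAME")] = int(metric.get("VAL"))
    return cores


if __name__ == "__main__":
    for name, count in sorted(cores_per_host(SAMPLE_XML).items()):
        print(name, count)
```

For live data, the XML text would come from the gmond TCP port (the one set in tcp_accept_channel) instead of the embedded sample.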
Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time
Are you afraid that we could see performance data of the Curiosity? :D

First of all I would really suggest you read the Monitoring with Ganglia book (2012). It answers many questions and solves major problems.

About your issue:

1. How do you set deaf and mute in gmond nodes?
2. How many listening gmonds (aggregators, hosts with deaf=no) do you have? (If using multicast, then probably by default all gmond hosts are aggregators.)
3. What is the size of the downloaded XML (telnet to the gmond aggregator on the port set in tcp_accept_channel)? Does it contain all the hosts you monitor? (Write the XML content to a file and grep for 'HOST NAME' or something like that.)
4. Do you have any ACLs set in the gmond configuration?
5. Btw - in the config section you shared you have a white-space in the port number 8 204:

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8 204
  bind = 239.2.11.71
}

Cheers,
Maciej Lasyk
GPG key ID: 4FED49C5
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

On Thu, Dec 4, 2014 at 9:20 PM, Chris Jones christopher.r.jo...@nasa.gov wrote:

Being that I work at NASA, I'd rather not put entire files out there with names of hosts and ports and the like. :) My initial post had in it part of the gmond configs. My data_source line in my gmetad.conf file (for this one port) is simply something like this:

data_source my_name gmond_hostA:8204 gmond_hostB:8204

If there's anything else specifically, just ask and I'll give it (with names changed to protect the innocent).

-chris

On 12/4/14, 3:15 PM, Maciej Lasyk wrote:

Plz share your configs via pastebin

Cheers,

On December 4, 2014 9:06:08 PM CET, Chris Jones christopher.r.jo...@nasa.gov wrote:

I'm still racking my brain with this problem I'm having. I've even run 'tcpdump -i any port 8204' on my gmetad server and watched the traffic. When I've got two gmond clients sending out multicast packets on port 8204, I can see handshaking between my server and *one* client.
The other client via the tcpdump just shows packets being sent out, with no replies. On the server GUI, I see only the one client showing up. I then stop gmond on the client that's 'working' and immediately on my other client, the tcpdump output changes to handshaking between the client and server, and the server's tcpdump also then changes to show the new client (the old one stops). Then eventually on the server GUI I stop seeing the old client updating (the icon for the host turns that block of red... 'host down') and my new client shows up like nothing ever happened. This makes no sense.

I don't believe I've oversubscribed the number of gmonds on my server (around 150 maybe?). The gmetad server is running RHEL 6.2, and my two gmond clients are running RHEL 6.5. The strange thing is, it appears that only my RHEL 6.5 clients are having this problem. Every other gmond client is either RHEL 5.x or SuSE 11.1 or 11.2.

I've googled this problem til I'm blue in the face, gone back through the last few years of the ganglia-general mailing list archives as best I could with keyword searches, consulted many of my system admin co-workers, and even tried using unicast instead of multicast (that didn't make a difference either). Nothing seems to matter. There's got to be somebody out there reading this mailing list who's got RHEL 6.5 gmond clients. Anybody? Please? :)

Thanks,
-chris

On 9/4/14, 12:46 PM, Karol Korytkowski wrote:

I'm curious as to what the correct answer would be, but... We have a similar problem (forgive me if not, I just scanned through your email), and some kind of solution was to use a different data_source (@gmetad) for each of such issues and give them the same cluster { name = } (@gmond). I think this has something to do with multicasts between switches, but so far no one has looked into this.

KK

On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones christopher.r.jo...@nasa.gov wrote:

Here's my scenario.
I've got some systems that were happily reporting in ganglia and they had to have their OSes rebuilt. They're now running RHEL 6.5. I can be on my gmetad server, and tcpdump looking for packets from host1 and host2 and only see one. Both host1 and host2 are running with the exact same gmond.conf configuration... same port. They both appear to be running correctly. But one shows more activity than the other when I run a 'netstat -an | grep 8204' (8204 is the port they run on). When I run 'telnet localhost