[Ganglia-general] gmetad error in writing the correct rrds and various web errors
Hi!

I have several problems with Ganglia. Chapter one: with Ganglia 3.1.7, starting at some point in time (after trying out some extensions), I no longer get the information from the frontend into the RRDs. In gmond it seems that I do have the info:

    root@alien: ISS-ALICE # telnet localhost 8649 | grep alien.local
    HOST NAME=alien.local IP=172.18.0.1 REPORTED=1350498093 TN=24 TMAX=20 DMAX=0 LOCATION=0,0,0 GMOND_STARTED=0

but for a node:

    root@alien: ISS-ALICE # telnet localhost 8651 | grep alien-0-10
    HOST NAME=alien-0-10.local IP=172.18.3.244 REPORTED=1350498224 TN=18 TMAX=20 DMAX=0 LOCATION=0,10,0 GMOND_STARTED=1350476484

I am really at a loss as to why this is not working as it is supposed to. I even enabled full PHP error debugging in the hope that it is a web error. The page is http://alien.spacescience.ro/ganglia/

As an alternative I tried the latest ganglia-web, with a much worse result: http://alien.spacescience.ro/ganglia2/

Can someone advise me how I can debug this further and where I should look? (Both for the gmetad error and for the fact that in Ganglia 3.4.0 (/ganglia2) I have no graphs.)

Thanks a lot!
Adrian

Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
Re: [Ganglia-general] no graphs displayed but png creation ok
On 10/19/2012 06:36 PM, Vladimir Vuksan wrote:
> I am not sure how to make sure this doesn't happen again but please check your conf.php and make sure there are no trailing new lines, i.e. the last line should be ?>
> Graphs are getting generated, however there is an extra newline that is inserted at the top that breaks the PNG :-(.

Thanks a lot!!! (and thanks for all your work :) )
Adrian
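The trailing-newline problem described above can be checked for mechanically. The following is only a minimal sketch, not part of ganglia-web: the function name `leaked_output` is made up for illustration, and it assumes the last `?>` in the file really is the closing tag (it does not parse PHP, so a `?>` inside a string or comment would confuse it). PHP itself swallows one newline directly after the closing tag, so only what comes after that is reported.

```python
def leaked_output(php_source: str) -> str:
    """Return the text a PHP file would emit after its final '?>' tag.

    Hypothetical helper for illustration only; it does a plain string
    search and does not parse PHP.
    """
    idx = php_source.rfind("?>")
    if idx == -1:
        # No closing tag at all: nothing can leak after it.
        return ""
    tail = php_source[idx + 2:]
    # PHP swallows a single newline that directly follows the closing tag.
    if tail.startswith("\r\n"):
        tail = tail[2:]
    elif tail.startswith("\n"):
        tail = tail[1:]
    return tail


if __name__ == "__main__":
    clean = "<?php $conf['x'] = 1; ?>\n"
    broken = "<?php $conf['x'] = 1; ?>\n\n"
    print(repr(leaked_output(clean)))
    print(repr(leaked_output(broken)))
```

Any non-empty result means the file would send bytes to the browser ahead of the image data. Leaving out the closing `?>` in a pure-PHP config file avoids the issue altogether.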
[Ganglia-general] help :: making grids of gmetads
Hi!

I am a little bit lost on the subject of making grids of gmetads. How can I do something like this:

    gmetad1   gmetad2     (each takes data from its own gmonds)
         \     /
         gmetad3 - gmond of this machine, which collects other, different data

Also, is the hierarchy of grids and clusters made at the gmetad or at the gmond level? Did I understand correctly that gmetad just defines data sources (gmonds and gmetads) but the exact hierarchy is done at the gmond level? If yes, where does the gridname come into play?

Also, what can I do to take all the data from the other gmetads, not only the summary data?

Thanks!
Adrian
Re: [Ganglia-general] help :: making grids of gmetads (and question about monitoring topology)
On 10/22/2012 02:18 PM, Adrian Sevcenco wrote:
> Hi! I am a little bit lost on the subject of making grids of gmetads ..
> [...]

Hi!

I have other questions about gmond and gmetad: is it possible for a gmond to be a data source for multiple gmetads? I would want something like:

    gmond_wn_1 ... gmond_wn_n
          \           /
           \         /
            \       /
    gmond_frontend_1   gmetad_frontend
            \               \
       gmond_central     gmetad_central

Is this possible?

Thank you!
Adrian
Re: [Ganglia-general] Ganglia network monitoring information problems
On 12/05/2012 05:10 PM, Hong Wayne wrote:
> What's the difference between byte_in.rrd and pkts_in.rrd?

One is the number of bytes that were received and the other is the number of packets:
http://en.wikipedia.org/wiki/Network_packet

HTH,
Adrian
[Ganglia-general] gmond :: clarification needed for deaf and mute options
Hi!

It is not clear to me what data circulates between gmonds, and how. So far I understand this (for simplicity, let's consider unicast):

1. UDP channels are used for the actual data traffic (push architecture)
   1.1 gmond nodes send data using the parameters from udp_send_channel
   1.2 the aggregating node receives data using the parameters from udp_recv_channel
2. the TCP channel is used by gmetad to query the respective aggregated cluster (the data cluster, as the actual cluster grouping is given by the cluster directive and gmetad aggregates the data accordingly)

I would need some confirmation of my understanding, as this information is not explicitly spelled out anywhere.

And now the subject of the email: do the deaf and mute options refer ONLY to the UDP channels?

    deaf : do not receive data from other gmonds
    mute : do not send data to other gmonds?

The thing is that if I set deaf = yes there is no longer any listening port, UDP AND TCP! So the gmetad query cannot take place! If I set mute = yes, deaf = no, the gmond will answer to nc localhost 8649 (TCP) but will no longer send data to gmetad (host down in the web page).

So, what exactly are deaf and mute? And is it possible to update gmond.conf with more descriptive options (like send_data, receive_data, answer_query)?

Thank you!
Adrian
Re: [Ganglia-general] gmond :: clarification needed for deaf and mute options
On 07/25/2013 12:41 PM, Nikhil wrote:
> Hi Adrian,

Hi!

> concerns in the architecture. Ganglia works well with the multicast configuration as is by default. If the number of nodes in the cluster

The thing is that the clusters (the worker nodes) are separated physically or logically (VLANs or separate switches) from the rest of the network (access to the outside or to storage elements is done via the frontend, or by giving the storage servers an interface in the broadcast domain of the worker nodes). So multicast is useful only inside the cluster, not outside.

> If you are going to follow unicast model for ganglia configuration, it holds good too. In unicast model, you do not really need any tinkering with these options in gmond configuration, as all the metrics are just outbound to the channel configured in udp_send_channel block. The designated gmond node can again further be used as a data source by gmetad, since it will have all the metrics of the cluster nodes send to it (them, dependingly on number of the udp send channels configured) directly.

OK, so I can build some kind of tree of data flow up to a single central gmond, from which gmetad will take the data and arrange it per the cluster definition of each gmond data flow?

> There is a small gotcha though -- with the use of multicast topology in ganglia large base installations and use of deaf mode (deaf = yes) configuration on all nodes combined with a few set of nodes running in 'deaf = no' mode, the gmond process need not be restarted every time you add or remove any designated nodes. This can be a good operational win

So does that mean that in unicast mode I have to restart the gmond aggregator at each addition of a reporting node???

> With the unicast model, however, with the addition or removal of the aggregator nodes, gmond process across the cluster have to be restarted to read the new configuration since all the traffic is outbound via unicast ipaddress.

So given a chain of 3 aggregators, each with its own reporting nodes:

    1 - 2 - 3 - central

does any change in the nodes reporting to an aggregator imply a gmond restart for the corresponding gmond and ALL the upper gmonds??? (Let's say a new node reports to aggregator 2; does this imply restarting 2, 3 and central??)

Thanks a lot!
Adrian

> Thanks,
> Nikhil

--
Adrian Sevcenco, Ph.D.
Institute of Space Science - ISS, Romania
adrian.sevcenco at {cern.ch,spacescience.ro}
Re: [Ganglia-general] gmond :: clarification needed for deaf and mute options
On 07/25/2013 07:37 PM, Nikhil wrote:
> gmond aggregator is the node which has all the metrics (all the gmond instances send the metrics to this gmond node), it need not be restarted every time a reporting node (clients) are added/removed. The problem happens when the aggregator node goes down; clients will not be able to send the metrics then, so the choice would be to configure multiple aggregators (lets say 2) and have them listed in udp_send_channels of gmond configuration of every node. This means all gmond clients will be sending the metrics simultaneously to both the aggregators.
>
> Looking around got me this link for a better explanation with pictures.
> http://books.google.co.in/books?id=X9haRQi85hkClpg=PA21ots=n531RMIaSrdq=ganglia%20unicast%20topologypg=PA22#v=onepageq=ganglia%20unicast%20topologyf=false

Thanks for the info! I just ordered this book locally :)

> These aggregators can then be used as source for metric data by gmetad.conf. Since both nodeA and nodeB both have the same view of the entire cluster metric, they would serve each other as a backup copy.
>
> for example:
>     data_source clusterC nodeAipaddress:gmondport nodeBipaddress:gmondport
>
> gmetad always tries to connect only one node at a time and polls for the metric data on tcp. If nodeAipaddress goes down or is not reachable, gmetad will try the next node in the series that is nodeB, for redundancy.

OK, great!

> You may want to read this thread also to get an idea for you to consider the unicast topology you want to consider
> http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg07265.html

Interesting mail (too bad that it didn't get an answer).

At this point I am thinking of having a layer of aggregators at the edge of the clusters. I would make the cluster gmonds (usually the gmonds on the Torque and Slurm servers) join a multicast address different from the one used inside the clusters, together with 1 or 2 central gmonds (as data sources for gmetad), to create a separate layer of traffic.

I have only 100+ servers and in the near future I will not go over 200. Could the multicast chatter be a problem for the network? gmond estimates a bandwidth of 60 bytes/s per node, and I have 3 clusters and a few separate nodes, so I would estimate the chatter in the upper gmond layer to be at the level of tens of kB/s. Is this workable and efficient?

Thanks a lot for your help!
Adrian
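As a rough check of the bandwidth estimate in the message above, the arithmetic can be written out. This is only a back-of-the-envelope sketch: the 60 bytes/s per node figure is the one quoted in the message, the function name is made up for illustration, and the model assumes the upper layer carries the metrics of every node, which is the worst case.

```python
def chatter_bytes_per_second(nodes: int, per_node_bytes_per_s: float = 60.0) -> float:
    """Aggregate metric traffic if every node contributes the same rate.

    Illustrative helper; the default rate is the figure quoted in the thread.
    """
    return nodes * per_node_bytes_per_s


if __name__ == "__main__":
    for nodes in (100, 200):
        total = chatter_bytes_per_second(nodes)
        # Convert bytes/s to kbit/s to compare with link speeds.
        print(f"{nodes} nodes: {total:.0f} B/s = {total * 8 / 1000:.0f} kbit/s")
```

Under these assumptions 200 nodes come to 12,000 bytes/s, or 96 kbit/s, which agrees with the "tens of k" estimate and is small next to the capacity of a 100 Mbit/s or faster link.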
[Ganglia-general] gmetad wish list (name of rrd directories + storage backend)
Hi!

I was looking over the Ganglia wishlist (which I think should be updated, as some items are already solved) and the named RRD directory item caught my attention, as this was a problem of mine. I was wondering why, for the servers (where gmond can run), dmidecode -s system-uuid is not used. It is true that the dmidecode package would become a requirement (and gmetric should take a UUID as its first argument).

The second item is from my personal wishlist: is it possible to have another storage backend besides RRDs? I use another monitoring tool called MonaLisa (written in Java) which stores the data in Postgres tables, and it does so very efficiently. As far as I have seen, the data is stored by creating a time series for each metric of each host in each cluster_id. Wouldn't it be possible to create Postgres tables with time_stamp,metric pairs for each host (UUID), and a table with host_uuid,cluster_id pairs?

Another thing I was wondering: why does gmetad do data processing (like summaries)? Naively (I don't have much experience), I would think it useful to have a clear separation: gmond is a message-passing instrument, gmetad is a storage-only instrument, and gweb should process and present/publish the data.

Any thoughts on this?

Thanks a lot for your patience!
Adrian
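The two-table layout proposed above can be sketched to make it concrete. This is a hypothetical illustration, not an existing Ganglia backend: the table and column names are invented, and SQLite (from the Python standard library) stands in for Postgres so that the example is self-contained. It also shows a summary computed at query time instead of inside the storage daemon.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a host table and a per-host time-series table (names are illustrative)."""
    conn.executescript(
        """
        CREATE TABLE host (
            host_uuid  TEXT PRIMARY KEY,
            cluster_id TEXT NOT NULL
        );
        CREATE TABLE sample (
            host_uuid  TEXT    NOT NULL REFERENCES host(host_uuid),
            metric     TEXT    NOT NULL,
            time_stamp INTEGER NOT NULL,
            value      REAL    NOT NULL,
            PRIMARY KEY (host_uuid, metric, time_stamp)
        );
        """
    )


def cluster_average(conn: sqlite3.Connection, cluster_id: str, metric: str):
    """Compute a cluster-wide average at query time from the raw samples."""
    row = conn.execute(
        "SELECT AVG(s.value) FROM sample AS s "
        "JOIN host AS h ON h.host_uuid = s.host_uuid "
        "WHERE h.cluster_id = ? AND s.metric = ?",
        (cluster_id, metric),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany("INSERT INTO host VALUES (?, ?)",
                 [("uuid-a", "ISS"), ("uuid-b", "ISS")])
conn.executemany("INSERT INTO sample VALUES (?, ?, ?, ?)",
                 [("uuid-a", "load_one", 1000, 1.0),
                  ("uuid-b", "load_one", 1000, 3.0)])
print(cluster_average(conn, "ISS", "load_one"))
```

Because cluster membership lives in its own table, moving a host to another cluster would be a single-row update and would not touch the stored samples.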
[Ganglia-general] multiple source metad :: empty graphs
Hi!

I installed and tried a simple configuration and it seems that I missed something. I have an aggregator named monitor; the gmond on monitor has the cluster ID "ISS Misc", and another gmond with the same ID sends data to it.

On the monitor gmetad I have two sources, the local gmond and an older Rocks installation with Ganglia in place:

    data_source ISS MONITOR localhost:8649
    data_source ALICE-ISS 172.20.0.17:8651

The problem is that I have empty graphs for the data from the local gmond (local host + external server): http://monitor.spacescience.ro/ganglia

An nc on localhost 8649 returns valid XML, but with all values 0. Any idea what can be wrong?

Also, I tried to take data directly from gmond to gmond, and it seems that it did not work as expected. For the older cluster, the head gmond is in the multicast network with the rest of the nodes, so when I added a send channel to the monitor gmond I would have expected to see all hosts from that cluster under their cluster ID group. It didn't happen. Is this expected?

Thanks a lot!
Adrian
[Ganglia-general] gmond :: CLUSTER tag ( cluster {name}) clarification
Hi!

I would need a clarification about the gmond cluster name. It is not clear to me whether:

1. the cluster name is a tag for the machine, used to classify it by that tag (so it is a per-host tag, and an aggregator gmond passes that tag to gmetad, which orders the hosts based on the tag), or
2. it is a name given to the group of metrics that are aggregated by that gmond.

Thanks!
Adrian
[Ganglia-general] gmond :: extract number of cores from xml
Hi!

Can somebody give me a hint about how I can extract the number of cores of the hosts from the gmond XML output? (Which metrics exactly must I read?)

Thanks!
Adrian
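One way to approach the question above is sketched here, under stated assumptions. gmond's default metric set includes `cpu_num`, which as far as I know counts the processors visible to the operating system and not necessarily physical cores. The sample XML is hand-written and trimmed for illustration, and the function name is made up; real gmond output carries many more attributes and metrics.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed sample in the shape of gmond's XML output (assumption).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="ISS">
    <HOST NAME="node-a.local" IP="10.0.0.1">
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
    </HOST>
    <HOST NAME="node-b.local" IP="10.0.0.2">
      <METRIC NAME="cpu_num" VAL="16" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def cpus_per_host(xml_text):
    """Map each HOST name to the value of its cpu_num metric."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == "cpu_num":
                result[host.get("NAME")] = int(metric.get("VAL"))
    return result


if __name__ == "__main__":
    counts = cpus_per_host(SAMPLE_XML)
    print(counts, "total:", sum(counts.values()))
```

Hosts that do not report `cpu_num` (for example spoofed devices fed only through gmetric) are simply absent from the returned dictionary.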
Re: [Ganglia-general] multiple clusters with just one collector
On 01/24/2014 10:47 AM, Cristovao Jose Domingues Cordeiro wrote:
> Thank you for your replies Bernard and Shekar. Actually in the meanwhile I managed to make it work. For those who might want to do the same here's what I did:
>
> · Stop gmond and gmetad, and clean the RRD database in the collector (I was getting unspecified clusters and cached information so this was a clean start)
> · I have 5 clusters so I've created 5 gmond.conf files - gmond-cluster1.conf, gmond-cluster2.conf...etc. In these files I've set up the cluster name, both udp and tcp ports to the same number, and the hostname in the udp_send_channel to 'localhost'

Hi!

Is it possible to post your gmond settings? How did you solve the problem that the reporting node (where the aggregation gmond sits) will also report its own metrics?

My problem is a little different: I am trying to make a fake gmond as gmetric input for UPS/RAID/switch metrics.

Thanks a lot!
Adrian
Re: [Ganglia-general] multiple clusters with just one collector
On 01/25/2014 08:02 PM, Bernard Li wrote:
> Hi Adrian:

Hi!

> You might want to look into spoofing... it might help with what you're doing.

It helps in the sense that I choose the name of the data source (e.g. the data comes from ups1). The problem is the destination: I want to have distinct clusters (one cluster for UPSes, one for RAID devices, one for switches), so I have to send the data to a gmond that reports nothing else (no other metrics at all, just what I send with gmetric). If I delete all the metrics, I just get the message that no metrics are defined.

Thanks!
Adrian

> Cheers,
> Bernard
Re: [Ganglia-general] multiple clusters with just one collector
On 01/25/2014 09:37 PM, Sergio Ballestrero wrote:
> Hello Adrian,
> if the host for which you send gmetrics is not a gmond client, you need to also spoof a heartbeat metric, else the collector will think the host is down :

Yeah, I already banged my head on the wall over this. The thing is that I don't know how fast this heartbeat is. I imagine that either I should send gmetrics faster than this heartbeat or increase the time between heartbeats (at this moment it is not clear to me how this is related to dmax and tmax). Is dmax the time-width of the metric value and tmax the heartbeat? So the cron job interval should be <= tmax?

>     HOST=ups.example.com
>     IP=$(getent hosts $HOST | cut -d' ' -f1)
>     GMON=/etc/ganglia/gmond.UPS.conf
>     T=$(ping -qn -c 1 -w 6 $IP | grep rtt | cut -d/ -f 5 | sed s/\\.//)
>     if [ $T ] ; then
>     gmetric -c $GMON -n ping_time -t uint32 -u us -x 600 -d 7200 -g network -T "ping avg time" -H -S ${IP}:${HOST} -v ${T}
>     (your custom gmetrics follow here)
>
> and of course in the gmond.UPS.conf you put your separate collector gmond host and port.
> For the collector host, I run separate client gmond and multiple collector gmond. The latter have separate config files with no udp_send_channel, in this way they don't send to themselves their own host metrics.

!! THANK YOU !! I didn't think about this!

> I think I posted on the list my gmondsrv init.d script last year or so... mail me if you don't find it.

Yes, I have it (you actually responded to one of my posts), thanks a lot! My problem was this thing with the self metrics (I cannot imagine how such a small thing did not cross my mind!!!)

> Cheers,
> Sergio
> PS if it's an APC, see the attachment, courtesy of a work colleague.

Yes, it is an APC. Thanks for the script! (It is a lot more organized than mine :) ) But I have a few questions:

1. Given a tmax of 600, do you run the cron job every 600 seconds?
2. T is used to see if the UPS is down, I imagine, but why are you interested in collecting the ping time?

(I chose to ignore failed devices with -r 1 -t 0.5; also, given the delay of snmpget, I take the whole list of variables with a single snmpget, redirect it to a file and format the variables from the file.)

Thanks a lot for the help!!
Adrian
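The spoofing approach discussed in this thread can be sketched as a small helper that only builds the gmetric command lines, so they can be inspected before anything is sent. It uses only flags that appear in the quoted script (`-c`, `-n`, `-v`, `-t`, `-u`, `-x`, `-d`, `-H`, `-S`). The function names, the address 192.0.2.10 and the `ups_load` metric are made up for illustration, and sending the heartbeat as its own gmetric call is an assumption, not something the thread confirms.

```python
def spoofed_metric_args(conf, ip, host, name, value, mtype, units, tmax, dmax):
    """Build the argv for one spoofed metric (illustrative helper)."""
    return [
        "gmetric", "-c", conf,
        "-S", f"{ip}:{host}",      # spoof the originating host as ip:hostname
        "-n", name, "-v", str(value),
        "-t", mtype, "-u", units,
        "-x", str(tmax),           # tmax in seconds
        "-d", str(dmax),           # dmax in seconds
    ]


def spoofed_heartbeat_args(conf, ip, host):
    """Build the argv for a spoofed heartbeat (assumed to be a separate call)."""
    return ["gmetric", "-c", conf, "-H", "-S", f"{ip}:{host}"]


if __name__ == "__main__":
    conf = "/etc/ganglia/gmond.UPS.conf"
    ip, host = "192.0.2.10", "ups.example.com"   # hypothetical address
    print(" ".join(spoofed_heartbeat_args(conf, ip, host)))
    print(" ".join(spoofed_metric_args(conf, ip, host, "ups_load", 42,
                                       "uint32", "%", 600, 7200)))
```

To actually send the values, each list could be handed to `subprocess.run(args, check=True)` from a cron-driven script.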
Re: [Ganglia-general] multiple clusters with just one collector
On 01/27/2014 11:11 AM, Cristovao Jose Domingues Cordeiro wrote:
> Hi Adrian,

Hi!

> I saw that the other guys are already helping you so there's nothing that I can add which would help you I think. Nevertheless I must say

You already helped me a lot! Thank you!!

> that I don't have an aggregation gmond, that was the point of my question, not having one. I receive all the metric directly in the gmetad node.

Unfortunately Ganglia can't have aggregating gmonds (that is, a gmond which gathers data from various cluster gmonds, with gmetad able to disentangle the source clusters and write the data where it belongs). I once proposed making the cluster attribute not a tag in the XML but an actual metric/parameter that identifies the cluster group, but I didn't receive positive feedback (the same goes for having a UUID, gathered automatically or set manually, to identify the host). My plan is, when I have time (like spare time, if I remember what that is :D ), to add those modifications myself and see how it goes.

Anyway, thanks a lot for the help :)
Adrian

> Cumprimentos / Best regards,
> Cristóvão José Domingues Cordeiro
Re: [Ganglia-general] Which daemon install rrds? Debugging it in different ports
On 02/14/2014 08:05 AM, jupiter wrote:
> Hi,
> I've set up nodes to send data to an aggregation gmond in port 9000. In debugging I can see all nodes sent messages to the aggregation gmond in a host 192.168.1.101, but running telnet 192.168.1.101 9000 shown an empty XML file, there are no rrds files in /var/lib/ganglia/rrds either. Where have rrds files gone?

If the XML is not populated there will be no RRDs (gmetad takes the XML and saves the info in RRDs). Your first step would be to make sure that you receive a valid XML file from the aggregation gmond. If you post here the send and recv channels of a node and of the aggregator, someone could help you.

Also check the firewall for both TCP and UDP port 9000 (gmond serves the XML over TCP and talks with other gmonds over UDP).

HTH,
Adrian
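The first debugging step suggested above, checking whether the aggregator's XML actually lists any hosts, can be scripted instead of read by eye from telnet output. A minimal sketch, assuming the aggregator dumps its XML and closes the connection as gmond's TCP channel does; the function names and the sample strings are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port, timeout=5.0):
    """Read everything the daemon sends on its TCP port until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    # Keep raw bytes so the parser can honour the XML encoding declaration.
    return b"".join(chunks)


def host_names(xml_data):
    """Return the NAME of every HOST element in a gmond/gmetad XML dump."""
    root = ET.fromstring(xml_data)
    return [host.get("NAME") for host in root.iter("HOST")]


# Hand-written samples standing in for an empty and a populated dump.
EMPTY = '<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond"><CLUSTER NAME="c"/></GANGLIA_XML>'
POPULATED = ('<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond"><CLUSTER NAME="c">'
             '<HOST NAME="n1"/><HOST NAME="n2"/></CLUSTER></GANGLIA_XML>')

if __name__ == "__main__":
    print(host_names(EMPTY), host_names(POPULATED))
```

Against the setup described in the quoted message, `host_names(fetch_xml("192.168.1.101", 9000))` returning an empty list would confirm that the aggregator is not receiving (or not accepting) the UDP packets from the nodes.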
Re: [Ganglia-general] Ganglia 4.x architecture planning
On 03/27/2014 10:07 PM, Daniel Pocock wrote:
> I made up a rough diagram about how Ganglia 4.x could look:
> https://raw.githubusercontent.com/ganglia/monitor-core/master/doc/planning/ganglia-4.x.png
> The biggest change is the introduction of MongoDB. Instead of having the gmetad serve up an XML every time somebody asks to see the web page, the gmetad will just store current values into MongoDB.

Well, I will post my ideas/problems again, in the hope that somebody more proficient in coding and with a lot more time has the same needs as I do.

The big ones:
1. One of my biggest problems with gmond is that cluster tagging is done at the level of the XML and is not a per-host tag. A per-host tag would allow aggregating info from different nodes, with the grouping per cluster tag done at the storage level (gmetad).
2. For the storage back-end I would recommend postgres: we (our experiment) use a tool that writes all its info to postgres (many hundreds of thousands of parameters/s), and the admins and developers assured me that postgres was the best choice.

Some minor things:
1. It would be useful to have in mod_cpu all the info (or at least these 4-5 lines) from /proc/cpuinfo, namely: processor, vendor_id, cpu family, model, model name (most important), stepping.
2. Instead of
   Operating System: Linux
   Operating System Release: 3.13.0-1.el6.elrepo.x86_64
   (info obtained from uname, I imagine) it should print info obtained from lsb_release and uname:
   uname -o
   lsb_release -irc
   uname -r
   uname -m
3. If possible (at least as an experiment), add a host_uuid variable to the host variables. I understood from Daniel that something like this is already in a branch.

Thank you!
Adrian
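To make the /proc/cpuinfo request in the first minor item concrete, here is a rough Python sketch of extracting exactly those fields. It is not mod_cpu code: parse_cpuinfo and CPU_FIELDS are hypothetical names, and the field names are the ones listed in the message, which is what x86 Linux kernels print (other architectures use different keys).

```python
# Fields requested in the message; names as they appear on x86 Linux.
CPU_FIELDS = ("processor", "vendor_id", "cpu family", "model", "model name", "stepping")


def parse_cpuinfo(text, fields=CPU_FIELDS):
    """Return the requested fields of the first processor block in cpuinfo text."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:  # a blank line ends the first processor block
                break
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() in fields:
            info[key.strip()] = value.strip()
    return info
```

On a Linux host the input would come from reading /proc/cpuinfo; only the first processor block is used, on the assumption that all cores report the same model.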
[Ganglia-general] gweb :: custom graphs for grid and cluster presentation
Hi! How can I change the default graphs shown for each cluster (in grid view) and the reports in cluster view? (I imagine that I have to use some JSON, but how exactly?) I have a cluster of UPSes to which I send, with gmetric, variables taken from SNMP. How can I change the grid view to show, for example, the average load (metric ups.load) and the total power (metric ups.outpow)? Also, for the cluster view, is there a way to create reports and customize what is shown? Thank you! Adrian
[Ganglia-general] multicast gmond :: not receiving data from localhost but ok for all others
Hi! I have a very strange situation with a ganglia cluster that stopped receiving metrics from the principal gmond. So:

- all gmonds in the cluster are mcast_joined to 224.0.0.3
- the principal gmond is the one on the same machine as gmetad (this principal gmond being the source of information for gmetad)

At this moment gmetad has and writes rrds for ALL nodes EXCEPT the one that provides the information! This would mean that gmond works, but not for localhost... How is this possible?

The current NOT working settings are (the head node's private IP is 172.18.0.1):

udp_send_channel {
  mcast_join = 224.0.0.3
  host = 172.18.0.1
  port = 8649
  ttl = 1
}

udp_recv_channel {
  mcast_join = 224.0.0.3
  port = 8649
}

tcp_accept_channel {
  port = 8649
}

Does anyone have any idea about this? Thank you! Adrian
Re: [Ganglia-general] dynamic gmond parameters
On 10/02/2015 12:54 PM, Branko Toic wrote:
> I'm also interested in this.
> I was planning on running multiple mute=yes gmond instances on a single host via docker for aggregation purposes.
> Having the ability to read env variables would ease up deployment. For now I was thinking about placeholders in conf files and some sed commands before startup

The configuration file is not read by bash, but it has a very powerful feature: include. You can have a gmond.conf constructed from components like

include ("/path/setting1")
...
include ("/path/setting2")

and customize each component with a script. I think it would be easier and more flexible to run sed on a component && restart gmond than to use something like env vars. A further advantage is that each component could be customized remotely and deployed afterwards on the desired node.

HTH,
Adrian

> On Fri, Oct 2, 2015 at 10:26 AM, Cristovao Cordeiro <cristovao.corde...@cern.ch> wrote:
>> Hi,
>>
>> is it possible to have gmond parameters that can get environment variables or simply return the value of a bash expression? Something like:
>> ...
>> override_hostname = "%{HOSTNAME}_myname"
>> ...
>> (just an example)
>>
>> Thanks
>>
>> Cumprimentos / Best regards,
>> Cristóvão José Domingues Cordeiro

--
Adrian Sevcenco, Ph.D.
Institute of Space Science - ISS, Romania
adrian.sevcenco at {cern.ch,spacescience.ro}
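The approach discussed in this thread (placeholders in a component file, filled in by a script before gmond is restarted) could look roughly like the following Python sketch. The function names and the ${HOSTNAME} placeholder syntax are assumptions for illustration: the syntax is that of Python's string.Template, not something gmond understands, and the template text would be whatever settings belong in the included component.

```python
import os
from string import Template


def render_component(template_text, variables):
    """Fill ${NAME} placeholders; raises KeyError if a placeholder has no value."""
    return Template(template_text).substitute(variables)


def write_component(path, template_text, variables=None):
    """Render a template and write it to a file that gmond.conf includes.

    With variables=None the process environment is used, which gives the
    effect of reading environment variables at deployment time.
    """
    source = os.environ if variables is None else variables
    rendered = render_component(template_text, source)
    with open(path, "w") as handle:
        handle.write(rendered)
    return rendered
```

A deployment script would call write_component on each included file and then restart gmond, in the same way the sed-based variant would; failing loudly on a missing variable is a deliberate choice so that a half-rendered component is never written.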
[Ganglia-general] gweb :: metric not found in cluster view but shown in host view
Hi! I have a very strange problem with some metrics that I push to a fake gmond.

In cluster view the metric is not shown:
https://monitor.spacescience.ro/ganglia2/?c=ISS_UPS_Status=ups_ph1_load=halfh

but in host view it is shown:
https://monitor.spacescience.ro/ganglia2/graph_all_periods.php?c=ISS_UPS_Status=ups-dc=halfh=default===1451386352=10980.0=ups_ph1_load=VA=Output%20Power%20-%20PH1=large

Any idea where the problem could be? I have another fake gmond to which I push metrics from some APC cooling devices, and those are shown OK.

Thank you!! Adrian
Re: [Ganglia-general] rrdcached /gmetad permission problems after update.
On 02/22/2016 10:32 PM, Grigory Shamov wrote:
> Hi All,
>
> I have updated Gmond/Gmetad to 3.7.2 on our Ganglia server that uses also RRDCached.
> It used to work, and configuration didn't change, but now metrics do not get into the graphs anymore.
> In the logs there is a lot of messages about permissions, and a new kind of message about imuxsock thing:
>
> Feb 22 13:18:40 host /usr/sbin/gmetad[3554]: RRD_update (/var/lib/ganglia/rrds/Grex/__SummaryInfo__/rx_bytes_ib0.rrd): rrdcached: Permission denied.
> Feb 22 13:18:40 host rsyslogd-2177: imuxsock begins to drop messages from pid 3554 due to rate-limiting
>
> Does anyone know how to fix it? Thank you very much in advance!

I had the same problem (and partially I still have one). So:

1. The "drop messages" message is a consequence of the write errors; you can ignore it.
2. The problem is with rrdcached. I tried making rrdcached part of the ganglia group and changing the ownership of the rrds to ganglia:rrdcached, but got the same errors. In the end I had to make the rrds dir 777, and that got rid of the errors.
3. ganglia web does not work with the rrdcached limited socket, so I had to use the same full socket that gmetad uses. I hope that nothing bad will happen, but I have no other choice.

HTH,
Adrian
Re: [Ganglia-general] rrdcached /gmetad permission problems after update.
On 02/23/2016 04:57 PM, Vladimir Vuksan wrote:
> Another thing to try is to switch over to using TCP for rrdcached connections since that avoids contention on the rrdcached socket and should avoid some of the permissions issues. For example I am using following options
>
> OPTS=" -t 60 -w 180 -z 180 -F -s ganglia -m 664 -l 127.0.0.1:9998 -s ganglia -m 777 -P FLUSH,STATS,HELP -l unix:/tmp/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B -p /var/lib/ganglia/rrdcached.pid

Hi! Thank you for the advice! But after switching to TCP I now have this problem:

Feb 26 18:23:55 monitor /usr/sbin/gmetad[16225]: RRD_update (/export/ganglia/rrds/__SummaryInfo__/multicpu_wio4.rrd): absolute path names not allowed when talking to a remote daemon

I have this structure for the options:

OPTS="-b ${RRDS_DIR} -B -R -F -j ${RRDCACHED_JOURNAL_DIR} -w ${RRDCACHED_WRITE_TIMEOUT} -z ${RRDCACHED_WRITE_DELAY} -f ${RRDCACHED_WRITE_OLDTIMEOUT} \
-t ${RRDCACHED_WRITE_THREADS} \
-s ${RRDCACHED_USER} -m ${SOCKPERMS} -l ${SOCKET1} \
-s ${RRDCACHED_USER_LIM} -m 777 -P FLUSH,STATS,HELP,INFO -l ${SOCKET2}"

RRDCACHED_USER* is ganglia, and the PID file is set up in the init file. In the end I had to return to my old setting, and because ganglia-web does not see the graphs when the limited socket is used, I had to use the full socket.

> In the gmetad.conf you can then add
>
> rrdcached_address 127.0.0.1:9998

This did not work; I had to use:

RRDCACHED_ADDRESS=127.0.0.1:9998

By the way, why is this setting in a different file than gmetad.conf? Couldn't (or shouldn't) it be in gmetad.conf?

Thank you!
Adrian
Re: [Ganglia-general] 3.7.2 RPMs
On 03/28/2016 06:45 PM, Damir Krstic wrote:
> Does anyone have ganglia 3.7.2 RPMs available? I want to upgrade my stateless Redhat 6.6 cluster (450+ compute nodes) to this version of ganglia and without RPMs the process is not as straightforward as I like.

It is already packaged in EPEL. Unfortunately the init script is not very carefully written (it does not use a pidfile), so you might want to modify it if you run multiple gmonds on the same machine.

Adrian
[Ganglia-general] ganglia web : context of graphs (customization of pages)
Hi! I would like to customize the graphs shown at the grid and cluster level, but it is not clear to me which context I should use. From my understanding I have to create a JSON file of the form _.json in the conf directory. I successfully customized the overview of a cluster (with cluster_) and of a host (with host_), but I would like to know how I can customize the following:

1. the summary of meta (I think that meta is the main page that shows all clusters)
2. in the same main page, the overview of each cluster
3. the default metric for all hosts in the cluster (in the cluster page)
4. in the same cluster page, a different default metric for some of the hosts

Any idea how to do this? Or at least, does someone have a proper dictionary for:

$context = "control";
$context = "tree";
$context = "compare_hosts";
$context = "decompose_graph";
$context = "views";
$context = "meta";
$context = "grid";
$context = "physical";
$context = "cluster-summary";
$context = "cluster";
$context = "node";
$context = "host";

Thank you!! Adrian
[Ganglia-general] gmond collection groups :: split metrics of one module to multiple collection groups
Hi! I have a module that gets a bunch of metrics from a running service (in the form of an XML). Is there a way to create several collection groups (with specific names) that use the same module? Thank you!! Adrian