Re: [Ganglia-developers] Spoofing: "Incorrect format for spoof argument. exiting."
Hi Jeff,

it's best if you can submit a pull request for this issue.

Thank you,

Vladimir

On 05/31/2018 at 12:24 PM, Jeffrey Frey wrote:

Background
==========

On a new cluster we are building right now I moved from Ganglia 3.6.1 to 3.7.2. 3.6.1 has been rock-solid on previous clusters. After 3.7.2 gmond has been up for a short period of time, it begins emitting the error message:

    Incorrect format for spoof argument. exiting.

Debugging
=========

If I enable debugging (e.g. -d 4) I'm shown the parsed contents of the spoof string -- and they are non-zero garbage strings. Doing some gdb tracing with breakpoints on that error message, the metric_id passed to the function has non-zero .spoof and the .host value is a garbage string. In one trace, the .host was an empty string (""); the code in Ganglia_host_get() assumes that if .spoof is non-zero, then .host is non-null and a string with length > 0. So the subsequent code:

    spoof_info_len = strlen(metric_id->host);
    buff = malloc(spoof_info_len+1);
    strncpy(buff, metric_id->host, spoof_info_len + 1);
    spoofIP = buff;
    if( !(spoofName = strchr(buff+1,':')) ){

can produce a buffer overrun for a zero-length string.

To isolate possible reasons for the botched spoofing hostname I compared the gmond/gmond.c source between 3.6.1 and 3.7.2. In Ganglia_collection_group_send() the following code

    name = cb->msg.Ganglia_value_msg_u.gstr.metric_id.name;
    if (override_hostname != NULL)
      {
        cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = apr_pstrcat(gm_pool, (char *)( override_ip != NULL ? override_ip : override_hostname ), ":", (char *) override_hostname, NULL);
        cb->msg.Ganglia_value_msg_u.gstr.metric_id.spoof = TRUE;
      }

is allocating the callback's .host field from the temporary metrics APR pool; but the callback is external to this function and lives on beyond the destruction of that temporary APR pool. Eventually the memory behind cb->msg.Ganglia_value_msg_u.gstr.metric_id.host will be reused and overwritten, yielding the "garbage string" condition that's being observed.

In 3.6.1, the .host field was allocated from global_context. If I modify the code cited above to use global_context rather than gm_pool, gmond runs without throwing "Incorrect format for spoof argument" errors.

Also, in lib/libgmond.c the static global "myhost"

    static char myhost[APRMAXHOSTLEN+1];

is assumed by the rest of the code to have been initialized by the compiler to be a zero-length string:

    if (myhost[0] == '\0')
        apr_gethostname( (char*)myhost, APRMAXHOSTLEN+1, gm_pool);

Probably best to be explicit about the initial value of myhost and not assume an initial value?

    static char myhost[APRMAXHOSTLEN+1] = "";

Happy to contribute patch files, etc.

::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034 Mobile: (302) 419-4976
::
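
For readers following the zero-length .host case described in this message, here is a minimal sketch of a defensive parse. It is not the code in Ganglia_host_get(): the helper name split_spoof and its signature are hypothetical, and it assumes the spoof string has the form "ip:hostname" with the first ':' acting as the separator. It rejects NULL and empty input before doing any pointer arithmetic, and it copies into caller-owned buffers so the result does not depend on the lifetime of any pool.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libganglia): split "ip:hostname" into
 * two caller-supplied buffers. Returns 0 on success, -1 on malformed input. */
int split_spoof(const char *spoof,
                char *ip, size_t ip_len,
                char *name, size_t name_len)
{
    const char *sep;
    size_t ip_part, name_part;

    /* Reject NULL and zero-length strings before any pointer arithmetic. */
    if (spoof == NULL || spoof[0] == '\0')
        return -1;

    /* The separator must exist and must not be the first character. */
    sep = strchr(spoof, ':');
    if (sep == NULL || sep == spoof)
        return -1;

    ip_part = (size_t)(sep - spoof);
    name_part = strlen(sep + 1);

    /* Require a non-empty name and room for both NUL terminators. */
    if (name_part == 0 || ip_part >= ip_len || name_part >= name_len)
        return -1;

    memcpy(ip, spoof, ip_part);
    ip[ip_part] = '\0';
    memcpy(name, sep + 1, name_part + 1);  /* copies the trailing NUL too */
    return 0;
}
```

A caller would pass two fixed-size arrays, e.g. split_spoof(metric_id->host, ip, sizeof ip, name, sizeof name), and treat -1 as a malformed spoof argument. On the myhost point: C zero-initializes objects with static storage duration, so adding = "" documents the intent rather than changing behavior.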
Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster
Clusters are logical groupings of like hosts. This can be e.g. per location (same data center), per app or per function (DB, web, etc.). It really depends on how you are viewing your environment. There is no right or wrong way to group it.

Vladimir

On 03/30/2017 at 04:30 AM, Guo, Jason wrote:

Thanks Vladimir

As you mentioned, FB had clusters with tens of thousands of nodes in a cluster. How do they orchestrate these nodes? Here are some options I have in mind:

1. All the nodes share a few centralized gmonds and all of them belong to a single cluster (the cluster concept in Ganglia)
2. All the nodes share a few centralized gmonds, each centralized gmond belongs to a different cluster, and there is a single gmetad which polls data from these centralized gmonds
3. There are multiple gmetads/grids, which are then orchestrated by a centralized gmetad/grid

Thanks & Best Regards,
Jason Guo

From: Vladimir Vuksan <vli...@veus.hr>
Date: Wednesday, March 29, 2017 at 20:09
To: "Guo, Jason" <ju...@ebay.com>, "ganglia-developers@lists.sourceforge.net" <ganglia-developers@lists.sourceforge.net>
Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster

Hi Jason,

it depends on the number of metrics and associated metadata in the cluster and how busy gmetad is overall. It also depends on your hardware. At one point FB had clusters with tens of thousands of nodes in a cluster. Try to keep your metrics lean, i.e. don't add any metric descriptions if you don't have to, so as to keep the XML payload small, and it should be fine.

Vladimir

On 3/28/2017 at 10:19 PM, Guo, Jason wrote:

Hi,

I’m writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).

As per the Ganglia documentation, Ganglia can scale to handle clusters with 2000 nodes, so many people have concerns about using Ganglia for a 4000-node production cluster.

It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.

If the cluster is larger than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?

To verify this, I created a 5000-node Ganglia cluster on top of a Docker cluster (10 machines). I put 500 nodes in a cluster, so there are 10 clusters, and these 10 clusters are in the same grid. For each gmond, I use a script to generate 30 customized metrics (with gmetric). Currently it works fine in the Docker-based test environment.

So, my question is whether Ganglia is suitable for a 4000-node cluster?

___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster
Hi Jason,

it depends on the number of metrics and associated metadata in the cluster and how busy gmetad is overall. It also depends on your hardware. At one point FB had clusters with tens of thousands of nodes in a cluster. Try to keep your metrics lean, i.e. don't add any metric descriptions if you don't have to, so as to keep the XML payload small, and it should be fine.

Vladimir

On 3/28/2017 at 10:19 PM, Guo, Jason wrote:

Hi,

I’m writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).

As per the Ganglia documentation, Ganglia can scale to handle clusters with 2000 nodes, so many people have concerns about using Ganglia for a 4000-node production cluster.

It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.

If the cluster is larger than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?

To verify this, I created a 5000-node Ganglia cluster on top of a Docker cluster (10 machines). I put 500 nodes in a cluster, so there are 10 clusters, and these 10 clusters are in the same grid. For each gmond, I use a script to generate 30 customized metrics (with gmetric). Currently it works fine in the Docker-based test environment.

So, my question is whether Ganglia is suitable for a 4000-node cluster?

Thanks & Best Regards,
Jason Guo
Re: [Ganglia-developers] custom start/end time not working
Hi Jagga,

indeed that is a bug. We'll need to fix it.

Vladimir

On 07/05/2016 at 01:40 PM, Jagga Soorma wrote:
> I just learned that I can append a =5 to that missing stacked
> image and I see the following rrdtool command being passed:
>
> /usr/bin/rrdtool graph - -E --start -s --end N --width 700 --height
> 300 --title 'rescomp aggregated load_one last custom' --upper-limit
> '0' --lower-limit '0'
>
> Looks like it is missing the {start/end} range. This confirms that
> the custom time range is not being passed correctly. Anyone know
> how I can go about fixing this?
>
> Thanks.
>
> On Tue, Jul 5, 2016 at 10:15 AM, Jagga Soorma wrote:
>> Hi Guys,
>>
>> My apologies if posting to this list is not appropriate but I did not
>> get anywhere on the other list so thought I would post here. I am
>> running ganglia version 3.6.0-1 and the web version 3.6.2. Looks like
>> if I use the predefined time tabs for "Last -
>> hour/2hr/4h/day/week/month/year/job" my stacked graph shows up just
>> fine. However if I add a custom time range with the start time being
>> before the end time the stacked graph does not display and I get the
>> following in the apache error logs:
>>
>> ERROR: start time: There should be number after '-'
>>
>> The link to the broken image shows:
>>
>> http://{ganglia-web}/stacked.php?m=load_one=rescomp=custom=1467142510_regex=
>>
>> Looks like potentially the correct start time is not being generated
>> from the web ui and passed to rrdtool? Am I missing something here?
>> Has anyone seen this issue and if so how can I fix it?
>>
>> Thanks in advance for your help with this!
[Ganglia-developers] Ganglia Web 3.7.1 released
Ganglia Web 3.7.1 has been released. It can be downloaded from

https://sourceforge.net/projects/ganglia/files/ganglia-web/3.7.1

Major changes

- Fix for auth bypass when using the authentication module
- Fix for an XSS in the view adding interface
- Update jQuery Mobile library to 1.4.5

Please update as soon as practicable.

Vladimir
Re: [Ganglia-developers] Metric Packing Patch Proposal
Hi Nikhill,

This definitely sounds very interesting. I'd love to see it. As far as other features go, I'd love to see some payload encryption, e.g. possibly http://nacl.cr.yp.to/ :-)

Vladimir

On 07/24/2015 at 02:34 PM, Nikhill Rao wrote:

Hello all:

We are in the process of upgrading our Ganglia installations at Quantcast to version 3.7.1. Currently, we use a heavily modified version of 3.0.4 which incorporates support for packing multiple metrics into a single UDP packet, as well as adding a timestamp to the packet before being sent. These custom-formatted packets are based on the old 3.0 format.

I am working on a patch for gmond to be able to accept and emit our packed packet format as well as the 3.7.1 format. Gmond would be able to accept both kinds of packets and emit either format based on a config option. It should also be fairly simple to re-implement our packing logic to allow for the packing of 3.7-style packets as well.

Is there any interest in this patch upstream, and if so, what other features would you all like to see?
[Ganglia-developers] Ganglia 3.7.2 pre-release
I have cut Ganglia 3.7.2. It contains a fix for a memory leak that occurs if override_hostname or override_ip is used. This is the fix in question

https://github.com/ganglia/monitor-core/commit/a6f5a2874709f4a0b6df0b015699f523e7d73def

We have been running the fix in production and it appears to be working correctly.

Pre-release is available at

http://sourceforge.net/projects/ganglia/files/pre-release/ganglia-3.7.2.tar.gz/download

If I don't hear any complaints by next week I will promote it to official release.

Vladimir
Re: [Ganglia-developers] [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes
I will look into correcting this, however on my initial reading the risk here is extremely low. Ganglia Web uses cookies only for things like

- Aggregate graphs input field arguments e.g. host regex, metric regex
- Which tab you have open etc.

There is no risk of session hijacking as we do not use cookies for authentication.

Vladimir

On 05/29/2015 02:37 AM, Cristovao Cordeiro wrote:

Hi,

I think I've sent an email about this many months ago. Now after the update, this is the output from skipfish:

Summary: The application is missing the 'httpOnly' cookie attribute
Vulnerability Detection Result: The cookies ... are missing the httpOnly attribute.
Impact: Application
Solution: Set the 'httpOnly' attribute for any session cookies.
Affected Software/OS: Application with session handling in cookies.
Vulnerability Insight: The flaw is due to a cookie is not using the 'httpOnly' attribute. This allows a cookie to be accessed by JavaScript which could lead to session hijacking attacks.
Vulnerability Detection Method: Check all cookies sent by the application for a missing 'httpOnly' attribute
Details: Missing httpOnly Cookie Attribute

Thanks

Cumprimentos / Best regards,
Cristóvão José Domingues Cordeiro

From: Vladimir Vuksan [vli...@veus.hr]
Sent: 28 May 2015 22:57
To: Cristovao Cordeiro; ganglia-developers@lists.sourceforge.net; Ganglia
Subject: Re: [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes

Is there an issue open for this and what are the details?

Vladimir

On 05/28/2015 04:40 AM, Cristovao Cordeiro wrote:

Hi all,

was this issue addressed:

NVT: Missing httpOnly Cookie Attribute
OID: 1.3.6.1.4.1.25623.1.0.105925
Threat: Medium (CVSS: 5.0)
Port: 80/tcp

Because after updating I still have it. Any idea on how to solve it?

Thanks

Cumprimentos / Best regards,
Cristóvão José Domingues Cordeiro
IT Department - 28/R-018
CERN

From: Vladimir Vuksan [vli...@veus.hr]
Sent: 21 May 2015 20:22
To: ganglia-developers@lists.sourceforge.net; Ganglia
Subject: [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes

Hi all,

Ganglia Web 3.7.0 has been released. Major highlights are

Cubism integration
https://github.com/ganglia/ganglia-web/wiki/Cubism-integration

Ganglia Reporting
https://github.com/ganglia/ganglia-web/wiki/Ganglia-Reports

A couple of reported XSS issues have been corrected

If you are running Ganglia Web on a publicly accessible server you are strongly advised to upgrade ASAP.

You can download the latest release from here

https://sourceforge.net/projects/ganglia/files/ganglia-web/

Installation instructions can be found here

https://github.com/ganglia/ganglia-web/wiki#Installation
Re: [Ganglia-developers] [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes
Thanks Jack. I have integrated your changes into the installation Wiki.

Vladimir

On 05/29/2015 05:16 AM, linu...@linux.vnet.ibm.com wrote:

Vladimir and all:

Since it's not easy to set up the environment for the ganglia web frontend, I tried to add a troubleshooting part to the ganglia-web wiki page as follows:

== Trouble shooting ==

* You need to copy `/var/www/ganglia2/apache.conf` (Ubuntu/Debian) or `/var/www/html/ganglia2/apache.conf` (CentOS/RHEL) to `/etc/apache2/sites-enabled`.
* In most cases, you need to modify the above apache.conf to make sure the alias /ganglia refers to `/var/www/ganglia2` (Ubuntu/Debian) or `/var/www/html/ganglia2` (CentOS/RHEL).
* In most cases, you need to modify `/var/www/ganglia2/conf_default.php` (Ubuntu/Debian) or `/var/www/html/ganglia2` (CentOS/RHEL) to make sure `gweb_confdir` refers to the directory that contains the `conf` and `dwoo` directories, such as `/var/lib/ganglia-web` or `/var/lib/ganglia`.
* Make sure you have the rrds directory under `gmetad_root`.
* Make sure the above rrds directory is owned by the user `nobody`.

If you guys think this is not bad, how could I push it into the wiki page? It seems that's not the same process as submitting a patch to the source code.

Thank you,
-jack
Re: [Ganglia-developers] [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes
Is there an issue open for this and what are the details?

Vladimir

On 05/28/2015 04:40 AM, Cristovao Cordeiro wrote:

Hi all,

was this issue addressed:

NVT: Missing httpOnly Cookie Attribute
OID: 1.3.6.1.4.1.25623.1.0.105925
Threat: Medium (CVSS: 5.0)
Port: 80/tcp

Because after updating I still have it. Any idea on how to solve it?

Thanks

Cumprimentos / Best regards,
Cristóvão José Domingues Cordeiro
IT Department - 28/R-018
CERN

From: Vladimir Vuksan [vli...@veus.hr]
Sent: 21 May 2015 20:22
To: ganglia-developers@lists.sourceforge.net; Ganglia
Subject: [Ganglia-general] Ganglia-Web 3.7.0 released - includes security fixes

Hi all,

Ganglia Web 3.7.0 has been released. Major highlights are

Cubism integration
https://github.com/ganglia/ganglia-web/wiki/Cubism-integration

Ganglia Reporting
https://github.com/ganglia/ganglia-web/wiki/Ganglia-Reports

A couple of reported XSS issues have been corrected

If you are running Ganglia Web on a publicly accessible server you are strongly advised to upgrade ASAP.

You can download the latest release from here

https://sourceforge.net/projects/ganglia/files/ganglia-web/

Installation instructions can be found here

https://github.com/ganglia/ganglia-web/wiki#Installation

Vladimir
Re: [Ganglia-developers] Ganglia server is running under high server load and high Disk IO
How many total metrics are you keeping track of? Also, what Ganglia version is this? 40% wait IO seems kind of excessive, especially with SSD.

Vladimir

On 05/22/2015 at 02:50 AM, Ramesh Kumar wrote:

Hello,

My ganglia server is running under high load and showing too much IO. It stops showing the web page; I just get a blank page, although apache and gmetad are running. I have to restart the gmetad service, and then it starts showing the web page again. It shows iowait over 40 all the time and server load around 3.

I am graphing around 130-150 servers, including around 50 hadoop nodes. I am not using rrdcached or memcached on my server.

Please let me know if there are any ganglia optimization tips or documentation available.

My server configuration is:

16GB RAM
8 CPUs
600GB SSD hard drive with 3000 IO/sec support
[Ganglia-developers] Ganglia-Web 3.7.0 released - includes security fixes
Hi all,

Ganglia Web 3.7.0 has been released. Major highlights are

Cubism integration
https://github.com/ganglia/ganglia-web/wiki/Cubism-integration

Ganglia Reporting
https://github.com/ganglia/ganglia-web/wiki/Ganglia-Reports

A couple of reported XSS issues have been corrected

If you are running Ganglia Web on a publicly accessible server you are strongly advised to upgrade ASAP.

You can download the latest release from here

https://sourceforge.net/projects/ganglia/files/ganglia-web/

Installation instructions can be found here

https://github.com/ganglia/ganglia-web/wiki#Installation

Vladimir
Re: [Ganglia-developers] about how to develop Ganglia source code
Here are build instructions

https://github.com/ganglia/monitor-core/wiki/BuildingARelease

Vladimir

On 05/14/2015 11:03 PM, linu...@linux.vnet.ibm.com wrote:

Hi, pablo:

My point is: if I want to contribute to ganglia-3.7.1/configure or ganglia-3.7.1/build, how do I do that? They do not seem to be on github.com/ganglia. It seems somebody is maintaining these files; who are they, and how can I reach them?

Thanks,
-jack

Quoting pabloa98 pablo...@gmail.com:

The latest version of the source code lives in github.com/ganglia

On Wed, May 13, 2015 at 1:13 AM, linu...@linux.vnet.ibm.com wrote:

Hi, all:

I am new to the development of the ganglia source code. I saw that ganglia releases its new versions through Sourceforge, such as ganglia-3.7.1. This package includes some sub-projects such as gmond, gmetad, etc.

As the source tree of ganglia is on github.com/ganglia, I don't know how to match the source released on Sourceforge with the source tree on github. Or, in other words, how do I build a release from the development code? It seems lots of the released files can be found under ganglia/monitor-core, but what about the other files, such as configure and the build dir? How are they maintained?

Thanks,
-jack
Re: [Ganglia-developers] Blank graphs in Views
Which version of Ganglia Web are you using?

Vladimir

On 04/14/2015 06:32 PM, Ramesh Kumar wrote:

Hello,

I have created views and added some graphs to those views, but all the graphs in the views are blank and at the bottom it says “No Matching metrics detected”. Does anyone know why? Or am I missing something here?
[Ganglia-developers] Ganglia Core 3.7.1 Released
The Ganglia team is happy to announce release 3.7.1 of Ganglia core. Major changes in this release are

* Hash table in gmetad has been reworked to support much higher metric counts and larger number of metrics
* A number of gmond Python modules have been rewritten and enhanced

You can download the latest release at

https://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/3.7.1/

Vladimir
[Ganglia-developers] Ganglia 3.7.1 pre-release
Hello all,

Since we had issues with 3.7.0 and were unable to get the concurrency toolkit stuff built, we will be skipping that release. The hash table code was changed with this commit

https://github.com/ganglia/monitor-core/commit/c038b5c0ac0bb51bf0998a6299a8741c1c456fda

to make it use APR locks. I have been running that version for a number of weeks in production without any problems. I have thus tagged 3.7.1 and made a release. The tarball is available in the pre-release area

https://sourceforge.net/projects/ganglia/files/pre-release/

If I don't hear any complaints by Monday March 30th, 2015 I will promote 3.7.1 to be our latest version.

Vladimir
Re: [Ganglia-developers] gmond python modules location
Thanks Chris for doing this.

Vladimir

On 02/09/2015 10:05 AM, Chris Burroughs wrote:

I think this mostly works out:

* https://github.com/ganglia/monitor-core/pull/183
* https://github.com/ganglia/gmond_python_modules/pull/184

On 02/06/2015 09:35 AM, Chris Burroughs wrote:

That makes sense. For me the important thing in the short term is not manually trying to keep dupe code in sync. I'll start work on reconciling.

On 02/05/2015 02:45 PM, Dave Rawks wrote:

IIRC at least some of the motivation behind the current split had to do with separating out modules which aren't generally platform agnostic, since platform specific stuff sometimes causes problems with downstream linux distro adoption/packaging. I.e. the official python modules should be expected to work on the full set of platforms that gmond works on.

-Dave
Re: [Ganglia-developers] 3.7.x release status
I feel that the problems with packaging Concurrency Kit are solvable. If there are alternate implementations that can be selected at runtime and someone is willing to contribute code to do it, I am all for it. However, without Concurrency Kit Ganglia is unusable for a number of larger installations and implementations.

Vladimir

On 10/30/2014 09:05 PM, J.T. Conklin wrote:

Daniel Pocock dan...@pocock.pro writes:

- 3.7 adds a new dependency, Concurrency Kit

Given the problems you describe with Concurrency Kit, is it simply premature for ganglia to depend on it? From what I can tell, its use is isolated to lib/hash.[ch], and at first glance it looks like an alternate reader/writer lock implementation could be selected at runtime.

Are the performance benefits of CK rwlocks really worth the added complexity, especially as ganglia continues to depend on APR, which also provides a rwlock implementation?

--jtc
Re: [Ganglia-developers] Not all custom metrics are being displayed.
Hi Dan,

when you say metrics are not shown, what is the output? Can you take a screenshot of the output? Feel free to blacken out any sensitive areas.

Can you make sure that the gmond that is aggregating metrics (the one that gmetad talks to) has the metrics, i.e.

    nc localhost 8651 | grep "METRIC NAME"

Vladimir

On 09/23/2014 11:16 AM, Dan (Daniel) Arsenault wrote:

I am using Ganglia 3.6.

I am using Python to create custom metrics. I am using a loop to generate the metrics with unique names. It appears that there is a limit that is preventing all the metrics from being shown on gweb. There is no problem showing a smaller number of metrics (around 50 or so). But when the metric count goes beyond that (around 90-100), the remaining metrics are not shown.

I found on the web that there is/was a num_custom_metrics parameter for gmond.conf, but this doesn't seem to be valid anymore.

Is there a configuration parameter for this?

I'm running gmond/gmetad with debug 10 and don't see any errors.
Re: [Ganglia-developers] gmetad segfaulting and spinning 100% cpu on all threads
If you can provide the core file of the segfault that would be helpful. The problem may be with metric summarization, where it's locking. One thing to do is provide a list of metrics to summarize, e.g. I have these options turned on in gmetad.conf

    summarized_metrics bytes_in bytes_out pkts_in pkts_out cpu_aidle cpu_idle cpu_intr cpu_nice cpu_num cpu_sintr cpu_speed cpu_steal cpu_system cpu_user cpu_wio
    summarized_metrics mem_buffers mem_cached mem_dirty mem_free mem_mapped mem_shared mem_total load_one

as well as metrics that we are never going to summarize, e.g.

    unsummarized_metrics varnish rx tx ethtool mysql bird diskstat ipmi tcpext tcp et vm ipmi disk icmp procstat icmpmsg ap apssl ip management1

Also, can you tell me how your rrdcached is configured? We have switched to TCP, e.g. in gmetad.conf we have

    rrdcached_address 127.0.0.1:9998

and rrdcached is started via

    /usr/bin/rrdcached -t 60 -w 360 -z 360 -F -s ganglia -m 664 -l 127.0.0.1:9998 -s ganglia -m 777 -P FLUSH,STATS,HELP -l unix:/tmp/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B -p /var/lib/ganglia/rrdcached.pid -p /var/lib/ganglia/rrdcached.pid

Let me know if this helps.

Vladimir

P.S. Our system does over 700k metrics with those settings

On 09/10/2014 08:52 PM, Gary Barrueto wrote:

We're seeing gmetad (version 3.7.0) threads segfaulting, or gmetad and all its threads spinning at 100% cpu. The system is running ubuntu 12.04/precise. Our ganglia has 48 data sources, 1300 hosts and around 220k metrics. We are running this with rrdcached.

We would like to stop it from segfaulting and spinning on cpu. I'm including a backtrace of when gmetad and its threads were spinning on cpu. Is there anything I can provide to help figure this out?

gmetad.conf looks like this:

    RRAs "RRA:AVERAGE:0.5:1:5856" "RRA:AVERAGE:0.5:24:1680" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:274" "RRA:AVERAGE:0.5:5760:3740"
    server_threads 10
    case_sensitive_hostnames 1

and includes 48 data_source tags and each has 2 data sources.

ganglia was ./configured using:

    ./configure --host=x86_64-linux-gnu --build=x86_64-linux-gnu --prefix=/usr --mandir=${prefix}/share/man --libdir=${prefix}/lib --sysconfdir=/etc/ganglia --infodir=${prefix}/share/info --enable-shared --with-gmetad --with-memcached --enable-debug

Backtrace:

    GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
    Copyright (C) 2012 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    For bug reporting instructions, please see:
    http://bugs.launchpad.net/gdb-linaro/...
    Reading symbols from /usr/sbin/gmetad...done.
    [New LWP 26916] [New LWP 26917] [New LWP 26918] [New LWP 26919]
    [New LWP 26920] [New LWP 26921] [New LWP 26922] [New LWP 26923]
    [New LWP 26924] [New LWP 26925] [New LWP 26926] [New LWP 26927]
    [New LWP 26928] [New LWP 26929] [New LWP 26930] [New LWP 26931]
    [New LWP 26932] [New LWP 26933] [New LWP 26934] [New LWP 26935]
    [New LWP 26936] [New LWP 26937] [New LWP 26938] [New LWP 26939]
    [New LWP 26940] [New LWP 26941] [New LWP 26942] [New LWP 26943]
    [New LWP 26944] [New LWP 26946] [New LWP 26947] [New LWP 26948]
    [New LWP 26949] [New LWP 26950] [New LWP 26951] [New LWP 26952]
    [New LWP 26953] [New LWP 26954] [New LWP 26955] [New LWP 26956]
    [New LWP 26957] [New LWP 26958] [New LWP 26959] [New LWP 26960]
    [New LWP 26961] [New LWP 26962] [New LWP 26963] [New LWP
[Ganglia-developers] Ganglia web 3.6.2 released
Ganglia Web 3.6.2 has been released.

Blog post can be found here: http://ganglia.info/?p=604

Download it from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.6.2/

Release notes are here: https://github.com/ganglia/ganglia-web/wiki/Release-Notes

Vladimir

___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Issues with python-gmetad
What is your gmetad.conf? I suspect you have gmonds from the same cluster on separate data_source lines.

Vladimir

On 07/31/2014 07:21 PM, Mayank Gupta wrote:

Hi All,

I am trying to move from gmetad to python-gmetad while keeping gmond and the other components of Ganglia as is. I am able to successfully install and run gmetad, but when I try to use the plugins

    rrd_plugin.py
    rrd_summary_plugin.py

I get the following error in the logs, and my rrd files have all the values as NaN:

    INFO Error updating rrd /ngs5/app/test/ganglia/rrds/it/metric.rrd -/ngs5/app/test/ganglia/rrds/it/metric.rrd illegal attempt to update using time 1406847119 when last update time is 1406847119 (minimum one second step)

This happens for all the metrics and for all the servers. Please let me know if I am doing something really wrong.
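
One quick check related to the suspicion above is whether the same gmond address appears under more than one data_source line. The following is only a minimal sketch: it assumes the usual `data_source "name" [interval] host[:port] ...` layout of gmetad.conf, the helper name `find_shared_hosts` is made up for illustration, and it catches only the narrow case of an identical address listed twice (it cannot tell which cluster a gmond actually reports, and it does not handle IPv6 literals). The thread does not confirm that this was the cause of the "illegal attempt to update" errors.

```python
import shlex
from collections import defaultdict


def find_shared_hosts(conf_text):
    """Return {host: [data_source names]} for hosts listed under more than one data_source.

    Hypothetical helper for illustration; not part of gmetad or python-gmetad.
    """
    seen = defaultdict(set)
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("data_source"):
            continue
        # shlex handles the quoted data source name and trailing '#' comments.
        tokens = shlex.split(line, comments=True)
        if len(tokens) < 3 or tokens[0] != "data_source":
            continue
        name, rest = tokens[1], tokens[2:]
        # An optional polling interval (plain integer) may precede the addresses.
        if rest and rest[0].isdigit():
            rest = rest[1:]
        for addr in rest:
            host = addr.split(":")[0]  # drop an optional :port suffix
            seen[host].add(name)
    return {host: sorted(names) for host, names in seen.items() if len(names) > 1}
```

Calling `find_shared_hosts(open("/etc/ganglia/gmetad.conf").read())` (the path is an assumption) returns an empty dict when no address is shared between data sources.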
Re: [Ganglia-developers] Bug#743751: ck: FTBFS on i386: bytelock check fails
I have not tested 3.7.0 with a newer CK, but I am not opposed to upgrading. I don't believe it will cause problems.

Vladimir

On 04/06/2014 04:45 PM, Daniel Pocock wrote:

Ganglia builds are currently using v0.3.5 of CK.

This version is troublesome in the i386 and ARM builds on the Debian buildd machines. This impacts the availability of the CK dependency on both Debian and Ubuntu.

Newer versions exist - has anybody tested a newer CK version with Ganglia? Is there any enthusiasm for using a newer version, or any known reason not to do so?

The travis build is hardcoded to ck-0.3.5, but I will shortly tweak it to follow whatever version is available in Debian sid.

On 06/04/14 02:42, Aaron M. Ucko wrote:

Source: ck
Version: 0.3.5-1
Severity: important
Justification: fails to build from source

The i386 build of ck failed the bytelock check:

https://buildd.debian.org/status/fetch.php?pkg=ck&arch=i386&ver=0.3.5-1&stamp=1396650656

    [ Testing bytelock
    make[3]: Entering directory `/«PKGBUILDDIR»/regressions/ck_bytelock/validate'
    ./validate 8 1
    Creating threads (mutual exclusion)...done
    Waiting for threads to finish correctness regression...ERROR [RD:110]: 8 != 0
    make[3]: *** [check] Error 1
    make[3]: Leaving directory `/«PKGBUILDDIR»/regressions/ck_bytelock/validate'

FWIW, the kfreebsd-i386 build had no such problem; however, it ran the test with a different first argument (CPU core count?):

https://buildd.debian.org/status/fetch.php?pkg=ck&arch=kfreebsd-i386&ver=0.3.5-1&stamp=1396650149

    [ Testing bytelock
    make[3]: Entering directory `/«PKGBUILDDIR»/regressions/ck_bytelock/validate'
    ./validate 2 1
    Creating threads (mutual exclusion)...done
    Waiting for threads to finish correctness regression...done (passed)
    make[3]: Leaving directory `/«PKGBUILDDIR»/regressions/ck_bytelock/validate'

Could you please take a look?

Thanks!
Re: [Ganglia-developers] Events with MySQL backend; PHP error
I would just fix it :-).

On 04/18/2014 11:36 AM, Maciej Lasyk wrote:

Hi guys,

I'm trying to set up MySQL as the backend for Events in Ganglia-Web. I have the proper PEAR libs installed (MDB2, DB) via `pear install` on this Debian 7 box. Afterwards I added:

    $conf['overlay_events_provider'] = "mdb2";
    $conf['overlay_events_dsn'] = "mysql://ganglia:pwd@localhost/ganglia";

to conf.php, and after restarting Apache I got this in the error logs:

    PHP Fatal error: Call to undefined method MDB2_Error::quote() in /somewhere/ganglia-web-3.7.0/lib/Events/Driver_Mdb2.php on line 54

What's weird: I checked, and this method does exist in /usr/share/php/MDB2.php in the MDB2 class:

    function quote($value, $type = null, $quote = true, $escape_wildcards = false)
    {
        $result = $this->loadModule('Datatype', null, true);
        if (PEAR::isError($result)) {
            return $result;
        }
        return $this->datatype->quote($value, $type, $quote, $escape_wildcards);
    }

And the calling part in lib/Events/Driver_Mdb2.php:

    function ganglia_events_get( $start = NULL, $end = NULL ) {
      global $conf;
      $db = MDB2::factory( $conf['overlay_events_dsn'] );
      if (DB::isError($db)) { api_return_error($db->getMessage()); }
      $sql = "SELECT * FROM overlay_events ";
      if ( $start != NULL || $end != NULL ) {
        $sql .= "WHERE ";
        $clauses = array();
        if ( $start != NULL ) {
          $clauses[] = "start_time = " . $db->quote( $start, 'integer' ); # this line
        }
        if ( $end != NULL ) {

Do you have any clue, or should I rather clone / fix and create a pull request? ;)
Re: [Ganglia-developers] ganglia-web package at risk
That would be fine with me if that is what it takes. Include the full-blown jQuery UI.

Thanks,
Vladimir

On 03/03/2014 01:25 PM, Daniel Pocock wrote:

On 04/02/14 14:56, Daniel Pocock wrote:

On 04/02/14 14:47, Chris Burroughs wrote:

I thought the distro anti-bundling stance was paired with a "we already have X so you should just depend on it". I'm not sure how this works with JavaScript. Is there some Debian jQuery package that could be depended on?

There is a jQuery package in Debian, but it is a slightly older version.

There are various issues that motivate these rules/policies in distributions:

- disk space
- security updates (better to just have one copy of X to update in one shot; it is hard to find multiple bundled copies of X and check they all have the latest/necessary security patches)
- source - bundling any minified artifact is not considered to be real source code

That said, given that every project seems to depend on a different version of jQuery, there is some leniency - Debian accepts bundled copies of some things like jQuery as long as they are not minified. It is perfectly OK to minify them in an installation script, but the source tarball from the Ganglia web site must be 100% readable source code.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=736104

I had a quick look at this and found that the jquery-ui stuff is not cleanly available as source because of the way it is built as a custom JavaScript file using the tool here:

https://jqueryui.com/download

so it is not a quick fix for me to simply drop in uncompressed JavaScript.

What can be done is that, instead of using the custom method to get jquery-ui, the full source from here:

https://jqueryui.com/resources/download/jquery-ui-1.10.4.zip

can be downloaded into the ganglia-web repository (including both the minified and the human-readable version), and then the full minified .js file (rather than a custom.min.js file) can be used within ganglia-web.

Are the ganglia-web developers happy to support that version of jquery-ui? Is there any reason the custom version has to be used?

The package has now taken the first step towards being completely dropped from Debian and Ubuntu:

http://packages.qa.debian.org/g/ganglia-web.html

so it is important that we agree on a solution for 3.5.13, or it will be completely missing from the upcoming Ubuntu trusty release and the Debian 8 release early next year.

Regards,

Daniel
Re: [Ganglia-developers] ganglia-web package at risk
Let's stick with 1.10.2.

Vladimir

On 03/03/2014 03:13 PM, Daniel Pocock wrote:

On 03/03/14 21:08, Vladimir Vuksan wrote:

That would be fine with me if that is what it takes. Include the full blown Jquery UI.

I see there is 1.10.2 right now.

Can I just swap from the custom.min.js file to the full min.js file? Or do you want to try the latest, 1.10.4, before releasing web 3.5.13?
Re: [Ganglia-developers] Ganglia and Nagios integration in our github repo
Works for me. Now we just need a volunteer :-D

On 03/03/2014 04:23 PM, Nick Satterly wrote:

Why don't we convert the SourceForge trac wiki into Jekyll pages and serve them via GitHub?

--Nick.

On Mon, Mar 3, 2014 at 2:05 AM, Vladimir Vuksan vli...@veus.hr wrote:

I like the fact that GitHub wikis are just another Git repo. Perhaps we ought to figure out if we can use GitHub Pages to serve the wikis directly, e.g. something like wiki.ganglia.info.

Anyone know?

Vladimir

On 03/02/2014 04:01 AM, Daniel Pocock wrote:

On 01/03/14 22:54, Ben Hartshorne wrote:

I've changed my mind about reorganizing the repo. The correct way to summarize different approaches to the same problem is through documentation (aka the wiki), not repository organization. It makes sense to keep self-contained projects (like ganglia-nagios-bridge) as their own repos for ease of forks and PRs and so on.

I've made changes to both wiki pages Daniel mentions in the hope of making it easier for folks that are looking to solve this problem. All they need now is a little Google juice; these pages are impossible to find via Google. There are many other pages talking about Nagios and Ganglia that rank higher. Any suggestions?

There are still trac wikis about too.

It is a real hassle.

In GitHub, we can actually disable the wiki feature and refer everybody back to trac if that is easier to manage.

On the other hand, GitHub wikis have the GitHub ACLs.

If we are going to keep the GitHub wikis, we need to modify the menu links on ganglia.info to link into the right places, etc.

--
gpg: using PGP trust model
pub 4096R/1EE38BD9 2013-01-06 [expires: 2018-01-06]
Key fingerprint = 3EE9 550D D9D8 DB65 58C2 B58D CE78 EC6C 1EE3 8BD9
uid Nicholas Satterly (Debian Key) nfsatte...@gmail.com
sub 4096R/23804EE9 2013-01-06 [expires: 2018-01-06]
Re: [Ganglia-developers] Ganglia and Nagios integration in our github repo
I like the fact that GitHub wikis are just another Git repo. Perhaps we ought to figure out if we can use GitHub Pages to serve the wikis directly, e.g. something like wiki.ganglia.info.

Anyone know?

Vladimir

On 03/02/2014 04:01 AM, Daniel Pocock wrote:

On 01/03/14 22:54, Ben Hartshorne wrote:

I've changed my mind about reorganizing the repo. The correct way to summarize different approaches to the same problem is through documentation (aka the wiki), not repository organization. It makes sense to keep self-contained projects (like ganglia-nagios-bridge) as their own repos for ease of forks and PRs and so on.

I've made changes to both wiki pages Daniel mentions in the hope of making it easier for folks that are looking to solve this problem. All they need now is a little Google juice; these pages are impossible to find via Google. There are many other pages talking about Nagios and Ganglia that rank higher. Any suggestions?

There are still trac wikis about too.

It is a real hassle.

In GitHub, we can actually disable the wiki feature and refer everybody back to trac if that is easier to manage.

On the other hand, GitHub wikis have the GitHub ACLs.

If we are going to keep the GitHub wikis, we need to modify the menu links on ganglia.info to link into the right places, etc.
Re: [Ganglia-developers] gmond segfault with libpython
Jeff,

RPMS-6 are the CentOS 6 RPMs. RPMS/ are the CentOS 5 RPMs. Sorry about the confusion.

On 02/10/2014 04:33 PM, Jeff Layton wrote:

I'm leaning this way :) I think things have gotten too screwed up (to use a technical term) and there are problems.

The thing I'm concerned about is that the EPEL repo only has version 3.1.7 (seems pretty darn old to me). I want something newer and I want the new web interface.

    [root@home4 ganglia]# yum list all | grep -i ganglia
    ganglia.i686                          3.1.7-6.el6  epel
    ganglia.x86_64                        3.1.7-6.el6  epel
    ganglia-devel.i686                    3.1.7-6.el6  epel
    ganglia-devel.x86_64                  3.1.7-6.el6  epel
    ganglia-gmetad.x86_64                 3.1.7-6.el6  epel
    ganglia-gmond.x86_64                  3.1.7-6.el6  epel
    ganglia-gmond-python.x86_64           3.1.7-6.el6  epel
    ganglia-web.x86_64                    3.1.7-6.el6  epel
    libnodeupdown-backend-ganglia.x86_64  1.14-1.el6   epel

I'm going to try Vladimir's RPMs first, but they look really old to me (May 7 2013), which is before CentOS 6.5 was out. I may be hitting the mailing list again this evening (I'm writing an article about Ganglia that is due in 2 days, so I need to finish quickly).

Thanks!

Jeff

At this point I suggest you wipe out that Ganglia installation and just use the EPEL repo - it has everything you need (gmond, gmetad, ganglia-gmond-python).

The blog you were basing this on is terrible. It's very bad to run make install without creating packages - no one should do this. Moreover, this installation does not follow any sensible filesystem hierarchy standard. Configuration files in /usr/local? Editing ld.so.conf instead of creating a file in ld.so.conf.d? Those are really bad practices that lead people into situations like yours.

The EPEL repo is very good, stable and secure. You can easily use it instead of creating your own packages. And if you really have to build your own, use rpmbuild or https://github.com/jordansissel/fpm

And try installing CentOS minimal at first, without any additional packages.
It really makes things simple :)

This segfault looks like some Python version problem; maybe you have more than one Python installed, or maybe you have some issues with Python libraries. It's really hard to find sometimes - I would suggest cleaning out this installation and starting over using packages.

On Mon, Feb 10, 2014 at 10:54:35AM -0500, Jeff Layton wrote:

The only thing in /usr/local/etc/conf.d/ is modpython.conf. Given your guidance I think I've figured things out (I think).

It does appear that the python modules get loaded twice (actually 3 times in my case). The first time is in gmond.conf, where I have it in the modules section:

    modules {
      module { name = core_metrics }
      module {
        name = python_module
        path = /usr/local/lib64/ganglia/modpython.so
        params = /usr/local/lib64/ganglia/python_modules/
      }
      ...
    }

At the end of /etc/ganglia/gmond.conf I have two include lines:

    include (/usr/local/etc/conf.d/*.conf)
    include('/etc/ganglia/conf.d/*.pyconf')

The first line includes the file /usr/local/etc/conf.d/modpython.conf. This file has the following lines:

    [root@home4 ganglia]# more /usr/local/etc/conf.d/modpython.conf
    /* params - path to the directory where mod_python should look for
       python metric modules
       the pyconf files in the include directory below will be scanned
       for configurations for those modules */
    modules {
      module {
        name = python_module
        path = modpython.so
        params = /usr/local/lib64/ganglia/python_modules
      }
    }
    include (/etc/ganglia/conf.d/*.pyconf)

So it looks like the python modules get loaded 3 times: once for the first include, a second time for the include line in the file /usr/local/etc/conf.d/modpython.conf, and then a third time for the second include line in gmond.conf.

Therefore, I erased the module lines in gmond.conf so that I don't load them there. I also erased the include line at the end of gmond.conf pointing to /etc/ganglia/conf.d/*.pyconf.
The only include line in gmond.conf is the following:

    include (/usr/local/etc/conf.d/*.conf)

You can find my current gmond.conf file here: http://pastebin.com/FJ2WAC4D

In the file /usr/local/etc/conf.d/modpython.conf, I commented out the last line, which is an include line pointing to /etc/ganglia/conf.d/*.pyconf. The file now simply reads:

    /* params - path to the directory where mod_python should look for
       python metric modules
       the pyconf files in the include directory below will be scanned
       for configurations for those modules */
    modules {
      module {
        name = python_module
        path = modpython.so
        params = /usr/local/lib64/ganglia/python_modules
      }
    }

I think all of this means that python modules only get loaded once, when gmond.conf does the include that points to
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Maciej,

can you post the top 100 lines or so of your config, i.e. with all the UDP channels etc.?

Thanks

On 02/07/2014 08:52 AM, Maciej Lasyk wrote:

Hi guys,

I've been struggling with very high CPU usage of my gmond daemons lately. I've been using a UDP unicast topology, and I just couldn't find the source of my problem. All the gmonds on my servers were generating ~99-100% processor usage. Stracing the gmond processes revealed a flood of epoll and gettimeofday syscalls (very many every second), and that was all.

I tried to start from scratch - there was no problem when using the default config (multicast topology, deaf/mute=no).

So after playing for a while I thought that _maybe_ deaf=no on a gmond which is only sending data to some aggregator (only udp_send_channel, no recv channels or TCP channels) causes the trouble here. And that was it.

My mistake was that I thought that setting deaf=no doesn't matter, since in UDP unicast we just don't listen unless we have a recv_channel configured.

So I think that this could be a bug, not a feature. What are your thoughts on this?
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Set deaf=yes. Let me know if that lowers the CPU usage.

Vladimir

On 02/07/2014 09:59 AM, Maciej Lasyk wrote:

Sure, it's not that long, so I'm posting it in place:

    globals {
      daemonize = yes
      setuid = yes
      user = ganglia
      debug_level = 0
      max_udp_msg_len = 1472
      mute = no
      deaf = no
      allow_extra_data = yes
      host_dmax = 0 /*secs */
      cleanup_threshold = 300 /*secs */
      gexec = no
      send_metadata_interval = 60 /*secs */
    }
    cluster {
      name = somecluster
      owner = someowner
      latlong = unspecified
      url = unspecified
    }
    host {
      location = host
    }
    udp_send_channel {
      bind_hostname = yes
      port = 8649
      ttl = 2
      host = 192.168.1.23
    }

.. and here go modules / metrics

On Fri, Feb 07, 2014 at 09:15:56AM -0500, Vladimir Vuksan wrote:

Maciej,

can you post the top 100 lines or so of your config, i.e. with all the UDP channels etc.?

Thanks

On 02/07/2014 08:52 AM, Maciej Lasyk wrote:

Hi guys,

I've been struggling with very high CPU usage of my gmond daemons lately. I've been using a UDP unicast topology, and I just couldn't find the source of my problem. All the gmonds on my servers were generating ~99-100% processor usage. Stracing the gmond processes revealed a flood of epoll and gettimeofday syscalls (very many every second), and that was all.

I tried to start from scratch - there was no problem when using the default config (multicast topology, deaf/mute=no).

So after playing for a while I thought that _maybe_ deaf=no on a gmond which is only sending data to some aggregator (only udp_send_channel, no recv channels or TCP channels) causes the trouble here. And that was it.

My mistake was that I thought that setting deaf=no doesn't matter, since in UDP unicast we just don't listen unless we have a recv_channel configured.

So I think that this could be a bug, not a feature. What are your thoughts on this?
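
The combination described in this thread (deaf = no on a gmond that has only a udp_send_channel) can be spotted mechanically. Below is a minimal sketch, not a gmond.conf parser: it uses plain regular expressions, the function name `deaf_without_listeners` is hypothetical, and it assumes that a missing `deaf` setting means `no`. It only flags the configuration pattern; whether that pattern triggers the CPU spin on a given gmond version is something the thread reports but does not explain.

```python
import re


def deaf_without_listeners(conf_text):
    """Return True if the config leaves deaf = no but defines no receive channels.

    Illustrative helper only; names and the default for `deaf` are assumptions.
    """
    # Drop /* ... */ comments so commented-out settings are ignored.
    text = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.S)
    match = re.search(r"\bdeaf\s*=\s*\"?(\w+)", text)
    deaf = match.group(1).lower() if match else "no"
    has_listener = re.search(
        r"\b(udp_recv_channel|tcp_accept_channel)\s*\{", text
    ) is not None
    return deaf == "no" and not has_listener
```

For the config posted above this returns True; after changing the globals section to `deaf = yes`, as suggested, it returns False.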
Re: [Ganglia-developers] ganglia-web package at risk
What I was suggesting is to add a dynamic download automatically. Can't bootstrap pull external files?

On 01/31/2014 10:06 AM, Daniel Pocock wrote:

Another thing to consider is to have the packager drop the problematic JS files and download them directly off jquery.com. Daniel, can that be done?

That creates more work for the person making the package. Essentially, the packager has to

a) download the tarball created from the tag in github
b) remove stuff
c) add stuff (unless it is available from other packages, like jquery)
d) create a new tarball

While some people do that for their packages, the extra effort involved means there is less time to spend on other work that might help improve this or other free software, so it is better to just come up with a solution for the official ganglia-web tarballs to be compliant.
[Ganglia-developers] Ganglia Web 3.5.12 released
Ganglia Web 3.5.12 has been released. Changes in this release are

* Fix for failure to create heatmaps https://github.com/ganglia/ganglia-web/pull/222

Download the release from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.12/
[Ganglia-developers] Announcement: Ganglia Web 3.5.11 released
Happy New Year to everyone.

Ganglia Web 3.5.11 has been released. Changes in this release are

* Improved cluster load heat map https://github.com/ganglia/ganglia-web/pull/212
* Fix for an XSS when supplying a host regular expression filter https://github.com/ganglia/ganglia-web/issues/218
* Numerous other enhancements and fixes

Download the release from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.11/

Vladimir
Re: [Ganglia-developers] Gmetad bottlenecks
On 12/07/2013 02:23 PM, Chris Burroughs wrote:

On 12/06/2013 03:36 PM, Vladimir Vuksan wrote:

The Ganglia core is comprised of two daemons, `gmond` and `gmetad`. `Gmond` is primarily responsible for sending and receiving metrics; `gmetad` carries the hefty task of summarizing / aggregating the information, writing the metrics information to graphing utilities (such as RRD), and reporting summary metrics to the web front-end. Due to growth some metrics were never updating, and front-end response time was abysmal. These issues are tied directly to `gmetad`.

Were these failures totally random or grouped in some way? (Same cluster, type, etc).

We run multiple dozens of clusters, and some of the larger clusters, i.e. clusters that had 2-3x the machines of other clusters, would exhibit either gaps or slower updating, i.e. data points would update every 3 or 4x the poll period if you looked at graph details. Also, Grid summary views disappeared altogether.

In our Ganglia setup, we run a `gmond` to collect data for every machine and several `gmetad` processes:

* An interactive `gmetad` process is responsible solely for reporting summary statistics to the web interface.
* Another `gmetad` process is responsible for writing graphs.

Are these two gmetad processes co-located on the same server? I think this is an interesting option that I at least was not aware of.

This setup is very similar to what you have. Basically, have one gmetad that polls all the same gmonds but has write_rrd off and is used for both alerting and feeding the web interface.

Did you go with this setup to alleviate the problems described above or for other reasons?

It was mostly to improve web interface responsiveness. Perhaps with the latest changes we no longer have to do it, but we haven't tested that yet.

Frankly I don't understand rrdcached. The OS already has a fancy subsystem for keeping frequently accessed data in memory.
If we are dealing with a lot of files (instead of a database with indexes where the application might have more information than the OS) why fight with it? (Canonical rant: https://www.varnish-cache.org/trac/wiki/ArchitectNotes) Anyway we have had much better luck with tuning the page cache and disabling fsync for gmetad. http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD Admittedly at least some of the problems we had with rrdcached could have been due to the issues you have identified. In the process of doing this, I noticed that ganglia used a particularly poor method for reading its XML metrics from gmond: It initialized a 1024-byte buffer, read into it, and if it would overflow, it would realloc the buffer with an additional 1024 bytes and try reading again. When dealing with XML files many megabytes in size, this caused many unnecessary reallocations. I modified this code to start with a 128KB buffer and double the buffer size when it runs out of space. (I made a similar change to the code for decompressing gzip'ed data that used a similar buffer sizing paradigm). This sounds like a solid find. I'm a little worried about the doubling though since as you said the responses can get quite large. Is there a max buffer size? Does your fix also handle the case of gmetad polling other gmetad? I don't think we know yet as we don't have that kind of a setup. After all these changes, both the interactive and RRD-writing processes spend most of their time in the hash table. I can continue improving Ganglia performance, but most of the low hanging fruit is now gone; at some point it will require: * writing a version of librrd (this probably also means changing the rrd file format), I think something got cut off here. It meant to say an improved librrd. * refactoring gmetad and gmond into a single process that shares memory I'm not sure I follow this one. While the node with gmetad likely also has gmond, gmond typically runs alone.
The local gmond is also not necessarily reporting directly to the co-located gmetad. The idea is that for certain scenarios, e.g. a centralized metrics host where all gmonds run, it makes some sense to have gmetad accept all metrics over UDP instead of doing the dance of downloading the XML tree of metrics from gmond, parsing it and populating its own hash. It's unnecessary overhead. Vladimir
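The buffer-sizing change discussed in this thread (start at 128KB and double, rather than grow by 1024 bytes per read) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `struct growbuf`, `next_capacity`, `growbuf_reserve` and the 256 MB `MAX_CAPACITY` cap are all hypothetical names and values, and the cap is only there to show one way the "is there a max buffer size?" question could be answered.

```c
#include <assert.h>
#include <stdlib.h>

#define INITIAL_CAPACITY (128 * 1024)        /* 128KB starting size, as described in the thread */
#define MAX_CAPACITY     (256 * 1024 * 1024) /* hypothetical upper bound, not taken from the patch */

/* Hypothetical growable read buffer. */
struct growbuf {
    char  *data;
    size_t len;   /* bytes currently used */
    size_t cap;   /* bytes currently allocated */
};

/* Next capacity in the doubling sequence, clamped to MAX_CAPACITY.
 * Returns 0 when the buffer is already at the cap and cannot grow. */
static size_t next_capacity(size_t cap)
{
    if (cap == 0)
        return INITIAL_CAPACITY;
    if (cap >= MAX_CAPACITY)
        return 0;
    if (cap > MAX_CAPACITY / 2)
        return MAX_CAPACITY;
    return cap * 2;
}

/* Make sure at least `extra` more bytes fit after b->len.
 * Returns 0 on success, -1 if the cap would be exceeded or realloc fails. */
static int growbuf_reserve(struct growbuf *b, size_t extra)
{
    size_t cap = b->cap;

    assert(b->len <= b->cap);
    while (cap - b->len < extra) {
        cap = next_capacity(cap);
        if (cap == 0)
            return -1;
    }
    if (cap != b->cap) {
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    return 0;
}
```

For a response of around 40 MB, growing by 1024 bytes per step means on the order of 40,000 reallocations, while doubling from 128KB reaches 64 MB after 9 steps. The trade-off raised in the reply is real: the last doubling can over-allocate by up to half the buffer, which is why a cap (or a switch to linear growth above some size) may be worth considering.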
[Ganglia-developers] Gmetad bottlenecks
Hello everyone, For a few weeks now we have had performance issues due to growth of our monitoring setup. One of my colleagues, Devon O'Dell, volunteered to help and below is an e-mail of his findings. We'll submit a pull request once we are comfortable with the changes https://github.com/dhobsd/monitor-core/compare/master Vladimir Forwarded message Vlad emailed some time ago about issues we're having with Ganglia performance. Over the past couple weeks, I spent some time figuring out how Ganglia works and attempting to identify / solve the performance issues. The issue is fundamentally one of scale: the number of metrics we monitor times the number of servers we have (times the number of metrics we sum!) ends up being a large number. The Ganglia core is comprised of two daemons, `gmond` and `gmetad`. `Gmond` is primarily responsible for sending and receiving metrics; `gmetad` carries the hefty task of summarizing / aggregating the information, writing the metrics information to graphing utilities (such as RRD), and reporting summary metrics to the web front-end. Due to growth some metrics were never updating, and front-end response time was abysmal. These issues are tied directly to `gmetad`. In our Ganglia setup, we run a `gmond` to collect data for every machine and several `gmetad` processes: * An interactive `gmetad` process is responsible solely for reporting summary statistics to the web interface. * Another `gmetad` process is responsible for writing graphs. Initially, I spent a large amount of time using the `perf` utility to attempt to find the bottleneck in the interactive `gmetad` service. I found that the hash table implementation in `gmetad` leaves a lot to be desired: apart from very poor behavior in the face of concurrency, it also is the wrong data structure to use for this purpose. Unfortunately, fixing this would require rewriting large swaths of Ganglia, so this was punted.
Instead, Vlad suggested we simply reduce the number of summarized metrics by explicitly stating which metrics are summarized. This improved the performance of the interactive process (and thus of the web interface), but didn't address other issues: graphs still weren't updating properly (or at all, in some cases). Running `perf` on the graphing `gmetad` process revealed that the issue was largely one of serialization: although we had thought we had configured `gmetad` to use `rrdcached` to improve caching performance, the way that Ganglia calls librrd doesn't actually end up using rrdcached -- `gmetad` was writing directly to disk every time, forcing us to spin in the kernel. Additionally, librrd isn't thread-safe (and its thread-safe API is broken). All calls to the RRD API are serialized, and each call to create or update not only hit disk, but prevented any other thread from calling create or update. We have 47 threads running at any time, all generally trying to write data to an RRD file. Modifying gmetad to call the proper librrd function to call into rrdcached helped a little, but most threads were still spending all their time spinning on locks: although we were writing to rrdcached now, we were doing so over a single file descriptor to a unix domain socket. This forced the kernel to serialize the reads and writes from all the different threads to the single file descriptor. The only reason we gained any performance was due to not hitting disk. To solve this problem, I made rrdcached listen on a TCP socket and gave every thread in gmetad its own file descriptor to connect to rrdcached. This allowed every thread to write to rrdcached without locking for updates (creating new RRDs still requires holding a lock and calling into librrd). This worked, and I suspect that we'll be able to move forward for some time with these changes. 
They are running on (censored) right now, and we'll leave them running for a while to make sure they're good before pushing the patches upstream. In the process of doing this, I noticed that ganglia used a particularly poor method for reading its XML metrics from gmond: It initialized a 1024-byte buffer, read into it, and if it would overflow, it would realloc the buffer with an additional 1024 bytes and try reading again. When dealing with XML files many megabytes in size, this caused many unnecessary reallocations. I modified this code to start with a 128KB buffer and double the buffer size when it runs
Re: [Ganglia-developers] [Ganglia-general] Grid of Grids Broken Again in 3.6.0? Is this a different problem?
Funny you mention it. I am seeing that exact issue even though I am not running grid of grids. About a week ago we added a bunch more machines, and the top grid __SummaryInfo__ is now updated only occasionally, i.e. I may see data once an hour. This happens even with 3.5.0, so I suspect there is something much deeper than this. For a while now I have suspected that the summarization code causes major slowdowns. For us this was particularly exacerbated by having a couple thousand unique metric names. I have eliminated the bulk of metrics from being summarized using unsummarized_metrics, which has radically improved performance; however, I suspect the problem is much deeper. Unfortunately I have been busy with other stuff; however, I am hoping to spend some time next week working on that. Vladimir On 11/17/2013 07:17 PM, Nicholas Satterly wrote: Hi Adam, Our experience was that the summary RRDs were actually generated but then rarely updated. Only very occasionally would we see metrics suddenly get written to the RRD and only for a few intervals and then there would be large gaps again. Do graphs based on the RRDs you are getting in your tests look right? Regards, Nick On Fri, Nov 15, 2013 at 8:15 PM, Adam Compton acomp...@quantcast.com wrote: Nicholas, I'm the person who submitted #92. I've attempted to replicate the problem and I'm still seeing summary RRDs being written for the top grid in a grid-of-grids configuration (assuming you mean "/var/lib/ganglia/rrds/__SummaryInfo__/*.rrd"). Can you please share the configs you used to reproduce this issue? I'd like to fix the bug and submit a patch, but I don't know how to replicate the problem. Thanks, Adam On 11/3/13 2:04 PM, Nicholas Satterly wrote: Hi Bernard, I think this is the bug in federation that you might be thinking of as I've mentioned it before. I don't have a fix for this. It's quite a large patch and I've never looked at this part of the codebase before.
Regards, Nick On Sun, Nov 3, 2013 at 5:10 PM, Bernard Li bern...@vanhpc.org wrote: My $0.02 is that Grid of Grids (federation) is still a widely used feature so we should attempt to fix it. Nick -- do you still have another outstanding pull request to fix a bug in federation? If so, what's the hold up? Just waiting for someone with authorization to accept it? Thanks! Bernard On Sat, Nov 2, 2013 at 5:14 PM, Nicholas Satterly nfsatte...@gmail.com wrote: I have confirmed that this patch [1] broke writing of the root summaries for the top-level gmetad when in a grid-of-grids setup. What should we do? Revert the patch, attempt to debug it, or just log a github issue to
Re: [Ganglia-developers] Question: what is gmetric value when slope=='POSITIVE'
The problem with slope=positive is that Ganglia treats those metrics as counters and creates RRDs that support counter values, unlike slope=both, which creates "gauges". The further problem is that if you use slope=positive you have to send your counter values more often than gmetad polls gmonds, i.e. every 15 seconds or more frequently. If you don't, there will be polling periods where the counter value is the same, which RRD will obviously interpret as a delta of zero. Hence I avoid using slope=positive since it's usually trouble :-(. On 09/20/2013 10:23 AM, Vyacheslav Artyukhov wrote: Good day! I'm new to ganglia and gmetric and I have an issue with them. For my metric (jmx enqueue) I use gmetric twice: the first one with slope = 'both' and the second one with slope = 'positive'. So, I have two diagrams on the ganglia web interface. The absolute value of my metric (displayed in the first diagram, which has slope = 'both') increases by one every two seconds. But on the second diagram (which has slope = 'positive') I see values of about 50m. This confuses me. I expect to see the derivative on the second diagram. My metric increases by one every two seconds, so the derivative I expect is 1/2=0.5. So, my question is: what exactly does gmetric send to ganglia when slope = 'positive'?
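The counter behaviour described in this reply can be illustrated with a small sketch. It assumes the RRD data source is a counter type that stores a per-second rate computed from consecutive samples, and it ignores counter wrap-around; `counter_rate` is a hypothetical helper for illustration, not a Ganglia or rrdtool function.

```c
#include <assert.h>

/* Per-second rate between two samples of a monotonically increasing counter.
 * Hypothetical helper: mirrors the delta-over-interval idea behind a
 * counter-style RRD data source, without wrap-around handling. */
static double counter_rate(double prev_value, double prev_time,
                           double cur_value, double cur_time)
{
    assert(cur_time > prev_time);
    return (cur_value - prev_value) / (cur_time - prev_time);
}
```

With the numbers from the question, a counter that goes from 100 to 101 over 2 seconds gives `counter_rate(100, 0, 101, 2)`, i.e. 0.5 per second. If the same value is seen at two consecutive polls, for example because the metric was not resent between them, the result is 0, which is the "delta of zero" the reply warns about.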
Re: [Ganglia-developers] Possibility of using different serialization format than XDR
I am not necessarily opposed to it if it's implemented in such a way as not to break backwards compatibility. Someone would need to contribute some code. Vladimir On Fri, 26 Jul 2013, Dave Rawks wrote: I'm curious to hear what you think is going to be more efficient, platform agnostic and portable than XDR? ASN1 would be the only thing I would even consider using instead, but it is arguable whether it would be worth the pain of supporting more than one serialization format and it certainly doesn't seem sane to break all backwards compatibility to switch to something new unilaterally. ASN1 /might/ be a reasonable alternative to XDR, but I don't see what advantages this could possibly bring. -Dave On 7/26/13 10:46 AM, Nikhil wrote: Hi, Considering that we have better, more compute-efficient, open binary serialization formats out there, how hard would it be to make Ganglia use them instead of XDR? Can the serialization format engines be pluggable, instead of being closely integrated with XDR? Is it still worth continuing to stick with XDR? The intention is to understand and see the possibility and have a discussion about what could be best to go with, if it's appropriate. I am really hoping to see a reply from the authors of ganglia core :-) Thanks, Nikhil
Re: [Ganglia-developers] Ganglia on top of Quercus
Web UI doesn't connect to gmetad over UDP but over TCP. Not really sure. Vladimir On Tue, 9 Jul 2013, Tim Hawes wrote: Hello, I am in need of some debugging direction here. I have Ganglia 3.1.2 installed and running. The web interface works perfectly fine under php-fcgi. We have a vested interest in getting the web interface to work using the Java php interpreter, Quercus, on Glassfish. Using the same gmond and gmetad as the working implementation, however, I am getting this message from Quercus: There was an error collecting ganglia data (127.0.0.1:8652): XML error: No error at 0 I have tried getting Quercus to report specific php errors to no avail, and the Glassfish logs are silent. I thought at first it may be a poor implementation of fsockopen(), and found there was such a bug in old versions of Quercus for udp calls. That appears to be fixed; this code works under Quercus: <?php $errno = null; $errstr = null; $fd = fsockopen('udp://localhost', 8652, $errno, $errstr); if ($fd) print "success\n"; else print $errstr; ?> indicating a successful UDP connection to gmetad. But the message indicates an XML error, and I am uncertain how I might test this. As soon as I find the issue, I'd like to present a bug report to the Quercus team.
Re: [Ganglia-developers] Graph showing 1K instead of GB's about Disk's in Ganglia/RRDTool
Unfortunately the units for disk are GB, so 1.4k GB is 1.4 terabytes. This was an unfortunate decision; however, changing the units to bytes and using rrdtool's scaling would introduce inconsistencies with older client versions. Vladimir On Thu, 9 May 2013, Valter Silva wrote: I'm using Ganglia and RRDTool to show charts in a web page. Everything is fine, but for some machines the graphs about DISK have some kind of bug. Here is how they look on some machines (both machines are in the same cluster): This one is correct, about the disk space: [image: correct] But this one is always showing 1.4Kb of disk space, which is incorrect. How can I fix this? Any idea? I already uninstalled it and installed it many times, but it doesn't seem to fix the problem. [image: error] I also created a post on Stack Overflow for the same issue.
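The "1.4k GB" label in the reply above comes from an SI prefix being applied on top of a value that is already expressed in GB. A minimal sketch of that effect, assuming base-1000 prefix scaling of the kind a graphing layer applies to axis values; `si_format` is a hypothetical helper written for this illustration, not an rrdtool or Ganglia function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format `value` with a base-1000 SI prefix in front of `unit`.
 * Hypothetical helper; handles non-negative values up to the tera range. */
static void si_format(double value, const char *unit, char *out, size_t outlen)
{
    static const char *prefixes[] = { "", "k", "M", "G", "T" };
    size_t i = 0;

    while (value >= 1000.0 && i < 4) {
        value /= 1000.0;
        i++;
    }
    snprintf(out, outlen, "%.1f %s%s", value, prefixes[i], unit);
}
```

A disk total reported as 1400 with unit "GB" is rendered by this sketch as "1.4 kGB", whereas the same capacity reported as 1400000000000 with unit "B" would be rendered as "1.4 TB". That is the inconsistency the reply refers to: switching the unit to bytes would give cleaner labels, but hosts still reporting in GB would then be scaled differently.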
[Ganglia-developers] Ganglia 3.6.0 pre-release
Ganglia 3.6.0 has been tagged and is available for download in our pre-release directory https://sourceforge.net/projects/ganglia/files/pre-release/ if there are no issues in about a week we'll move it to the official release directory. Release notes can be found here https://github.com/ganglia/monitor-core/wiki/Release-Notes Vladimir
Re: [Ganglia-developers] Integration with Graphite - Missing values in whisper
I imagine that opening and closing TCP connections for each metric doesn't scale. A few days ago we merged a pull request that uses UDP to send metrics to Carbon https://github.com/ganglia/monitor-core/pull/101 that should be far more scalable. Vladimir On Tue, 23 Apr 2013, Maziyar Mirabedini wrote: I did more testing and research on this. We're obviously hitting a bottleneck on this. I monitored the logs and found that the problem is that gmetad opens and closes a connection for every metric it wants to send to Graphite. Also you can only specify one Carbon server and port, so we were stuck. We were able to write a python script that goes directly to the Ganglia port on a server for each cluster, gathers metrics, packages all metrics for each server into one message and sends the metrics to Graphite. On the Graphite side, we have one carbon-relay and 6 carbon-cache instances set up with rules to point each cluster to a carbon-cache. We have 5 scripts running at the same time and it's able to gather, parse and send everything to carbon and have carbon write it to disk within 10 seconds. This is a huge improvement. We'll make the script available sometime soon. On Mon, Apr 22, 2013 at 10:33 AM, Maziyar Mirabedini mmirabed...@gmail.com wrote: Hi there, I recently set up a server that hosts Ganglia 3.5, Graphite 0.9.10, and RRDTool 1.4.7 with RRDCACHED enabled and configured. Then I set up the integration between Ganglia and Graphite by setting the carbon_server, carbon_port and prefix in the gmetad conf file. I configured the heartbeat for each cluster to happen every 60 seconds. The first RRA is configured such that it keeps data every 60 seconds for a week. Since this server is only for monitoring and we have tons of metrics for each server, I modified the Carbon conf file to have the following: MAX_UPDATES_PER_SECOND = inf MAX_CREATES_PER_MINUTE = 100 MAX_QUEUE_SIZE = 10 Whisper retention is configured such that it matches Ganglia.
Once the services were started I found that: 1) All RRD files got created for the clusters and servers. At this point the server is monitoring 5 clusters and in total 94 servers and roughly 500 metrics per server. 2) Fetching the data from RRDs shows that gmetad is able to update every single RRD on time and the data points are there every 1 min. 3) All metrics are appropriately created in Graphite. 4) I noticed that Graphite metrics are not updated as often as RRDs. The updates to metrics seem to happen sporadically. Sometimes one metric is updated every 2 mins; other times it wouldn't get updated for another 6 mins. I haven't seen the metric get updated consistently every 1 min as per RRD retention. I confirmed this by doing a fetch on both whisper and RRDs. Doing a tail on Graphite's update log I can see tons of updates going through, but maybe it's just not fast enough? I don't see any errors in /var/log/messages. Any help would be really appreciated! Thanks!
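The workaround described in this thread, packaging many metrics into one message instead of opening a connection per metric, can be sketched as below. The sketch assumes Carbon's plaintext line format (`path value timestamp`, one metric per line) and only builds the payload; sending it over a single socket is left out. `carbon_append` and the metric paths in the usage note are hypothetical, and this is not the poster's Python script.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one "path value timestamp\n" line to buf, starting at offset len.
 * Returns the new length, or -1 if the line does not fit, in which case
 * buf is left holding only the lines appended so far. */
static int carbon_append(char *buf, size_t cap, size_t len,
                         const char *path, double value, long timestamp)
{
    int n;

    assert(len < cap);
    n = snprintf(buf + len, cap - len, "%s %g %ld\n", path, value, timestamp);
    if (n < 0 || (size_t)n >= cap - len) {
        buf[len] = '\0';   /* drop the partial line */
        return -1;
    }
    return (int)(len + (size_t)n);
}
```

A caller would append every metric for a host (for example `ganglia.web01.load_one` and `ganglia.web01.cpu_user`, both made-up names) into one buffer and write it with a single send, so the connection setup cost is paid once per batch rather than once per metric.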
Re: [Ganglia-developers] Sending aggregated cluster metrics to Graphite
I think this may be due to a different usage pattern for graphite users, i.e. you could aggregate individual values using the graph composer. Vladimir On Mon, 15 Apr 2013, Nicholas Satterly wrote: Hi, We're looking at using the support for sending ganglia metrics to graphite; however, I've just worked out that aggregated cluster metrics are not sent. Can anyone explain why this might be the case? Could it be because you would actually need to send two metrics for every cluster metric, i.e. the num and sum? Even so, is that an issue? Thanks, Nick
Re: [Ganglia-developers] after XML_ParseBuffer() error: gmetad, port 8652 becomes unresponsive
Run the XML output through xmllint, e.g. something like nc localhost 8651 | xmllint - may give you hints. On Fri, 5 Apr 2013, Ramon Bastiaans wrote: Ah. I also suspect some weird gmetric to cause this, but so far have not been able to find it in the XML unfortunately. Well, regardless of the cause, I think it should not cause the interactive port to stop responding and the web interface to hang. Having a quick look at the source of gmetad I was not able to find where this might originate. Perhaps the web interface could fall back to port 8651 if port 8652 times out. - Ramon P.S. pbs-python is still alive and well. If you mean Job Monarch, I have been working hard recently on a new release and it is near (99%) finished. ;) pbswebmon is a completely different project which SARA is not associated with or has any role in. As of January 2013, SARA has a new name: SURFsara. ing. Ramon Bastiaans - Senior Systems Programmer - Cluster Computing | Operations, Support Development | SURFsara | Science Park 140 | 1098 XG Amsterdam | T +31 (0)20 592 30 00 | ramon.bastia...@surfsara.nl | www.surfsara.nl | On 4 apr. 2013, at 18:52, Chris Hunter chris.hun...@yale.edu wrote: Hi, We have seen this before (ganglia-gmond 3.2) when there are whitespace or non-alphanumeric characters in custom gmetrics. PS I hope pbs-python/pbswebmon are still active... Hi, We have been experiencing a weird issue with gmetad. I am running gmetad v3.4.0. Once in a while now an XML error seems to occur. Like this: /usr/sbin/gmetad[12241]: Process XML (LISA Cluster): XML_ParseBuffer() error at line 525626: Besides what is causing that and why, this is causing the Ganglia web front end to hang and become unresponsive. After checking gmetad it seems port 8652 is no longer responding to queries. This does nothing: # telnet localhost 8652 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. /LISA Cluster after about 1 minute Connection closed by foreign host.
However port 8651 still works: # telnet localhost 8651 | wc -l Connection closed by foreign host. 921410 And when I switch the web frontend from port 8652 back to port 8651 ($conf['ganglia_port'] = 8651;), the web page responds and works again. After restarting gmetad port 8652 also becomes responsive again. It almost seems gmetad has a thread that lost its way or something. Any idea what may be causing this (besides the XML error)? It seems weird to me if 1 port works and the other does not anymore. It might be a bug. I have a dump of the XML (from port 8651 before restarting) available for whoever might want it, but it is 42 MB. Kind regards, - Ramon. As of January 2013, SARA has a new name: SURFsara. ing. Ramon Bastiaans - Senior Systems Programmer - Cluster Computing | Operations, Support Development | SURFsara | Science Park 140 | 1098 XG Amsterdam | T +31 (0)20 592 30 00 | ramon.bastia...@surfsara.nl | www.surfsara.nl |
[Ganglia-developers] Ganglia Web 3.5.7 released
Ganglia Web 3.5.7 has been released since some of the Javascript files were left out of packaging. https://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.7/ Vladimir
[Ganglia-developers] Ganglia Web 3.5.6 released
Ganglia Web 3.5.6 has been released. Major changes are: * A number of fixes to address XSS (Cross Site Scripting) issues * Enhancement to the host view if you use the option metric_groups_initially_collapsed: clicking on metric groups dynamically loads images instead of reloading the page * Incorporate the legend in the selection when doing Inspect graph. You can find the announcement here http://ganglia.info/?p=566 Everyone with publicly available Ganglia installs is advised to upgrade ASAP. You can read more on Cross Site Scripting on Wikipedia http://en.wikipedia.org/wiki/Cross-site_scripting Vladimir
[Ganglia-developers] Ganglia 3.5.0 pre-release
I have packaged the latest version of Ganglia 3.5.0, available here http://sourceforge.net/projects/ganglia/files/pre-release/ganglia-3.5.0.tar.gz/download Major changes are - Separate thread in gmond to handle connections from gmetad. - New metrics e.g. cpu_steal - Improvements to Python collection scripts - Misc bug fixes I have been running it for about 2 weeks in production and haven't seen any major issues. Please give it a try, and if there are no issues reported by the end of the week it will be released as 3.5.0. Thanks, Vladimir
Re: [Ganglia-developers] ganglia.info down
Thinking whether we ought to move it off to something like Github Pages. Thoughts ? Vladimir On Mon, 5 Nov 2012, Alex Dean wrote: On Nov 5, 2012, at 2:08 PM, Nicholas Satterly wrote: Looks ok to me and to isup... http://www.isup.me/ganglia.info/ --Nick. Looks ok from here now as well.
Re: [Ganglia-developers] Gmetad query question
I haven't checked the code to look into why it returns full XML output on non-matches however I would recommend taking a look at the Nagios integration with Ganglia Web that could provide you with similar functionality https://github.com/ganglia/ganglia-web/tree/master/nagios Vladimir On Tue, 25 Sep 2012, Mikko Herranen wrote: I would like to ask about the gmetad query interface. If you try to query a non-existent subtree, you get the entire xml dump in return. This is because the process_path function in server.c contains code that returns the entire xml dump whenever a part of the query path is not recognized. Is this really intentional and what is the rationale for that? Could it be changed to return nothing? I'm asking because we use the query interface to reduce network traffic when we need only a single metric. Unfortunately, we don't know in advance which nodes have which metrics. If we query for a node/metric combination that does not exist, we get the full xml dump which is counterproductive. There are ways to fix this on our side, such as keeping track of the available node/metric combinations. However, they are not as clean as fixing the root cause, assuming it can be fixed.
Re: [Ganglia-developers] override_ip causing gmond to crash
IIRC we tried to use APR for portability but we saw crashes in that piece of code on certain platforms (Ubuntu comes to mind). We could try to fix it the right way again.

Vladimir

On Wed, 26 Sep 2012, Nicholas Satterly wrote:

Hi, I've discovered that on some of our systems (perhaps only half a dozen out of 500 or so) gmond crashes if the override_ip configuration option is set. I've worked out that the problem is something to do with this block of code...

    #if 1
      char* tmpstr = malloc( strlen(( override_ip != NULL ? override_ip : override_hostname )) + strlen( override_hostname ) + 1 );
      strcpy (tmpstr, (char *)( override_ip != NULL ? override_ip : override_hostname ) );
      strcat (tmpstr, ":");
      strcat (tmpstr, (char *) override_hostname);
      cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = tmpstr;
    #endif
    #if 0
      cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = apr_pstrcat(gm_pool, (char *)( override_ip != NULL ? override_ip : override_hostname ), ":", (char *) override_hostname, NULL);
    #endif

What I'm trying to understand at the moment is why the apr_pstrcat version is #if 0 commented out when it seems to work OK during my testing.

Thanks, Nick

-- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] ERROR: parameter 'foo' does not represent a number in STACK
There is a fix in 3.5.3 dealing with adding data sources that have NAN values which may be causing this issue. I have clusters with 200+ machines and I do not see this particular behavior in 3.5.3. Vladimir On Mon, 17 Sep 2012, John Desantis wrote: Hello all, I originally posted this message to the ganglia-users list, but I never received a response so I figured I'd go ahead and email the ganglia-developers list. As with usual mailing list respect, I apologize if I have somehow missed a pre-existing thread that covered the error that I'm experiencing and cannot seem to solve. I've noticed that my larger clusters (60+ nodes) don't report stacked graphs properly, i.e. (and specifically!) load_one. All I get is a broken image link with the following error log snippet taken from an Apache log: ERROR: parameter 'a139' does not represent a number in line STACK:a139#E50009:foo_hostname I also ran the rrdtool command on the console itself (taken by adding debug=5 into the graph URL) afterwards and got the same error: ERROR: parameter 'a139' does not represent a number in line STACK:a139#E50009:foo_hostname I've tried updating rrdtool from 1.2.27 to 1.4.5-1; taking a shot in the dark by updating stacked.php per https://github.com/ganglia/ganglia-web/issues/114 and https://github.com/jbuchbinder/ganglia web/commit/727b1ffc9e3e8a2979ea23107acefca53ef5ddf0 (although stacked.php in ganglia-web 3.5.2 already has this in place); adjusting timeouts within Apache and PHP (another shot in the dark); removing the load_one.rrd from /var/lib/ganglia/rrds... directory; restarts of both gmetad and gmond (although it definitely seems to be an rrdtool error). I've searched through the rrdtool user mailing list and haven't found any useful posts. I was wondering if anyone else has run into this issue before and if they were able to get it resolved and if so, what suggestions one may have. 
Thanks for your time, John DeSantis -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] 3.4.1. release ?
There are a number of small fixes and one enhancement e.g. report CPU steal which I feel would be a great improvement. I don't believe we need to go all the way to 3.5.0 to roll those out. We can but then we have proliferation of versions any time we add any enhancements. I do not want to wait until end of August. I personally prefer to release often. Most of these fixes have been in the trunk for 2 months already so I see no reason to delay any further. Vladimir On Wed, 15 Aug 2012, Daniel Pocock wrote: On 15/08/12 03:22, Vladimir Vuksan wrote: All of the recent ones. Typically, the 3.4.1 release would ONLY have essential changes that won't break any existing installation - bug fixes, no new features If some of the changes do have any risk or require mandatory config change, they will be released in 3.5.0 I haven't gone over them yet, but do you believe they all belong in 3.4.1? Personally, I feel it is not a good idea to release anything in August because people are on vacation and there is an upcoming deadline for the book - people are going over tech review feedback and may not thoroughly test any release candidate of 3.4.1 Do you think it is fair to delay 3.4.1 or 3.5.0 until mid-September? Thanks, Vladimir On Tue, 14 Aug 2012, Daniel Pocock wrote: On 13/08/12 22:59, Vladimir Vuksan wrote: I think we should go ahead and release 3.4.1. Anyone wants to do the deed :-)? Which features should be cherry picked from trunk? -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. 
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] 3.4.1. release ?
All of the recent ones. Thanks, Vladimir On Tue, 14 Aug 2012, Daniel Pocock wrote: On 13/08/12 22:59, Vladimir Vuksan wrote: I think we should go ahead and release 3.4.1. Anyone wants to do the deed :-)? Which features should be cherry picked from trunk? -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
[Ganglia-developers] 3.4.1. release ?
I think we should go ahead and release 3.4.1. Anyone wants to do the deed :-)? Vladimir -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] add extras parsing to json graph-definitions
What Alex said. - Logarithmic is universal so that shouldn't be controversial. - Lower-limit and upper-limit are already implemented. - Rigid is something very rrdtool-specific so I want to punt on it at this time. Vladimir On Thu, 19 Jul 2012, Alex Dean wrote: On Jul 19, 2012, at 9:12 AM, Jeff Buchbinder wrote: On Thu, Jul 19, 2012 at 8:54 AM, Jochen Hein joc...@jochen.org wrote: Vladimir Vuksan vli...@veus.hr writes: I would define a scaling factor or some other variable. I do want to steer away from having tool specific options unless absolutely necessary. I agree that would be a useful goal, I just have no idea what options I may need for my special problem [see my mail to ganglia-general]. I've had a look at the other PHP-reports. A couple of them pass the option '--rigid' to rrdtool. Other used options are --logarithmic and --lower-limit. I've no idea how that could be mapped into json and keep the syntax and the parsing simple. I'd suggest something like: "options": { "logarithmic": true, "rigid": false, "lower-limit": 0 } with sensible defaults. If it's ignored by another graphing toolkit, that's fine. Silently ignoring unsupported options seems like a bad choice to me, because it makes graph authoring troubleshooting more difficult. If someone uses an unsupported option, we should blow up with an informative error message ASAP. Think about when you've misspelled a configuration option (in any system, not just gweb), and wasted a lot of time hunting for the cause of the odd results you see - totally frustrating, and totally preventable. More generally: The JSON format should be focused on making the simple common cases easy. The format should be small, simple, easy to use. Allowing lots of platform-specific options dilutes this value. The PHP route is still available for the non-trivial cases. If you need RRD-specific features, then I think PHP is the way to go. Presenting platform-specific options in a seemingly platform-agnostic way is worse. 
If we are going to support JSON options which are really RRD-specific, we need to make this clear. One suggestion on how to do this (which I don't like as well, but I'll mention it anyway): Put RRD-specific options into something like {"rrd-options": "--color --rigid --whatever"}. This at least gives someone else a clue that the JSON graph is intended only for use with RRD, rather than leaving it up to gweb to determine how to interpret "logarithmic" or "rigid" for any graphing platform we want to try supporting. alex -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
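The fail-fast behaviour Alex argues for can be sketched in a few lines. This is only an illustration under stated assumptions: the option names come from this thread, while `KNOWN_OPTIONS` and `validate_graph_options` are hypothetical names and not how gweb actually parses JSON graph definitions. An unknown or mistyped option raises immediately with a message that lists what is supported.

```python
# Hypothetical whitelist: option name -> accepted Python types.
KNOWN_OPTIONS = {
    "logarithmic": (bool,),
    "lower-limit": (int, float),
    "upper-limit": (int, float),
}


def validate_graph_options(options):
    """Raise on unknown or mistyped options instead of silently ignoring them."""
    for name, value in options.items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(
                "unsupported graph option %r; supported options: %s"
                % (name, ", ".join(sorted(KNOWN_OPTIONS)))
            )
        allowed = KNOWN_OPTIONS[name]
        # bool is a subclass of int, so reject it explicitly for numeric options.
        if isinstance(value, bool) and bool not in allowed:
            raise TypeError("option %r expects a number, got a boolean" % name)
        if not isinstance(value, allowed):
            raise TypeError("option %r has unexpected type %s" % (name, type(value).__name__))
    return options


print(validate_graph_options({"logarithmic": True, "lower-limit": 0}))
try:
    validate_graph_options({"logarithmc": True})  # misspelled on purpose
except ValueError as err:
    print(err)
```

Keeping the whitelist small matches the point made above that the JSON format should cover the simple common cases and leave tool-specific needs to the PHP reports.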
Re: [Ganglia-developers] web and debian/* files
Exactly. The other day I was actually missing the debian/ directory in ganglia-3.4.0 and as a result couldn't build it due to short timeline. Can we put stuff like that back in. If Debian guys don't like it we can strip it out for them. Vladimir On Thu, 19 Jul 2012, Bernard Li wrote: Hi Daniel: One reason why we would want to keep the debian/ directory in our repo is if for whatever reason the upstream Debian package doesn't get updated, a user could still download our official tarball and build Debian packages directly. Cheers, Bernard On Thu, Jul 19, 2012 at 7:41 AM, Daniel Pocock dan...@pocock.com.au wrote: The web download includes a debian/ directory with files for building a Debian package Debian also keeps a separate set of files for the same purpose in the Debian git VCS: git.debian.org/git/collab-maint/ganglia-web.git When importing release tarballs into the Debian VCS, it is necessary to filter out the upstream version of the debian/* files, they don't get used at all: git-import-orig -u 3.5.2 --filter='debian/*' ~/Downloads/ganglia-web_3.5.2.orig.tar.gz To simplify things and avoid duplication, debian/* could be omitted from future gweb releases (and even removed from the git repo) Would anyone object to that? I believe that the main advantage of tracking these files in Debian's git is that any Debian developer can update them and update an emergency bug fix release at any time. -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. 
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] add extras parsing to json graph-definitions
I don't like it. It's rrdtool specific. I want something a little bit more generic that gets translated to rrdtool commands.

Vladimir

On Wed, 18 Jul 2012, Jeff Buchbinder wrote:

On Wed, Jul 18, 2012 at 5:06 AM, Jochen Hein joc...@jochen.org wrote:

Hi, I'm working on getting ganglia to display correct units for one custom graph. One of the first tries was to give some extra option to rrdtool. When using php-graphs, this is done via $rrdtool_graph[ 'extras' ], but there is nothing for json graphs. That simple patch (against 3.4.2) should fix that:

    --- ganglia/graph.php.orig 2012-07-17 16:25:37.0 +0200
    +++ ganglia/graph.php 2012-07-18 10:44:45.0 +0200
    @@ -28,6 +28,7 @@
     sanitize( $graph_config[ 'vertical_label' ] );
     }
    + $rrdtool_graph[ 'extras' ] = $graph_config[ 'extras' ];
     $rrdtool_graph['lower-limit'] = '0';
     if( isset($graph_config['height_adjustment']) ) {

Jochen

I'll apply that patch to the tree. Thanks!

Jeff

-- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] add extras parsing to json graph-definitions
I would define a scaling factor or some other variable. I do want to steer away from having tool specific options unless absolutely necessary. Vladimir On Wed, 18 Jul 2012, Jeff Buchbinder wrote: On Wed, Jul 18, 2012 at 9:55 AM, Vladimir Vuksan vli...@veus.hr wrote: I don't like it. It's rrdtool specific. I want something a little bit more generic that gets translated to rrdtool commands. Eventual suggestion: we define a function to convert between whatever extras format we're using and target rendering tools. It's probably a good idea to have it be an array rather than just plain text. -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
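Jeff's suggestion of a conversion function between a generic options array and a target rendering tool could be sketched like this. It is a minimal sketch, not gweb code: `to_rrdtool_args` is a hypothetical name, the option keys are the ones mentioned in this thread, and the only rrdtool graph flags used are `--logarithmic`, `--lower-limit` and `--upper-limit`.

```python
def to_rrdtool_args(options):
    """Translate tool-neutral graph options into rrdtool graph arguments."""
    args = []
    if options.get("logarithmic"):
        args.append("--logarithmic")
    for key in ("lower-limit", "upper-limit"):
        if key in options:
            # These generic names happen to match the rrdtool flag names.
            args.extend(["--" + key, str(options[key])])
    return args


print(to_rrdtool_args({"logarithmic": True, "lower-limit": 0}))
```

A second renderer would get its own translation function over the same dictionary, which keeps the JSON graph definition itself free of tool-specific flags.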
[Ganglia-developers] Ganglia Web 3.5.2 released
http://ganglia.info/?p=552

Ganglia Web 3.5.2 has been released. Major changes are

- Fix for stacked graphs not showing after upgrading to 3.5.1
- Inspect graph now uses AJAX calls to retrieve data which should help in situations where users use Basic authentication

You can find release notes here https://github.com/ganglia/ganglia-web/wiki/Release-Notes

Download the release from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.2/

-- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
[Ganglia-developers] [SECURITY] [IMPORTANT] Security issue in Ganglia Web
There is a security issue in Ganglia Web going back to at least 3.1.7 which can lead to arbitrary script being executed with web user privileges, possibly leading to a machine compromise. The issue has been fixed in the latest version of Ganglia Web, which can be downloaded from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.1/ If you are running Ganglia Web open on the internet, you are advised to upgrade ASAP or at a minimum password-protect access to Ganglia Web. We'll have a write-up about the details of the vulnerability in a few days. Sincerely, Vladimir -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] [Ganglia-general] Overlay timeshifted data
That depends. We bumped up resolution in 3.3.0+ to store last 2 weeks at 15 second resolution. Vladimir On Thu, 31 May 2012, Chris Burroughs wrote: Really exciting. But I'm confused how this works with the round robin nature of RRD. Don't we by default only have (for example) daily data for past 24 hour period, not 48 hours? On 05/16/2012 07:54 PM, Vladimir Vuksan wrote: There is a blog post about a new feature in Ganglia Web called overlay timeshifted data http://ganglia.info/?p=543 -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-general mailing list ganglia-gene...@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-general ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
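The arithmetic behind Chris's question and Vladimir's answer can be worked through in a short sketch. The two-week retention and 15-second step are the figures quoted above; the helper names are hypothetical, and the archives actually present on a given install depend on how its RRDs were created. An overlay needs data covering the displayed window plus the shift, all at the resolution being plotted.

```python
DAY = 24 * 3600


def rows_needed(retention_seconds, step_seconds):
    """Number of rows an archive needs to hold retention_seconds at step_seconds."""
    return -(-retention_seconds // step_seconds)  # ceiling division


def can_overlay(window_seconds, shift_seconds, retention_seconds):
    """True if the archive reaches back far enough for the shifted window."""
    return window_seconds + shift_seconds <= retention_seconds


retention = 14 * DAY
print(rows_needed(retention, 15))            # rows for two weeks at 15 s
print(can_overlay(DAY, DAY, retention))      # today overlaid with yesterday
print(can_overlay(retention, retention, retention))  # two weeks vs. the prior two weeks
```

With only 24 hours retained at the finest resolution, the same check for a one-day window shifted by one day would fail, which is the situation Chris describes.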
[Ganglia-developers] Overlay timeshifted data
There is a blog post about a new feature in Ganglia Web called overlay timeshifted data http://ganglia.info/?p=543 -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Overlay timeshifted data
Obviously this is a first pass at this feature. I have thought about comparing arbitrary time periods but I thought a generic UI for that may be tricky to implement. Also currently this allows overlaying only a single metric. It is possible to do overlaying on aggregate graphs or reports but then it may get messy since you double the number of lines. We'll see. Vladimir On Wed, 16 May 2012, Aaron Nichols wrote: On Wed, May 16, 2012 at 5:54 PM, Vladimir Vuksan vli...@veus.hr wrote: There is a blog post about a new feature in Ganglia Web called overlay timeshifted data http://ganglia.info/?p=543 This is awesome. We've been working on a similar ability just using php based reports. Two additional things we wanted to be able to do besides just comparing one time period to the previous: a) Compare two arbitrary time periods. In our case we run performance / load tests and want to be able to compare a loadtest run X minutes ago with one run Y minutes ago. The delta between X and Y is the timeshift. We can do this for any given interval (1h, 2h, 1d, 1w, etc) b) Rather than just show the two metrics overlaid, we wanted to calculate the % difference between two periods. This becomes more useful in production where we want to know if a particular trend has changed by more than N% over some period of time (say post-deployment). Down the road we also want to try to normalize the previous period by averaging multiple previous periods. For example, compare Monday this week w/ the average of the last 3 Mondays. I can pass along a blog post here with more details when I get it written up - but being able to do those two things was pretty crucial to us and may be useful to others. Thanks for the continued work on this - awesome progress. Aaron -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. 
Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
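The two calculations Aaron describes, the % difference between two periods and a baseline averaged over several previous periods, can be sketched as below. This is an illustration only: the function names and sample numbers are hypothetical, and it assumes the series have already been fetched and aligned sample by sample.

```python
def percent_change(current, previous):
    """Pointwise % difference of current vs. previous; None where previous is 0."""
    changes = []
    for cur, prev in zip(current, previous):
        if prev == 0:
            changes.append(None)  # undefined, avoid dividing by zero
        else:
            changes.append((cur - prev) / prev * 100.0)
    return changes


def baseline(periods):
    """Pointwise mean of several aligned previous periods (e.g. the last 3 Mondays)."""
    return [sum(values) / len(values) for values in zip(*periods)]


def exceeds(changes, n_percent):
    """True if any defined change is larger in magnitude than n_percent."""
    return any(c is not None and abs(c) > n_percent for c in changes)


last_mondays = [[100, 200], [110, 190], [90, 210]]
this_monday = [120, 200]
changes = percent_change(this_monday, baseline(last_mondays))
print(changes, exceeds(changes, 15))
```

Averaging several earlier periods before comparing smooths out a single unusual week, which is the normalisation mentioned at the end of the message.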
[Ganglia-developers] Ganglia Web 3.4.2 released
Ganglia Web 3.4.2 has been released. Notable changes are

* Improvements to the live dashboard
* Fixed the aggregate graphs metric auto complete which broke in 3.4.1
* Add ability to specify critical and warning thresholds. Use in Live Dashboard and Views.
* Minor bug fixes

Release notes can be found here if you want to see changes from previous releases https://github.com/ganglia/ganglia-web/wiki/Release-Notes

You can download latest release from https://sourceforge.net/projects/ganglia/files/ganglia-web/

Vladimir

-- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Ganglia Web 3.4.1 released
Yes, this will work with any version of Ganglia Core. Vladimir On Fri, 4 May 2012, Steven A. DuChene wrote: Will this work on top of any version of ganglia or do I need a minimum version of ganglia installed to use this? I installed it on top of ganglia-3.2.0 and it seemed to work but my other ganglia install has 3.1.7 so I am wondering if this will work there. -- Steven DuChene -Original Message- From: Vladimir Vuksan vli...@veus.hr Sent: May 2, 2012 7:08 PM To: ganglia-developers@lists.sourceforge.net, ganglia-gene...@lists.sourceforge.net Subject: [Ganglia-developers] Ganglia Web 3.4.1 released With release 3.4.0 of Ganglia Core we are gonna split off Ganglia Web, as development of the web interface goes much quicker and usually there is no need to upgrade gmond/gmetad when you upgrade Ganglia Web. Today we are releasing Ganglia Web 3.4.1. You can download it from https://sourceforge.net/projects/ganglia/files/ganglia-web/3.4.1/ Release notes can be found here https://github.com/ganglia/ganglia-web/wiki/Release-Notes Vladimir -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] 3.3.7 tagged (release candidate)
No objections to 3.3.7. On Fri, 27 Apr 2012, Daniel Pocock wrote: On 20/04/12 05:56, Bernard Li wrote: BTW, I can't seem to find the 3.3.7 tarball in the pre-release section, the most recent release is 3.3.6. I'm not sure what happened, either I forgot to click the button to confirm the upload, or it isn't on the mirrors (sometimes it takes hours, sometimes very quick) I've tried to upload again, please let me know if you still can't get it Please also let me know if it now works for the hosts you mention below. I see there have been 17 downloads Can anyone confirm that 3.3.7 has resolved the Solaris container issue? Are there any objections to making 3.3.7 the official release today? -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] gmond udp receive buffer errors
Right. I have a few VM aggregators as well as physical hardware. VMs have more issues than physical hardware, but physical hardware is still susceptible to loss. This is very evident with metrics that arrive at the same time, e.g. cron-triggered gmetric jobs. Also something unexpected happened. I have two VMs that are a pair, i.e. all nodes send metrics to both so that in case one fails we still have metrics. I upgraded e.g. aggregator2. I did not touch aggregator1, yet UDP errors vanished on aggregator1 as well. Puzzling. Vladimir On Mon, 23 Apr 2012, Daniel Pocock wrote: On 23/04/12 22:24, Vladimir Vuksan wrote: I was having identical issues. I used your patch with the exception that I bumped up buffer size first to 10M from 1M you had. There was a massive improvement but still was seeing some drops so I just decided to bump it up to 30M and it's even better although I still see occasional drops. If you have such a big buffer, then you could also have latency issues, as it suggests your CPU is just not able to process all the work in time You would either need to revise the workload (by splitting clusters, etc) or re-write gmond to be multithreaded (so it can use more cores) -- For Developers, A Lot Can Happen In A Second. Boundary is the first to Know...and Tell You. Monitor Your Applications in Ultra-Fine Resolution. Try it FREE! http://p.sf.net/sfu/Boundary-d2dvs2 ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
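One way to watch the UDP error counters discussed in this thread on a Linux aggregator is to read the `Udp:` rows of `/proc/net/snmp`. The sketch below is an assumption about how one might do that, not part of Ganglia: `parse_udp_counters` is a hypothetical helper, the sample text is made up, and the file only exists on Linux.

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp-style text."""
    rows = [line.split() for line in snmp_text.splitlines() if line.startswith("Udp:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


# Made-up sample in the same layout as /proc/net/snmp.
SAMPLE = """Tcp: RtoAlgorithm RtoMin
Tcp: 1 200
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 120034 12 57 98811 57 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
UdpLite: 0 0 0 0 0 0
"""

counters = parse_udp_counters(SAMPLE)
print(counters["InErrors"], counters["RcvbufErrors"])
```

On a real host the text would come from `open("/proc/net/snmp").read()`; sampling it before and after a burst of cron-triggered gmetric jobs shows whether the counters move.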
Re: [Ganglia-developers] gmond udp receive buffer errors
I was having identical issues. I used your patch with the exception that I bumped up buffer size first to 10M from 1M you had. There was a massive improvement but still was seeing some drops so I just decided to bump it up to 30M and it's even better although I still see occasional drops. To really see the effect you need to track udp_inerrors in addition to rcvbuffer. Vladimir On Mon, 23 Apr 2012, Ramon Bastiaans wrote: Ah ok. Before you sent your email I had already created a small patch for myself. It almost seems that APR ignores the OS settings (i.e.: net.core.rmem_default) and creates a socket with its own default (receive) buffer size. Attached is a patch against 3.3.6 for lib/apr_net.c that stops the receive buffer errors for me. The patch sets the buffer size a bit bigger, although I'm not sure what would be a sensible size for gmond. I would think if you have a large cluster with lots of UDP traffic you would need a bigger receive buffer than for smaller systems. I will try out 3.3.7 and see what its debug output says on buffer sizes. -- For Developers, A Lot Can Happen In A Second. Boundary is the first to Know...and Tell You. Monitor Your Applications in Ultra-Fine Resolution. Try it FREE! http://p.sf.net/sfu/Boundary-d2dvs2 ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
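The socket option at the centre of this thread can be illustrated with a short sketch. It uses the plain Python socket API rather than APR, so it is not the patch to lib/apr_net.c; `open_udp_socket` is a hypothetical helper and the 1 MB request is just the figure mentioned above. It requests a receive buffer size and reads back what the OS actually granted, which can differ from the request.

```python
import socket


def open_udp_socket(requested_rcvbuf):
    """Create a UDP socket, request a receive buffer size, return (socket, granted size)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_rcvbuf)
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


sock, granted = open_udp_socket(1024 * 1024)
print("requested 1048576 bytes, kernel reports", granted)
sock.close()
```

On Linux the reported value is typically capped by net.core.rmem_max and doubled for kernel bookkeeping, so reading it back is the only reliable way to know what a 10M or 30M request really produced.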
Re: [Ganglia-developers] Proposed fixes for memory leak [monitor-core] Fixes/bz327 (#30) (fwd)
Sounds good. Let's go with 3.3.5 and put in a fix in 3.3.6. Thanks, Vladimir On Mon, 2 Apr 2012, Daniel Pocock wrote: On 02/04/12 20:10, Vladimir Vuksan wrote: Anyone want to look over this pull request and merge it if it looks good? Even if the fixes are perfect, we would still need to push back the official release by another 7 - 10 days (because of Easter) Therefore, I propose: a) we release 3.3.5 as-is, because none of these things is a regression (in other words, the same bugs exist in the current public release 3.3.1) b) the fixes go onto master and the 3.3 release branch c) I'll make a 3.3.6 tarball and put it on the pre-release download section, maybe Wednesday d) if that tarball is good, it becomes the official release on 16 April -- Better than sec? Nothing is better than sec when it comes to monitoring Big Data applications. Try Boundary one-second resolution app monitoring today. Free. http://p.sf.net/sfu/Boundary-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
[Ganglia-developers] Proposed fixes for memory leak [monitor-core] Fixes/bz327 (#30) (fwd)
Anyone want to look over this pull request and merge it if it looks good? Vladimir
-- Forwarded message --
Date: Mon, 2 Apr 2012 10:58:00 -0700
From: Kostas Georgiou
To: Vladimir Vuksan vl...@vuksan.com
Subject: [monitor-core] Fixes/bz327 (#30)
Fix for bz #327. First two commits should be safe but the last commit needs some testing.
You can merge this Pull Request by running: git pull https://github.com/georgiou/monitor-core fixes/bz327
Or you can view, comment on it, or merge it online at: https://github.com/ganglia/monitor-core/pull/30
-- Commit Summary --
* Only reset the channels once every 60 * APR_USEC_PER_SEC
* Fixes memory leak in join_mcast
* Fix memory leaks by splitting reset_mcast_channels out of setup_listen_channels_pollset
-- File Changes --
M gmond/gmond.c (85)
M lib/apr_net.c (16)
-- Patch Links --
https://github.com/ganglia/monitor-core/pull/30.patch
https://github.com/ganglia/monitor-core/pull/30.diff
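The first commit in the summary limits how often the channels are reset. The following Python sketch shows that kind of throttle in general terms; it is not the code from the pull request. The class name is hypothetical, APR_USEC_PER_SEC is taken as one million microseconds, and allowing a reset on the very first call is an assumption.

```python
APR_USEC_PER_SEC = 1000000                   # microseconds per second, as in APR
RESET_INTERVAL_USEC = 60 * APR_USEC_PER_SEC  # the interval named in the commit summary


class ChannelResetThrottle(object):
    """Allow a reset at most once per RESET_INTERVAL_USEC (hypothetical helper)."""

    def __init__(self):
        self.last_reset_usec = None

    def should_reset(self, now_usec):
        """Return True and record the time if a reset is due, otherwise False."""
        due = (self.last_reset_usec is None or
               now_usec - self.last_reset_usec >= RESET_INTERVAL_USEC)
        if due:
            self.last_reset_usec = now_usec
        return due


if __name__ == "__main__":
    throttle = ChannelResetThrottle()
    for second in (0, 30, 60, 61, 125):
        print(second, throttle.should_reset(second * APR_USEC_PER_SEC))
```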
Re: [Ganglia-developers] release/3.3 branch created
I don't really see a point in branching at this point. We have so few committers and commits that having to maintain separate branches is unwarranted at this time. If this becomes an issue in the future I would address it at that time. Vladimir
On Tue, 27 Mar 2012, Daniel Pocock wrote: I've created a new branch in git for 3.3 releases: https://github.com/ganglia/monitor-core/branches/release/3.3
daniel@lt2:~/ws/ganglia/ganglia-git$ git pull
Already up-to-date.
daniel@lt2:~/ws/ganglia/ganglia-git$ git branch release/3.3
daniel@lt2:~/ws/ganglia/ganglia-git$ git branch
* master
  release/3.3
daniel@lt2:~/ws/ganglia/ganglia-git$ git push origin release/3.3
Total 0 (delta 0), reused 0 (delta 0)
To ssh://g...@github.com/ganglia/monitor-core.git
 * [new branch] release/3.3 -> release/3.3
Further 3.3.x releases should be made from the branch, and tags should be made on the branch (I've updated the release procedure with comments about this). Any bug fixes or new features should continue to go on master, and then they can be selectively merged into the release branch if necessary. I would propose that the merge policy is not too restrictive: while a tarball is in release candidate status (e.g. 3.3.5 at present), only the release manager should merge/backport from master. At all other times, any committer can merge bug fixes onto the release branch. I'd also propose the same for the web repo, but I haven't created any branch there yet as I'd like to get feedback on the strategy first.
Re: [Ganglia-developers] release/3.3 branch created
I understand that. What I am trying to say is that it adds complexity, since you have to merge things back to branches, test them, etc. This adds additional overhead to our already thin resources. We had a 3.0 branch and there was talk of needing to release 3.0.8, but it never came about due to lack of resources/disinterest. Therefore I'd like to dump branches for now and just stay on mainline. Vladimir
On Tue, 27 Mar 2012, Daniel Pocock wrote:
On 27/03/2012 15:28, Vladimir Vuksan wrote: In my mind it doesn't. It just adds the job of merging down the line. I prefer to work on the trunk since that forces you to actually make things work and not defer changes since you have them sitting on a branch. I understand there are pros and cons to both approaches. I just don't see that as a problem that needs to be solved at this time. I would prefer we spent our time on fixing/improving things instead of process.
I am only talking about release branches, not feature branches. A feature branch is something I create (e.g. to overhaul the autotools stuff) and I am responsible for merging that in to master (trunk) later when I think it is ready. Thanks to the wonders of git, people are free to do that if and when they please. In contrast, a release branch basically draws a line underneath a minor release (e.g. 3.3 series). It never has to be merged back to master.
Typically, bug fixes are done on master, and then the bug fix is selectively merged into the release branch, but only if it meets certain criteria:
- only if the bug fix relates to the release
- only if the bug is over some threshold of severity
Example timeline:
new feature A
branch 3.3
release 3.3.0
new feature B
branch 3.4
release 3.4.0
find and fix bug in B (b1)
release 3.4.1
find and fix bug in A (b2)
release 3.3.1 and 3.4.2
In the above example, notice that nobody has to bother merging the fix for (b1) back into the release/3.3 branch.
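The example timeline can be restated as a small Python sketch. It models only the first criterion (does the fix relate to the release) and ignores the severity threshold. The table of which branch contains which feature is read off the timeline above, and the function name is hypothetical.

```python
# Features present on each release branch, following the example timeline:
# feature A landed before branch 3.3, feature B only before branch 3.4.
BRANCH_FEATURES = {
    "release/3.3": {"A"},
    "release/3.4": {"A", "B"},
}


def branches_needing_fix(buggy_feature, branch_features=BRANCH_FEATURES):
    """Return the release branches that contain the buggy feature, sorted by name."""
    return sorted(branch for branch, features in branch_features.items()
                  if buggy_feature in features)


if __name__ == "__main__":
    print("b1 (bug in B):", branches_needing_fix("B"))
    print("b2 (bug in A):", branches_needing_fix("A"))
```

The fix for b1 touches only release/3.4, which matches the single 3.4.1 release in the timeline, while the fix for b2 touches both branches and yields 3.3.1 and 3.4.2.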
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
You will need to tag the monitor-core release, then run scripts/package-ganglia-release 3.3.2 from monitor-core. It will pull in the ganglia-web submodule in the tree. Vladimir
On Sat, 10 Mar 2012, Daniel Pocock wrote:
On 09/03/12 16:57, Daniel Pocock wrote:
On 09/03/12 15:42, Carlo Marcelo Arenas Belon wrote:
On Thu, Mar 08, 2012 at 04:34:19PM +0100, Daniel Pocock wrote: Michael, do you have write access on the wiki? I think we need to get this distribution-specific stuff captured there along with the general notes I provided below.
Having these instructions added to the codebase, just like README.WIN is, could help too, especially considering there is a fair amount of confusion now with information (not all of it consistent with each other) between the multiple wikis and website.
I will definitely add my notes to either the github wiki or the codebase - but not both. I think the wiki is better because it is easier to format things (e.g. code blocks).
Done: https://github.com/ganglia/monitor-core/wiki/BuildingARelease Can someone add in the other steps that have been mentioned for adding the web/* stuff to the tarball? Are there any tests that anyone would like to add for checking a tarball before release? The github wiki is not so friendly with numbering and nested lists; notice how the last two items get indented when they should not be? I spent more time trying to figure that out than writing the steps, so maybe we should use Moin or some other wiki. I've found that much more friendly.
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
Actually, the tag will cover ganglia-web: the web submodule is basically a pointer to a particular version of ganglia-web, so when you tag monitor-core it contains a pointer to the right version of ganglia-web. I would not be in favor of putting it back in monitor-core. I think we should really keep the pieces separate. If you feel package-ganglia-release is inadequate feel free to change it ;-). Vladimir
On Sat, 10 Mar 2012, Daniel Pocock wrote: I'm a little nervous about this for a couple of reasons:
a) the tag doesn't cover the ganglia-web stuff
b) the tag is created before testing (which is not necessary when using git, you can tag after you test, because a tag is just a checksum of what you tested)
c) package-ganglia-release does things that can be done for us by autotools `make dist' logic - in fact, if `make dist' was used, the problem with the version numbers within configure.in would have been obvious
Given that everyone is 100% committed to the new web UI, can I propose that we do a subtree merge to bring monitor-core and web into the same repo? http://progit.org/book/ch6-7.html This will preserve all the history of the web2 branch. I just feel that Ganglia already diverges from autotools best practice in a number of ways (e.g. the nested configure for libmetrics, or the way --with-libsomething is used) and that if things can be simplified it will make the process less tedious and more effort can go into testing, etc. Regards, Daniel
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
I know what you are saying :-). What I was trying to say is that if you feel there is a better way and you are willing to make the changes, go ahead :-). I am not married to package-ganglia-release, so anything that helps us long term is a win. Vladimir
On Sat, 10 Mar 2012, Daniel Pocock wrote: `inadequate' is not what I am suggesting. Nor am I saying it is broken. Rather, I think it adds extra steps that are possibly avoidable. When I build a ganglia-modules-linux release, everything is done for me by autotools:
git checkout
autoreconf --install
./configure
make dist
I just have to make sure that every Makefile.am declares the files that belong in distributions (and occasionally I forget to do that and release something with a man page missing, etc). `make dist' reads the metadata and does the work. I actually believe that the main configure.in for Ganglia could be simplified down to little more than what I have in ganglia-modules-linux/configure.ac: http://sourceforge.net/p/gmod-linux/code/ci/2dbf3ee8223e538d6358bc73ebf7912a97796fd0/tree/configure.ac?force=True I'm not sure about whether to overhaul things like this for the 3.3.2 release, but this is a pattern I've repeated for a number of projects now and I really feel it saves me time (and potentially anyone else who wants to make a release). E.g. you can see I do it exactly the same way in dynalogin, and the configure file there is even more basic, in fact, it's as basic as it can possibly be: http://dynalogin.git.sourceforge.net/git/gitweb.cgi?p=dynalogin/dynalogin;a=blob_plain;f=configure.ac;hb=HEAD Regards, Daniel
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
Perhaps the best thing is to fork the repo on Github and submit a pull request. Thanks, Vladimir
On Thu, 8 Mar 2012, Michael Perzl wrote: If you do an update to 3.3.2, could you also please make sure that the following files exist:
ChangeLog
libmetrics/ChangeLog
libmetrics/INSTALL
In the 3.3.1 tar.gz file they don't exist, which prevents the autoreconf -fiv that I need to perform for all my additional Ganglia modules. Here is the code snippet from one of the SPEC files. This was not necessary with any previous version.
##
## PREP
##
%prep
%setup -q -n ganglia-%{version}
export PATH=/opt/freeware/bin:$PATH
# apply all necessary AIX patches
%patch0
%patch1
# apply the patch for the mod_ibmpower module
%patch2
# autoreconf seems to need this one
touch ChangeLog libmetrics/ChangeLog libmetrics/INSTALL
##
## BUILD
##
%build
export CC="xlc_r -U_AIX43"
export LDFLAGS="-L/opt/freeware/lib -Wl,-bmaxdata:0x8000 -Wl,-brtl"
autoreconf -fiv
./configure \
Thanks. Regards, Michael
On 03/08/2012 03:33 PM, Daniel Pocock wrote: I notice that configure.in was only updated to 3.3.1 after the package was put out on Sourceforge. This breaks the OpenCSW package build and may impact other people too. Can I propose a 3.3.2 release? I was going to add a release manager document on the wiki, but I don't have write access (can someone please help me with that). Here are the steps that I use with ganglia-modules-linux. I believe it is the same for Ganglia now that git is in use, but any further feedback would be helpful:
a) review the changes from the last release (git diff 3.3.1 3.3.2) - look for anything that might impact binary compatibility with existing 3rd party modules, etc
b) run git log (from the previous release) and note all the changes, add them to the changelog (where is it now? couldn't find it in git for monitor-core)
c) update monitor-core/configure.in, in particular:
   GANGLIA_MAJOR_VERSION=3
   GANGLIA_MINOR_VERSION=3
   GANGLIA_MICRO_VERSION=2
   and commit that change together with the change log:
   git add configure.in Changelog
   git commit -m 'Prepare v3.3.2 release'
   git push
d) clone the repo into a fresh directory, bootstrap, build a tarball:
   git clone git:///github/ganglia ganglia-dist
   cd ganglia-dist
   ./bootstrap
   ./configure
   make dist
e) test the tarball
f) if the tarball is good, tag the clone:
   git tag -s -m 'Tag v3.3.2' 3.3.2
   git push --tags
g) get a checksum of the tarball:
   sha256sum ganglia-3.3.2.tar.gz
h) upload the tarball to sourceforge
i) announce it on the mailing list, publish both the checksum and the commit number of the tag, sign the email with the same PGP key used to tag
j) update other web sites (e.g. ganglia.info)
I'm sure that other optional steps could be added (e.g. more tests to run on the tarball prior to distribution, building binary packages for Debian/RH,...) but the steps above are probably the essential ones.
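Two of the steps above lend themselves to a quick automated check: confirming that configure.in carries the version being released (the problem that triggered this thread) and computing the tarball checksum from step g. The Python sketch below is one way to do both, assuming the three GANGLIA_*_VERSION assignments each sit on their own line as shown in step c. The function names are hypothetical and are not part of any Ganglia script.

```python
import hashlib
import re


def read_configure_version(configure_text):
    """Build 'MAJOR.MINOR.MICRO' from the GANGLIA_*_VERSION assignments."""
    parts = []
    for key in ("MAJOR", "MINOR", "MICRO"):
        pattern = r"^\s*GANGLIA_%s_VERSION=(\d+)\s*$" % key
        match = re.search(pattern, configure_text, re.MULTILINE)
        if match is None:
            raise ValueError("GANGLIA_%s_VERSION not found" % key)
        parts.append(match.group(1))
    return ".".join(parts)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    release = "3.3.2"
    with open("configure.in") as handle:
        found = read_configure_version(handle.read())
    if found != release:
        raise SystemExit("configure.in says %s, expected %s" % (found, release))
    print(sha256_of_file("ganglia-%s.tar.gz" % release))
```

Running the version check before `make dist` would catch a stale configure.in before a tarball is published.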
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
On Thu, 8 Mar 2012, Daniel Pocock wrote:
On 08/03/12 16:21, Vladimir Vuksan wrote: Yes. I was thinking we need to release 3.3.2.
I don't mind helping out with it, but it would be good to document the procedure some more first. One thing I just noticed is that monitor-core/web is now empty, so I'm not sure how you bring together the gmond and web stuff to make a single distribution tarball - is it still possible with `make dist', is there another script for building a release, or is it a manual process for the moment?
Under the scripts/ directory there is a file called package-ganglia-release. I use that. We are using a git submodule which pulls in ganglia-web. That is why web/ is empty. Vladimir
Re: [Ganglia-developers] Ganglia 3.3.1 configure.in broken, 3.3.2 needed
Daniel, I just finished committing my changes for ganglia web 3.3.2, so if you want to tag monitor-core as 3.3.2 and package it up that would be great. Vladimir
On Thu, 8 Mar 2012, Daniel Pocock wrote: Michael, do you have write access on the wiki? I think we need to get this distribution-specific stuff captured there along with the general notes I provided below. I will do the same for the OpenCSW process, then if one of us gets hit by a bus, the releases can live on. Regards, Daniel
[Ganglia-developers] Adding trend lines to your graphs
In the upcoming 3.3.2 you will be able to add trend lines to your metric graphs. More here: http://ganglia.info/?p=497 Vladimir
Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web interface ??
Old code is in the repository. You can check it out any time you want. It is just no longer supported. There was no pressing reason; it was just that most people preferred the new interface. If you'd like to support the legacy web you are more than welcome to do so. Vladimir
On Mon, 5 Mar 2012, Martin Knoblauch wrote: - Original Message - That is simple. But IMHO we should keep the old code in the repository and maybe even build RPMs for it. What speaks against putting the code back as legacy-web? But frankly, I would have preferred that the new code was stored as gweb-2 and the old code kept as web. Or was there a real pressing reason for the reorganization?
Re: [Ganglia-developers] gmond unable to load plugin
Try invoking any of the modules in /usr/lib/ganglia/python_modules by hand. See if anything errors out. Vladimir
On Thu, 1 Mar 2012, Aswad Rangnekar wrote: Hi, I am new to ganglia and am trying to set up the gmond_python_modules for added support. These modules reside at /usr/lib/ganglia/python_modules/ The conf files reside at /etc/ganglia/conf.d/ I have added entries to include pyconf in /etc/ganglia/conf.d/modpython.conf On gmond start I get the error shown at this paste url: http://pastebin.com/4r4reVKG Using python gmond modules from: https://github.com/ganglia/monitor-core/tree/master/gmond/python_modules Help please! Thanks Regards, Aswad Rangnekar ext : 389
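Invoking a module by hand works because gmond Python modules are plain Python files that expose metric_init and metric_cleanup, usually with a small test harness at the bottom. The following is a minimal sketch of that shape rather than one of the shipped modules: the metric name, the handler and the random value source are made up for illustration.

```python
import random  # stands in for a real data source in this sketch


def example_handler(name):
    """Callback gmond would invoke to fetch the current value of the metric `name`."""
    return random.randint(0, 100)


def metric_init(params):
    """Return the list of metric descriptors this module provides."""
    descriptor = {
        "name": "example_random",          # hypothetical metric name
        "call_back": example_handler,
        "time_max": 90,
        "value_type": "uint",
        "units": "N",
        "slope": "both",
        "format": "%u",
        "description": "Example metric for testing by hand",
        "groups": "example",
    }
    return [descriptor]


def metric_cleanup():
    """Called by gmond on shutdown; nothing to release in this sketch."""
    pass


if __name__ == "__main__":
    # Running the file directly exercises the same entry points gmond uses.
    for d in metric_init({}):
        value = d["call_back"](d["name"])
        print("%s = %s" % (d["name"], d["format"] % value))
```

Running a module file directly with the Python interpreter surfaces import errors and exceptions that are otherwise hard to read in gmond's startup output.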
Re: [Ganglia-developers] Ganglia-3.3.1: How to get back the old web interface ??
If you'd like to rework the templates to reinstate the old behavior, i.e. call it legacy templates, that would be fine. Vladimir
On Wed, 29 Feb 2012, Martin Knoblauch wrote:
On Tue, Feb 28, 2012 at 2:41 AM, Martin Knoblauch kn...@knobisoft.de wrote: While I think it is an interesting development, I do not think it is ready for general consumption (more later). Big question: is there a way to configure back the old behaviour?
You can just download the old source, build the web RPM and install that instead of what comes with the new version. It should just work.
Hi Bernard, hey - while correct and obvious, this is not what I asked for :-) I just think it would be good to have a config option that brings back the interface to the old look/simplicity/speed.
Re: [Ganglia-developers] Non-public API functions used by multicpu and others
That will need to go into 3.3.2. I tagged 3.3.1 today. http://ganglia.info/?p=495 Vladimir
On Wed, 8 Feb 2012, Daniel Pocock wrote: In repackaging mod_multicpu as part of ganglia-modules-linux, I've noticed that it uses two non-public API functions and one #define:
char *skip_whitespace (const char *p);
char *skip_token (const char *p);
#define SYNAPSE_FAILURE -1
Coincidentally, the IO module uses these too. Should these go in the public API of Ganglia, in other words, in the include files that are deployed to /usr/include? Is there a compelling reason not to do that? I'm guessing they were just overlooked when the module architecture was introduced and they could be migrated to the public API in the next minor release (e.g. 3.3.1). As a workaround, to enable packaging of ganglia-modules-linux, I've just copied those items into a header file in the module project: https://sourceforge.net/p/gmod-linux/code/ci/f6af30aa37cc870d3169276d9f6743771baf1b90/tree/include/ganglia_mod_workaround.h
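For readers unfamiliar with the two helpers, here is a rough Python analogue of what functions with these names conventionally do when walking a whitespace-separated line such as those under /proc. The behavior is assumed from the names and prototypes above, not copied from the Ganglia source, and the Python versions return an index where the C versions return a pointer.

```python
def skip_whitespace(text, pos=0):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def skip_token(text, pos=0):
    """Skip leading whitespace, then one whitespace-delimited token; return the index after it."""
    pos = skip_whitespace(text, pos)
    while pos < len(text) and not text[pos].isspace():
        pos += 1
    return pos


if __name__ == "__main__":
    line = "cpu0  4705 150"
    start = skip_whitespace(line, skip_token(line))  # step over the "cpu0" label
    end = skip_token(line, start)
    print(line[start:end])  # the first numeric field
```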
[Ganglia-developers] Ganglia 3.3.0 released
This was gonna be the 4.0.0 release; however, we received feedback that making a major version bump may cause issues with various Linux distribution packaging policies, e.g. Fedora. Therefore it's been rebranded as 3.3.0. Announcement is here http://ganglia.info/?p=489 Enjoy, Vladimir
Re: [Ganglia-developers] Protocol Efficiency Ideas
I don't get it. JSON is a notation that has nothing to do with Linux. I think addition of JSON will be fantastic and look forward to including it. Vladimir On Fri, 27 Jan 2012, Im Root wrote: I forgot to add that by adding json, you will be restricting the types of Linux that this platform runs on. Although it may be a nice to have feature for a few developers who may want to customize things, this puts an additional dependency on the platform. You will be restricted to using only the flavors of linux that the keepers of json feel like using. Sorry, but adding the json dependency is a completely boneheaded move. From: Im Root imr...@rocketmail.com To: Dave Rawks d...@pandora.com; ganglia-developers@lists.sourceforge.net ganglia-developers@lists.sourceforge.net Sent: Friday, January 27, 2012 9:59 AM Subject: Re: [Ganglia-developers] Protocol Efficiency Ideas I believe that adding json would be a mistake. The reason is that when users install the main package there would be now a dependency on having json installed. It just adds to the complexity and helps to perpetuate RPM hell. I've had to deal with installing json in the past and it's been awful. It may be nice for a developer but not so nice for the end users. From: Dave Rawks d...@pandora.com To: ganglia-developers@lists.sourceforge.net Sent: Thursday, January 26, 2012 2:34 PM Subject: [Ganglia-developers] Protocol Efficiency Ideas Hey All, We've been talking about adding json in addition to xml for the tcp listen port exchange format. And I was curious if the EXTRA_DATA subtree to the XML ever contains something aside from EXTRA_ELEMENTS and if the EXTRA_ELEMENTS ever have attributes aside from NAME and VAL. 
Just doing some back of napkin calculations it looks like reducing this portion of the xml from:

    <METRIC NAME="swap_free" VAL="47872928" TYPE="float" UNITS="KB" TN="24" TMAX="180" DMAX="0" SLOPE="both">
    <EXTRA_DATA>
    <EXTRA_ELEMENT NAME="GROUP" VAL="memory"/>
    <EXTRA_ELEMENT NAME="DESC" VAL="Amount of available swap memory"/>
    <EXTRA_ELEMENT NAME="TITLE" VAL="Free Swap Space"/>
    </EXTRA_DATA>
    </METRIC>

to:

    <METRIC NAME="swap_free" VAL="47872928" TYPE="float" UNITS="KB" TN="24" TMAX="180" DMAX="0" SLOPE="both">
    <EXTRA_DATA GROUP="memory" DESC="Amount of available swap memory" TITLE="Free Swap Space"/>
    </METRIC>

would be quite a savings over the wire. Of course this would break compatibility with anything that currently exchanges xml with ganglia monitor. But... That gets me back to json... The current data structure from xml to json is something like this:

    {metric:{name:'swap_free',val:47872928,type:'float',units:'KB',tn:24,tmax:180,dmax:0,slope:'both',extra_data:{extra_element:[{name:'GROUP',val:'memory'},{name:'DESC',val:'Amount of available swap memory'},{name:'TITLE',val:'Free Swap Space'}]}}}

While the collapsed version ends up being this tiny json blob:

    {metric:{name:'swap_free',val:47872928,type:'float',units:'KB',tn:24,tmax:180,dmax:0,slope:'both',extra_data:{group:'memory',desc:'Amount of available swap memory',title:'Free Swap Space'}}}

On a relatively small cluster with a dozen metrics and a handful of hosts the savings are minor. However on a cluster of hundreds of hosts with perhaps dozens of metrics the savings would equate to MBs of data per tcp fetch. And the parse speed of the json /should/ be much faster as well. Any comments/questions/ideas? -Dave
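The size difference Dave describes can be checked with a quick measurement. The sketch below is only an illustration: it rebuilds the two structures from his message as Python dicts and compares their serialized sizes with the standard `json` module. The strict JSON quoting, the compact separators, the helper names (`collapse`, `wire_size`, `estimate_savings`) and the cluster size in the usage lines are assumptions made for the example, not figures or code from the thread.

```python
import json

# Nested form: mirrors the current XML layout (EXTRA_DATA -> EXTRA_ELEMENT list).
nested = {"metric": {
    "name": "swap_free", "val": 47872928, "type": "float", "units": "KB",
    "tn": 24, "tmax": 180, "dmax": 0, "slope": "both",
    "extra_data": {"extra_element": [
        {"name": "GROUP", "val": "memory"},
        {"name": "DESC", "val": "Amount of available swap memory"},
        {"name": "TITLE", "val": "Free Swap Space"},
    ]},
}}


def collapse(metric_doc):
    """Fold the EXTRA_ELEMENT name/val pairs into plain keys of extra_data."""
    metric = dict(metric_doc["metric"])  # shallow copy, input stays untouched
    elements = metric["extra_data"]["extra_element"]
    metric["extra_data"] = {e["name"].lower(): e["val"] for e in elements}
    return {"metric": metric}


def wire_size(doc):
    """Bytes needed for the document as compact UTF-8 JSON."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def estimate_savings(hosts, metrics_per_host):
    """Rough bytes saved per fetch if every metric shrank like this sample."""
    per_metric = wire_size(nested) - wire_size(collapse(nested))
    return hosts * metrics_per_host * per_metric


if __name__ == "__main__":
    print("nested bytes:   ", wire_size(nested))
    print("collapsed bytes:", wire_size(collapse(nested)))
    # Hypothetical cluster size, chosen only for illustration.
    print("saved per fetch:", estimate_savings(hosts=500, metrics_per_host=40))
```

The estimate scales one sample metric linearly, so it is a rough bound at best; real metrics carry different descriptions and titles, and any saving would need to be measured against actual gmond output.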
[Ganglia-developers] Ganglia 4.0.0 Testing
I have tagged and built a tarball for Ganglia 4.0.0 https://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/testing/ I have been compiling release notes here https://github.com/ganglia/monitor-core/wiki/Release-Notes I could use help with testing and documentation. Thank you, Vladimir
Re: [Ganglia-developers] Ganglia 4.0.0 Testing
Great. Can you clone the wiki, then submit a pull request with the changes? Feel free to include a link to your blog. Thanks, Vladimir https://github.com/ganglia/monitor-core/wiki/_access On Fri, 20 Jan 2012, Peter Phaal wrote: Vladimir, It's great to see the progress with Ganglia 4.0.0! One comment on the release notes: sFlow HTTP metrics are also supported (in addition to the Virtual machine, Java VM and Memcache metrics that are listed). FYI, for anyone interested in monitoring web server farms, sFlow agents currently exist for Apache, NGINX, Tomcat and node.js. Ganglia screen captures showing the metrics and instructions for configuring gmond are on the sFlow blog: http://blog.sflow.com/2011/12/using-ganglia-to-monitor-web-farms.html Cheers, Peter On Jan 20, 2012, at 9:19 AM, Vladimir Vuksan wrote: I have tagged and built a tarball for Ganglia 4.0.0 https://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/testing/ I have been compiling release notes here https://github.com/ganglia/monitor-core/wiki/Release-Notes I could use help with testing and documentation. Thank you, Vladimir
Re: [Ganglia-developers] Ganglia Web repo moved
Sounds good. Let's call it 4.0.0. Who is gonna be the packager :-) ? Vladimir On Tue, 17 Jan 2012, Daniel Pocock wrote: On 17/01/12 19:59, Im Root wrote: We should also bundle the new monitor core with a version of gmond that runs on Windows 2008 r2 as well. The ganglia web front end loses some of its monitoring functionality when it can't monitor basic operating systems. Besides, it only makes sense to keep ALL of the basic components up to date, which includes the collectors too. A nice web UI is useless without all of the data. I agree - if Mr Root is developing a fix for that issue and it will be ready this month, the release can be delayed for 1-2 weeks to let people test. Also, I think that instead of calling it Ganglia 3.2.1, it should be Ganglia 3.3.0, because the web interface is more than just a bug fix release. It could even be 4.0.0 perhaps?
Re: [Ganglia-developers] Ganglia Web repo moved
Apparently gmond on W2008 crashes systems. I do not have any W2008 systems so can't test it. I would just note in release notes that Windows 2008 is not supported. That said I am not in favor of holding off the release. If someone wants to contribute Windows patches we can release that as 4.0.1 or later. Vladimir On Tue, 17 Jan 2012, Daniel Pocock wrote: On 17/01/12 20:40, Im Root wrote: I would like to politely correct Dan in that I was proposing the gmond fix as opposed to providing it. I seriously think that all of the new releases should be suspended until this critical bug has been addressed. It does crash systems. Ok, so maybe we can kill two birds with one stone... put in some code to identify unsupported platforms (like Solaris zones or W2008) and give a polite notice rather than a crash? That way, a developer who tries to deploy in one of those environments might see the opportunity to contribute a fix.
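Daniel's suggestion of a polite notice on unsupported platforms could look roughly like the sketch below. It is written in Python only to illustrate the idea; gmond itself is C, and the deny-list entry, the `check_platform` helper and the message wording are all hypothetical rather than anything in the Ganglia source. The release string "2008Server" is an assumption about what Python's `platform.release()` reports on that system.

```python
import platform
import sys

# Hypothetical deny-list of (system, release prefix) pairs.
# An empty release prefix would match every release of that system.
UNSUPPORTED = [
    ("Windows", "2008Server"),
]


def check_platform(system, release, unsupported=UNSUPPORTED):
    """Return None if the platform is acceptable, else a polite notice."""
    for bad_system, bad_release in unsupported:
        if system == bad_system and release.startswith(bad_release):
            return ("%s %s is not a supported platform; "
                    "patches are welcome on ganglia-developers."
                    % (system, release))
    return None


def main():
    """Exit early with a notice instead of running on a known-bad platform."""
    notice = check_platform(platform.system(), platform.release())
    if notice is not None:
        sys.stderr.write(notice + "\n")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Keeping the check as a pure function of two strings makes it easy to exercise without access to the affected systems, which matters here since not every developer on the list has a W2008 machine to test on.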
Re: [Ganglia-developers] Ganglia Web repo moved
Point taken. Are you volunteering to fix it? I don't think anyone is disagreeing with you. Where is this data collection module supplied, btw? Thanks, Vladimir On Tue, 17 Jan 2012, Im Root wrote: This needs to be included with the ganglia package then. Otherwise ganglia will become a hodge podge of miscellaneous open source junk (like openstack). It's kind of like building a Mercedes with 3 wheels and including a spare that's a bicycle wheel. It's nice to look at but still not quite right. When you supply a data collection module that crashes systems, then it needs to be fixed. From: Nick Satterly nfsatte...@gmail.com To: Daniel Pocock dan...@pocock.com.au Cc: Im Root imr...@rocketmail.com; ganglia-developers@lists.sourceforge.net ganglia-developers@lists.sourceforge.net Sent: Tuesday, January 17, 2012 3:33 PM Subject: Re: [Ganglia-developers] Ganglia Web repo moved Fixing gmond should be bumped up to the highest priority. This is not critical because there is an entirely adequate work around. It's been mentioned before. Check out the host sflow agent. It works on every flavour of windows. --Nick. On 17 Jan 2012, at 20:26, Daniel Pocock dan...@pocock.com.au wrote: Fixing gmond should be bumped up to the highest priority.
[Ganglia-developers] Ganglia Web repo moved
I have moved the Ganglia Web repo I have been working on under the Ganglia GitHub account. https://github.com/ganglia/ganglia-web Speaking of Ganglia Web, we should release a new version of Ganglia with the new web frontend included. I propose the following:
1. Remove web-frontend from monitor-core.
2. Allow Ganglia Web to be versioned separately for those that want to get a more up-to-date web UI. We can call that gweb.
3. Bundle Ganglia such that it includes monitor-core and ganglia-web together. In the release notes, then specify that e.g. Ganglia 3.2.1 is monitor-core 3.2.1 and ganglia-web 2.2.1.
Does this sound reasonable? That said, does anyone want to step up and be the release manager for 3.2.1? Vladimir
Re: [Ganglia-developers] Even MORE interesting project - Fix gmond on Windows Server 2008 R2
Best fixes are provided by users scratching their own itches :-). This is speaking from my own experience. Perhaps you should take up the challenge? As far as Windows is concerned, you may be best off using host-sflowd. That is what I use: http://blog.sflow.com/2010/10/installing-host-sflow-on-windows-server.html Vladimir On Tue, 10 Jan 2012, Im Root wrote: Hey, this would be a great way to benefit lots of users. Apparently the gmond program goes into an indefinite CPU loop on this operating system. (Nobody really cares if you make gmond run on zero/MQ pub sub anyways.) In fact this would be of more benefit than writing a ganglia book or putting the code in GitHub. I challenge anyone out there with coding skills to fix this once and for all. (As a bonus, make it so you don't have to use Cygwin either.) This would be a huge help!
Re: [Ganglia-developers] Even MORE interesting project - Fix gmond on Windows Server 2008 R2
There is really no new infrastructure to deploy. All you need is gmond 3.2.0 on the receiving end and host-sflowd on Windows servers. As far as reaching out to experts, I haven't really seen much interest in fixing gmond for Windows. Vladimir On Tue, 10 Jan 2012, Im Root wrote: I'm not interested in deploying a completely new infrastructure. I already have an extensive ganglia setup and can see all my servers except for a growing number of Win2k8r2 servers. If we can get them to report to ganglia without locking up the cpu, then the problem is solved. I posed the problem here hoping to reach out to those who may be truly experts with these programs and who could get a fix done much faster than I. As for your rash issues, Vlad, I would suggest you consult a physician. :-) From: Vladimir Vuksan vli...@veus.hr To: ganglia-developers@lists.sourceforge.net ganglia-developers@lists.sourceforge.net Sent: Tuesday, January 10, 2012 4:27 PM Subject: Re: [Ganglia-developers] Even MORE interesting project - Fix gmond on Windows Server 2008 R2 Best fixes are provided by users scratching their own itches :-). This is speaking from my own experience. Perhaps you should take up the challenge? As far as Windows is concerned, you may be best off using host-sflowd. That is what I use: http://blog.sflow.com/2010/10/installing-host-sflow-on-windows-server.html Vladimir On Tue, 10 Jan 2012, Im Root wrote: Hey, this would be a great way to benefit lots of users. Apparently the gmond program goes into an indefinite CPU loop on this operating system. (Nobody really cares if you make gmond run on zero/MQ pub sub anyways.) In fact this would be of more benefit than writing a ganglia book or putting the code in GitHub. I challenge anyone out there with coding skills to fix this once and for all. (As a bonus, make it so you don't have to use Cygwin either.) This would be a huge help!
[Ganglia-developers] Interesting new project: Put Gmond info on a ZeroMQ pub/sub
Patrick Debois has kicked off an interesting set of projects to put metric information on a common bus. For example, he has implemented a Ruby-based daemon that parses Ganglia gmond packets and puts them on a ZeroMQ pub/sub bus. Once it's there you can subscribe with a client of your choice and do transforms to the data, e.g.
- feed graphite or another monitoring tool
- insert data into a SQL database
- feed Nagios using passive checks
Thanks to Patrick for a great idea and implementation. Now let's get to work on writing good subscribers. Gmond-zmq https://github.com/jedi4ever/gmond-zmq Statsd-zmq https://github.com/jedi4ever/statsd
[Ganglia-developers] Ganglia Web 2.2.0 released
Ganglia Web 2.2.0 has been released. Announcement is here http://ganglia.info/?p=479 Notable changes are described here http://ganglia.info/?p=464 Thanks to Peter Piela and Jeff Buchbinder for their vast amount of contributions to this release. Vladimir
Re: [Ganglia-developers] Ganglia web 2.* dependencies and packaging issues
I am coming to see it your way. I agree we should simply include the latest web in the Ganglia tree. Call it Ganglia 3.3.0 and release it. Ganglia Web 2.2.0 is nearly ready. I think it will be released either tomorrow or the day after. Jeff also packaged up 3.2.1, which contains fixes for broken gstat and an optionally configured graphite emitter. If you have time give it a shot from here https://github.com/ganglia/monitor-core/downloads Vladimir On Tue, 13 Dec 2011, Daniel Pocock wrote: Moving this discussion to ganglia-developers. The only feedback I received was that Ganglia Web 2 changes more frequently than the rest of Ganglia. However, I feel that is not a sufficient justification to keep Ganglia Web 2 separate from the main Ganglia bundle. On the other side of the argument, I feel that Ganglia Web 2 should be part of the main tarball for various reasons: 1) many users who just have a side-interest in monitoring will just use whatever comes as default 2) must make life easier for people who build the official packages (e.g. Debian, RPM, OpenCSW) - these people shouldn't spend time packaging, testing and supporting two web interfaces. Given that there is a significant lead time for releases to propagate into Linux distributions, and then it gets stuck on their distribution DVDs for 12-18 months, I believe the decision and a new release should come quickly and I don't mind assisting with it - does anyone object? On 29/11/11 10:24, Daniel Pocock wrote: On 29/11/11 10:04, Vladimir Vuksan wrote: Yes the UI should work with 3.0+ gmetad. When we first released the 2.0 interface some people expressed the desire to run the old interface alongside the new one due to certain integrations that are not supported. I suppose we could/should ditch the old interface at this point.
Ok, so maybe the versions could be merged like this: ganglia 3.2 + gweb 2.(current) = ganglia 3.3.0. This would make it much easier for packaging efforts - OpenCSW, Debian, the spec file for building RPMs - all of these platforms have linked the Ganglia web version with the overall version number. To be more specific, it may be easy to:

    svn copy branches/3.2/monitor-core branches/monitor-core-3.3
    svn delete branches/monitor-core-3.3/web
    svn copy monitor-web-2.0 branches/monitor-core-3.3/web
    svn copy branches/monitor-core-3.3 tags/3.3.0

Therefore, 3.3 will not include anything new from trunk; it will just be a clone of 3.2 but with the new web stuff. Anything new from trunk would therefore continue to rest in trunk until the 3.4 series comes along.
Re: [Ganglia-developers] Gauging interest in writing a Ganglia eBook
I would be interested in contributing. Vladimir On Wed, 7 Dec 2011, Matt Massie wrote: Are there any more volunteers? Later today, I'm going to submit the list of volunteers to O'Reilly and start getting the project off the ground. Going to be a lot of fun. -Matt On Fri, Dec 2, 2011 at 4:31 PM, Matt Massie m...@massie.us wrote: Brad- Can you open a new thread on the developer's list? I think there's going to be quite a bit of interest in a REST interface to Ganglia. It would be really useful to have. I know I've been tempted to write one myself. -Matt On Fri, Dec 2, 2011 at 4:21 PM, Vladimir Vuksan vli...@veus.hr wrote: I am sure lots of people would appreciate a REST interface to Ganglia. Myself and Jeff Buchbinder have been talking about how we could implement it, but if you already have it completed that would be an awesome addition ;-). Vladimir On 02.12.2011 10:45, Brad Nicholes wrote: Hey Matt, How are you? It's been a while. I know I haven't been the biggest contributor to the Ganglia project lately but I still monitor the mailing lists and this book sounds like a great idea. Count me in anywhere I can help. On a slightly different note: I have managed to carve out a little time over the past few weeks to get back into a little Ganglia development. Since we are gauging interest, would anybody be interested in a REST interface for Ganglia? I have worked up a POC that allows a user to query metrics from gmetad through REST as well as pull data and graphs directly from the RRD files. I still have to get permission from my employer before I can contribute the REST code to the Ganglia project, but before I go to that effort I just wanted to see if this is something that the Ganglia community would be interested in. Brad On 12/1/2011 at 12:31 PM, in message CABcEujsJET24+hhHyVqAQ48aj_4YjfZsimGz=vmw06mnu86...@mail.gmail.com, Matt Massie m...@massie.us wrote: There's an O'Reilly editor who's interested in publishing a ~50-page eBook on ganglia.
I have no doubt the ganglia community would benefit from a book covering topics like:
- Ganglia's components and overall architecture
- Typical deployment configurations including simple steps for verifying an installation (e.g. unicast/multicast, single cluster/multiple distributed clusters/datacenter)
- Navigating and using the new web interface
- Tips for extending ganglia's functionality (e.g. gmetric, modules)
- Common integration points (e.g. Hadoop metrics, Nagios)
- A simple step-by-step checklist for debugging common ganglia issues with pointers to our web site, mailing lists, irc channel, etc.
- Supported platforms and core metrics
- Scaling to clusters 1000 nodes
These are just ideas off the top of my head and not meant to be final or comprehensive, but meant to provide a list for discussion. Of course, let me know if there are topics the community would like to know more (or less) about. The purpose of the book is to serve as a first-read book for people new to ganglia. Keep in mind, for much of the book, we won't be starting from scratch. We already have a good amount of documentation that just needs to be organized and edited. I'll be happy to contribute time to make this eBook a reality; however, I want the book authors to be the leaders and experts in the ganglia community. I think it best we divide and conquer and write the book as a team. Who is interested in helping write the book?
Re: [Ganglia-developers] Gauging interest in writing a Ganglia eBook
I am sure lots of people would appreciate a REST interface to Ganglia. Myself and Jeff Buchbinder have been talking about how we could implement it, but if you already have it completed that would be an awesome addition ;-). Vladimir On 02.12.2011 10:45, Brad Nicholes wrote: Hey Matt, How are you? It's been a while. I know I haven't been the biggest contributor to the Ganglia project lately but I still monitor the mailing lists and this book sounds like a great idea. Count me in anywhere I can help. On a slightly different note: I have managed to carve out a little time over the past few weeks to get back into a little Ganglia development. Since we are gauging interest, would anybody be interested in a REST interface for Ganglia? I have worked up a POC that allows a user to query metrics from gmetad through REST as well as pull data and graphs directly from the RRD files. I still have to get permission from my employer before I can contribute the REST code to the Ganglia project, but before I go to that effort I just wanted to see if this is something that the Ganglia community would be interested in. Brad On 12/1/2011 at 12:31 PM, in message CABcEujsJET24+hhHyVqAQ48aj_4YjfZsimGz=vmw06mnu86...@mail.gmail.com, Matt Massie m...@massie.us wrote: There's an O'Reilly editor who's interested in publishing a ~50-page eBook on ganglia. I have no doubt the ganglia community would benefit from a book covering topics like:
- Ganglia's components and overall architecture
- Typical deployment configurations including simple steps for verifying an installation (e.g. unicast/multicast, single cluster/multiple distributed clusters/datacenter)
- Navigating and using the new web interface
- Tips for extending ganglia's functionality (e.g. gmetric, modules)
- Common integration points (e.g. Hadoop metrics, Nagios)
- A simple step-by-step checklist for debugging common ganglia issues with pointers to our web site, mailing lists, irc channel, etc.
- Supported platforms and core metrics
- Scaling to clusters 1000 nodes
These are just ideas off the top of my head and not meant to be final or comprehensive, but meant to provide a list for discussion. Of course, let me know if there are topics the community would like to know more (or less) about. The purpose of the book is to serve as a first-read book for people new to ganglia. Keep in mind, for much of the book, we won't be starting from scratch. We already have a good amount of documentation that just needs to be organized and edited. I'll be happy to contribute time to make this eBook a reality; however, I want the book authors to be the leaders and experts in the ganglia community. I think it best we divide and conquer and write the book as a team. Who is interested in helping write the book?
Re: [Ganglia-developers] Upcoming Ganglia Web features
I would like to wrap it up next week but it can wait. What's your timeline look like ? Vladimir On Sat, 26 Nov 2011, Alex Dean wrote: On Nov 24, 2011, at 9:40 PM, Vladimir Vuksan wrote: I just wrote up a blog post about upcoming Ganglia Web features http://ganglia.info/?p=464 If you have time and can help with writing documentation that would be greatly appreciated. Vladimir What's the timeline for this release? I have some changes for the auth system I've been working on, and if I can still get them in I'll try to get them ready. Main points: make configuration simpler, and allow per-view access (right now our ACL rules support only all/nothing access for all views). alex