Re: [Ganglia-general] Grid of grids
Hello,

I am having similar issues. A gmetad is running on another host, monitoring a cluster. Its grid name is GangliaGrid. I created a gmetad.conf on my laptop, with a tunnel to the remote host, forwarding its XML port (8652) to localhost:8666. Telnet to localhost 8666 shows that the remote gmetad answers fine to queries like /...

The local gmetad (on the laptop) uses in its conf:

    debug_level 1
    setuid off
    gridname LocalGrid
    data_source GangliaGrid localhost:8666
    trusted_hosts 127.0.0.1
    xml_port 18651
    interactive_port 18652
    rrd_rootdir rrds

Unfortunately I see only timeouts.

    Sources are ...
    Source: [GangliaGrid, step 15] has 1 sources
        127.0.0.1
    Data thread 140659923203840 is monitoring [GangliaGrid] data source
        127.0.0.1
    poll() timeout from source 0 for [GangliaGrid] data source after 0 bytes read

If I use the same ganglia grid name in both gmetads, gmetad dies immediately. Any idea what's going on?

Best regards,
Erich

On 23.06.2014 at 19:13, Rushton Martin wrote:
> I've just updated to ganglia 3.6.0 and am trying to get a grid of grids configuration working. I started with the original grid, which is visible through gweb. I built a new gmetad.conf for the new grid and that works fine when viewed from a new instance of gweb. The new grid uses port 8655 for XML and 8656 for queries. Using telnet to dump the XML also works fine and there are no obvious structural differences between it and the output from the original gmetad. However, when I add the line:
>
>     data_source Management localhost:8655
>
> to the master gmetad, not only do I see no output but I see repeated error messages in /var/log/messages:
>
>     Process XML (Management): XML_ParseBuffer() error at line 1: no element found
>
> I tried scanning the archives and saw similar (but not identical) errors reported from a few years ago. I gather there was a known bug in gmetad that caused inputs from other gmetads to be rejected. So:
>
> 1) Is the bug still there?
> 2) Is there a workaround?
> 3) Have I misconfigured it?
>
> Any advice gratefully received.

___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
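A side note on the error quoted in this message: "no element found" is the text the expat XML parser reports when it reaches the end of its input without having seen a root element. The minimal Python sketch below reproduces that message by handing an empty buffer to expat through the standard xml.parsers.expat module. It only illustrates what the message means; whether gmetad hit this exact condition (for example, zero bytes received from the data source) is an assumption not confirmed in this thread, and the helper name expat_error_for is made up for illustration.

```python
import xml.parsers.expat


def expat_error_for(payload: bytes):
    """Return expat's error text for payload, or None if it parses cleanly.

    Hypothetical helper, not part of Ganglia.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument (True) marks this as the final chunk of input.
        parser.Parse(payload, True)
    except xml.parsers.expat.ExpatError as exc:
        # Map the numeric error code back to expat's message string.
        return xml.parsers.expat.ErrorString(exc.code)
    return None


if __name__ == "__main__":
    # An empty buffer, as if nothing had been read from the socket.
    print(expat_error_for(b""))  # -> no element found
    # A well-formed (if empty) document parses without error.
    print(expat_error_for(b"<GANGLIA_XML></GANGLIA_XML>"))  # -> None
```

If this is what is happening, the thing to investigate would be why the reader gets no data at all, rather than the structure of the XML itself.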
Re: [Ganglia-general] Grid of grids
Errr, my mistake. I need to forward the XML port, not the interactive port. After forwarding 8651, all works fine!

Regards,
Erich

On 22.07.2014 at 15:44, Erich Focht wrote:
> Hello, I am having similar issues. [...] I created a gmetad.conf on my laptop, with a tunnel to the remote host, forwarding its XML port (8652) to the localhost:8666. [...] Unfortunately I see only timeouts. [...]
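The difference between the two ports can be checked without gmetad. The sketch below is a rough Python probe, assuming the behaviour described in this thread: the XML port sends its dump as soon as a client connects, while the interactive port stays silent until it receives a path query. The function name classify_port and its return labels are hypothetical, and the timeout is an arbitrary choice.

```python
import socket


def classify_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect, send nothing, and report how the server behaves.

    Hypothetical helper; the labels are illustrative only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                data = sock.recv(4096)
            except socket.timeout:
                # Nothing arrived unprompted: consistent with a port that
                # waits for a query (such as the interactive port).
                return "waits-for-query"
            if data:
                # Data arrived without a request: consistent with the XML port.
                return "dumps-xml"
            # The peer closed the connection without sending anything.
            return "closed-without-data"
    except OSError:
        # Connection refused, unreachable host, or connect timeout.
        return "unreachable"


if __name__ == "__main__":
    # Port 8666 is the local end of the tunnel from the message above.
    print(classify_port("localhost", 8666))
```

A data_source entry should point at a port that answers "dumps-xml"; a port that answers "waits-for-query" would be expected to produce the poll() timeout after 0 bytes seen earlier in the thread.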
Re: [Ganglia-general] Grid of Grids Broken Again in 3.6.0? Is this a different problem?
Hi Illydth,

You might have missed that the pull request that added the break back also added more logic to the endElement_GRID() function to fix double-writing of the last cluster. So yes, that break is meant to be there again. See https://github.com/ganglia/monitor-core/pull/73

However, what isn't clear is why there is a new grid-of-grids problem. I suspect that it relates to this pull request, but I haven't been able to confirm this yet. See https://github.com/ganglia/monitor-core/pull/92

Regards,
Nick

On Fri, Sep 20, 2013 at 7:41 PM, Douglas Wagner dougla...@gmail.com wrote:
> So the last time I tried this upgrade thing (3.1.7 -> 3.4.0) I was getting no grid of grids information. Ran across the fix with the help of others on the list and documented it here: http://sourceforge.net/apps/phpbb/ganglia/viewtopic.php?f=4&t=16&p=28
>
> So now I've upgraded from 3.4.0 to 3.6.0. I have 2 new clients (RHEL6) that I'm implementing. Went through the build process and built out RPMs for RHEL6. Turned on GMOND and I'm not seeing either of the two systems reporting into the associated GMETAD. The Web Interface isn't updating with the new boxes.
>
> As I start going back through some of my past issues, I ran back across this, where in 3.4.0 Grid of Grids was broken. And when I check the reported file and problem again I see the same old code (the break; at the end of the first switch block).
>
> Is this broken again in 3.6? Or is this the correct code and I should be looking somewhere else for why my new RHEL6 clients aren't reporting to my GMETAD system?
>
> --Illydth

--
gpg: using PGP trust model
pub 4096R/1EE38BD9 2013-01-06 [expires: 2018-01-06]
Key fingerprint = 3EE9 550D D9D8 DB65 58C2 B58D CE78 EC6C 1EE3 8BD9
uid Nicholas Satterly (Debian Key) nfsatte...@gmail.com
sub 4096R/23804EE9 2013-01-06 [expires: 2018-01-06]
Re: [Ganglia-general] grid of grids (ganglia-3.3.1)
Hi,

Apologies if I make any mistakes of etiquette in this email, but when might the grid of grids issue be fixed, given that there is a pull request for it? This is preventing a much-desired upgrade of our ganglia systems at $WORK, so I am keen to see this issue resolved.

Regards,
Michael.
Re: [Ganglia-general] grid of grids (ganglia-3.3.1)
Hi,

I've submitted a pull request to fix the gmetad problem mentioned below. See https://github.com/ganglia/monitor-core/pull/35

It's nothing fancy; it just reinstates what was there before a patch broke it.

Regards,
Nick

On Wed, Mar 28, 2012 at 4:42 PM, Arnau Bria listsar...@gmail.com wrote:
> On Wed, 28 Mar 2012 17:02:59 +0200, Alexander Karner wrote:
> > Hi Arnau!
>
> Hi Alexander!
>
> > Well, some weeks ago there was some discussion about the setup that you want to use.
>
> I promise that I googled for my problem and also searched the mailing list.
>
> > It seems that all Ganglia versions > 3.1.7 contain a bug that prevents gmetad from collecting data from other gmetads.
> > --> You should run your central gmetad system with version 3.1.7; your remote gmetads can be installed at any level that you prefer.
>
> Thanks, I've downgraded ganglia01's version and now it works! Many thanks!
>
> > You'll find a more detailed bug report in the bugzilla area.
> >
> > Kind regards
> > Alexander Karner
> > Program Manager System Check Kundentag
> > IBM Accredited Senior IT Specialist
> > Global Technology Services
>
> Cheers,
> Arnau
Re: [Ganglia-general] grid of grids (ganglia-3.3.1)
Hi Arnau!

Well, some weeks ago there was some discussion about the setup that you want to use. It seems that all Ganglia versions > 3.1.7 contain a bug that prevents gmetad from collecting data from other gmetads.

--> You should run your central gmetad system with version 3.1.7; your remote gmetads can be installed at any level that you prefer.

You'll find a more detailed bug report in the bugzilla area.

Kind regards
Alexander Karner
Program Manager System Check Kundentag
IBM Accredited Senior IT Specialist
Global Technology Services

Arnau Bria listsar...@gmail.com wrote on 28.03.2012 16:39:47:

From: Arnau Bria listsar...@gmail.com
To: ganglia-general@lists.sourceforge.net
Date: 28.03.2012 16:41
Subject: [Ganglia-general] grid of grids (ganglia-3.3.1)

Hi all,

I'm new to the list, so let me first say hello to everyone. My name is Arnau Bria and I'm quite a newbie when it comes to ganglia. We've been working with ganglia for a long time, but we have never studied all its options. We use multicast and one gmetad that collects from several clusters.

We are now planning a fresh install (ganglia 3.3.1 on SL61, built from tarball) and we want to introduce 2 big changes: unicast, and some grids (gmetads/gwebs) collected by one frontend.

This first mail is about configuring a grid of grids. I've not found any clear doc explaining how to configure it (I've found many about one gmetad reading from many clusters, but none about one gmetad collecting from other gmetads), so if someone knows one, feel free to send me the link and stop reading at this point :-)

If no one knows a link, let me explain my environment:

ganglia01 - Grid of Grids collector.
ganglia02 - it has its own grid (and cluster) and gmond is running here.

I'd like ganglia01 to collect info from ganglia02's gmetad and to publish it in its gweb.

My conf:

ganglia01:
gmetad.conf:

    # grep . /etc/ganglia/gmetad.conf|grep -v "#"
    data_source MyTest2 ganglia02.pic.es:8651
    gridname PICGrid
    authority http://ganglia01.domain/gweb/
    case_sensitive_hostnames 0

ganglia02:
gmetad.conf:

    # grep . /etc/ganglia/gmetad.conf|grep -v "#"
    data_source Ganglia2 localhost:8649
    gridname MyTest2
    authority http://ganglia01.pic.es/gweb/
    trusted_hosts 127.0.0.1 193.109.175.116 ganglia01.domain
    case_sensitive_hostnames 0

    # gmond info:
    [...]
    cluster {
      name = cluster1-ganglia2
      owner = unspecified
      latlong = unspecified
      url = unspecified
    }
    [...]

ganglia02 has its own gweb and if I go to http://ganglia02/gweb I can see:

    MyTest2 Grid - cluster1-ganglia2 - ganglia02 (which is the only node belonging to that cluster).

This is fine. But ganglia01 only shows:

    PICGrid Grid -

So it's not collecting any of ganglia02's gmetad info.

ganglia01 log says:

    poll() timeout from source 0 for [MyTest2] data source after 0 bytes read

and ganglia02:

    Got a malformed path request from 193.109.175.116

If I telnet from ganglia01 to ganglia02 at port 8651:

    @ganglia01 html]# telnet ganglia02.pic.es 8651
    Trying 193.109.175.117...
    Connected to ganglia02.pic.es.
    Escape character is '^]'.
    <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
    <!DOCTYPE GANGLIA_XML [
    <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
    [...]
    <GANGLIA_XML VERSION="3.3.0" SOURCE="gmetad">
    <GRID NAME="MyTest2" AUTHORITY="http://ganglia01.domain/gweb/" LOCALTIME="1332945021">
    <CLUSTER NAME="cluster1-ganglia2" LOCALTIME="1332945016" OWNER="unspecified" LATLONG="unspecified" URL="unspecified">

So it seems that ganglia02's gmetad is publishing its info correctly, but ganglia01 is not able to collect it.

Running gmetad in debug mode:

    # gmetad -d10
    Going to run as user nobody
    Sources are ...
    Source: [MyTest2, step 15] has 1 sources
        193.109.175.117    - ganglia02 IP
    xml listening on port 8651
    interactive xml listening on port 8652
    cleanup thread has been started
    Data thread 140307471931136 is monitoring [MyTest2] data source
        193.109.175.117
    [MyTest2] is a 2.5 or later data stream
    hash_create size = 50
    hash-size is 53
    Found a GRID, depth is now 1
    Found a /GRID, depth is now 0
    [MyTest2] is a 2.5 or later data stream
    hash_create size = 50
    [repeat forever]

I see no error in syslog, so could someone help me understand what's wrong with my conf?

TIA,
Arnau
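To check what a gmetad actually publishes, the dump seen over telnet can be parsed and summarised. The following is a minimal Python sketch using the standard xml.etree.ElementTree module. SAMPLE is a trimmed, hand-closed version of the output quoted in this message (a real dump also carries a DOCTYPE and host and metric elements), and grids_and_clusters is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the telnet output in this thread; the closing
# tags were added by hand so that the fragment is well-formed.
SAMPLE = """<GANGLIA_XML VERSION="3.3.0" SOURCE="gmetad">
<GRID NAME="MyTest2" AUTHORITY="http://ganglia01.domain/gweb/" LOCALTIME="1332945021">
<CLUSTER NAME="cluster1-ganglia2" LOCALTIME="1332945016" OWNER="unspecified" LATLONG="unspecified" URL="unspecified">
</CLUSTER>
</GRID>
</GANGLIA_XML>"""


def grids_and_clusters(xml_text: str) -> dict:
    """Map each top-level GRID name to the names of its CLUSTER children."""
    root = ET.fromstring(xml_text)
    return {
        grid.get("NAME"): [cluster.get("NAME") for cluster in grid.findall("CLUSTER")]
        for grid in root.findall("GRID")
    }


if __name__ == "__main__":
    print(grids_and_clusters(SAMPLE))  # -> {'MyTest2': ['cluster1-ganglia2']}
```

A summary like this confirms that the child gmetad wraps its clusters in a GRID element, which is the structure the collecting gmetad has to handle in a grid-of-grids setup.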
Re: [Ganglia-general] grid of grids (ganglia-3.3.1)
On Wed, 28 Mar 2012 17:02:59 +0200, Alexander Karner wrote:

> Hi Arnau!

Hi Alexander!

> Well, some weeks ago there was some discussion about the setup that you want to use.

I promise that I googled for my problem and also searched the mailing list.

> It seems that all Ganglia versions > 3.1.7 contain a bug that prevents gmetad from collecting data from other gmetads.
> --> You should run your central gmetad system with version 3.1.7; your remote gmetads can be installed at any level that you prefer.

Thanks, I've downgraded ganglia01's version and now it works! Many thanks!

> You'll find a more detailed bug report in the bugzilla area.
>
> Kind regards
> Alexander Karner
> Program Manager System Check Kundentag
> IBM Accredited Senior IT Specialist
> Global Technology Services

Cheers,
Arnau