Re: [Ganglia-general] gmond 3.1.2 becomes deaf in Solaris SPARC
Rick Cobb:
We had the same problem with gmond 3.0.4 on Solaris 10 / x86. As far as we were able to debug, it's a bug in Solaris itself, and particularly with the interaction between IGMPv3 support in the kernel and switches that only do IGMPv2. The only workarounds we were able to use were unicast, or restarting gmond quite often on the machines we had gmetad talking to.

If this were the case, would a visible symptom be that 'snoop' running on the gmond hosts would not see any traffic from other systems? In our case, we can see the gmond traffic; the running gmond just ignores it for some reason.

- river.

-- Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
___ Ganglia-general mailing list Ganglia-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-general
[Ganglia-general] Monitoring
Hi everyone,

OK, I got my Ganglia monitor up and working, and it was pulling results from the localhost. So I enabled hadoop-metrics.properties and made the appropriate changes so that it pointed at my Ganglia box. I made a data_source in the gmetad.conf file and attached the two test nodes to it. I restarted gmond, gmetad, and ganglia-web for good measure.

But I am not seeing any results, and I am not seeing my data source; it says unspecified.

Any ideas? Thanks in advance.

-John
Re: [Ganglia-general] Monitoring
OK. I just ran 'gstat --all', and only one host comes up: just the localhost. So there is something missing.

Any ideas?

-John

On Nov 17, 2009, at 9:22 AM, John Martyniak wrote:
Hi everyone, OK, I got my Ganglia monitor up and working, and it was pulling results from the localhost. So I enabled hadoop-metrics.properties and made the appropriate changes so that it pointed at my Ganglia box. I made a data_source in the gmetad.conf file and attached the two test nodes to it. I restarted gmond, gmetad, and ganglia-web for good measure. But I am not seeing any results, and I am not seeing my data source; it says unspecified. Any ideas? Thanks in advance. -John
[Ganglia-general] udp_recv_channel
So the udp_recv_channel in the gmond.conf file is as follows:

    udp_recv_channel {
      mcast_join = 239.2.11.71
      bind = 239.2.11.71
      port = 8649
    }

If I change that to the IP address of the monitoring master machine, I get an error that it can't join the cast or something like that.

Any ideas?

-John
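A note on the error above: mcast_join expects a multicast group address (the 224.0.0.0/4 range), so putting a unicast address such as the monitor's IP there would plausibly produce exactly this kind of "cannot join" failure. The following is a minimal Python sketch, not part of Ganglia, for checking whether an address is a multicast group and whether packets for the group quoted in the thread actually reach a host. The function names are hypothetical, and binding alongside a running gmond may behave differently depending on the operating system.

```python
import ipaddress
import socket
import struct

GROUP = "239.2.11.71"  # mcast_join value quoted in the thread
PORT = 8649            # port value quoted in the thread


def is_multicast(address: str) -> bool:
    """Return True if the address lies in the IPv4/IPv6 multicast range."""
    return ipaddress.ip_address(address).is_multicast


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address followed by local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_listener(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Bind a UDP socket to the port and join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


if __name__ == "__main__":
    listener = open_listener()
    listener.settimeout(30)
    try:
        data, sender = listener.recvfrom(65535)
        print(f"received {len(data)} bytes from {sender[0]}")
    except socket.timeout:
        print("no multicast packet seen within 30 seconds")
    finally:
        listener.close()
```

If the listener times out on a host where other gmond instances are known to be sending, that would point at the network (switch or router) rather than at gmond itself.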
Re: [Ganglia-general] udp_recv_channel
It should pretty much work out of the box, John. Does your network not allow multicasting?

Chris Johnson
Systems Administrator, NMR Center, Mass. General Hospital
149 (2301) 13th Street, Charlestown, MA., 02129 USA
Internet: john...@nmr.mgh.harvard.edu
Web: http://www.nmr.mgh.harvard.edu/~johnson
Voice: 617.726.0949
FAX: 617.726.7422
God must love stupid people. She keeps making them in such horrifyingly large numbers. Me

On Tue, 17 Nov 2009, John Martyniak wrote:
So the udp_recv_channel in the gmond.conf file is as follows:

    udp_recv_channel {
      mcast_join = 239.2.11.71
      bind = 239.2.11.71
      port = 8649
    }

If I change that to the IP address of the monitoring master machine, I get an error that it can't join the cast or something like that. Any ideas? -John
Re: [Ganglia-general] udp_recv_channel
It should. I don't restrict anything, and I have the firewalls turned off on those two machines.

It is on a private network that uses NAT through my router to reach the outside world. But that shouldn't matter, because all of the machines can get out to the internet.

-John

On Nov 17, 2009, at 10:17 AM, Chris Johnson wrote:
It should pretty much work out of the box, John. Does your network not allow multicasting?
Re: [Ganglia-general] udp_recv_channel
Are the monitored nodes on the same side of the router as the monitoring node? If not, you might have to explicitly turn on multicasting in the router. It depends on the router.

Chris Johnson

On Tue, 17 Nov 2009, John Martyniak wrote:
It should. I don't restrict anything, and I have the firewalls turned off on those two machines. It is on a private network that uses NAT through my router to reach the outside world. But that shouldn't matter, because all of the machines can get out to the internet. -John
Re: [Ganglia-general] udp_recv_channel
Yes, they are all in the same subnet, all attached to the same switch.

The monitor is 10.1.1.25; the two devices are 10.1.1.128 and 10.1.1.129.

I tried the telnet test also: from each of the machines that are monitored, I ran telnet 10.1.1.25 8649 and received the XML file.

-John

On Nov 17, 2009, at 10:41 AM, Chris Johnson wrote:
Are the monitored nodes on the same side as the monitoring node? If not, you might have to explicitly turn on multicasting in the router. Depends on the router.
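The telnet test described above shows that gmond on the monitor answers on TCP port 8649, but not which hosts it actually knows about. As a rough sketch (the function names are hypothetical, and it assumes the usual layout of the XML dump, with HOST elements carrying a NAME attribute inside a CLUSTER element), the same check can be scripted in Python so that the list of reported hosts is easy to read:

```python
import socket
import xml.etree.ElementTree as ET
from typing import List


def read_xml_dump(host: str, port: int = 8649, timeout: float = 5.0) -> bytes:
    """Read everything the daemon writes before it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def host_names(xml_bytes: bytes) -> List[str]:
    """Return the NAME attribute of every HOST element in the dump."""
    root = ET.fromstring(xml_bytes)
    return [host.get("NAME") for host in root.iter("HOST")]


if __name__ == "__main__":
    # 10.1.1.25 is the monitor address given in the thread.
    for name in host_names(read_xml_dump("10.1.1.25")):
        print(name)
```

If only the monitor itself shows up in the list, the other nodes' metrics are not reaching that gmond, which would match the single host seen in the 'gstat --all' output.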
Re: [Ganglia-general] udp_recv_channel
Beginner question: how do I run gmetad in -d mode? I have been using /etc/rc.d/init.d/gmetad start|stop|restart.

-John

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:
And they are all configured with the same grid name? Another thing to try is to run gmetad in -d mode and see if it's receiving anything.
Re: [Ganglia-general] Short one.
Hi Chris:

On Mon, Nov 16, 2009 at 11:37 AM, Chris Johnson john...@nmr.mgh.harvard.edu wrote:
So I installed php-gd. Still just says Pie Chart though. Anything I should do? Any logs to look at?

Have you tried re-starting apache? ;-)

Cheers,
Bernard
Re: [Ganglia-general] Ganglia install instructions wiki link broken
Hi Brad:

On Mon, Nov 16, 2009 at 1:06 PM, Brad Nicholes bnicho...@novell.com wrote:
I think I have all of the wiki page links fixed up. Especially on the installation and configuration page. I also fixed up some links to the misc. documents about Ganglia and monitoring. If anyone discovers any other broken links on the wiki, please let me know.

Thanks a lot for doing this -- appreciate it!

Cheers,
Bernard
Re: [Ganglia-general] udp_recv_channel
How do I set the grid name?

These are Hadoop machines, so I used the following configuration parameters in my hadoop-metrics.properties files:

    dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    dfs.period=10
    dfs.serve...@ganglia@:8649
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    mapred.period=10
    mapred.serve...@ganglia@:8649
    jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    jvm.period=10
    jvm.serve...@ganglia@:8649
    rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    rpc.period=10
    rpc.serve...@ganglia@:8649

And I replaced @GANGLIA@ with 10.1.1.25.

I didn't install any Ganglia software on the Hadoop machines themselves. I just ran Hadoop as I normally do, with the configuration above.

-John

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:
And they are all configured with the same grid name? Another thing to try is to run gmetad in -d mode and see if it's receiving anything.
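The @GANGLIA@ substitution described in the message above is easy to get wrong by hand across several files. Here is a small Python sketch of it; the helper name is hypothetical, and the key name `dfs.servers` in the usage example is an assumption, since the archive obfuscated that part of each line.

```python
PLACEHOLDER = "@GANGLIA@"  # placeholder named in the message above


def fill_placeholder(template: str, collector_host: str) -> str:
    """Replace every @GANGLIA@ placeholder with the collector's address."""
    rendered = template.replace(PLACEHOLDER, collector_host)
    if PLACEHOLDER in rendered:
        # Only reachable if the replacement text itself contains the placeholder.
        raise ValueError("placeholder still present after substitution")
    return rendered


if __name__ == "__main__":
    # Assumed key name; the thread shows it only in obfuscated form.
    template = (
        "dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext\n"
        "dfs.period=10\n"
        "dfs.servers=@GANGLIA@:8649\n"
    )
    print(fill_placeholder(template, "10.1.1.25"), end="")
```

With this setup the Hadoop nodes send their metrics to port 8649 on 10.1.1.25, so whatever listens there has to accept them; no gmond runs on the Hadoop nodes themselves.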
[Ganglia-general] Ganglia cannot find a data source.
I too have been banging my head on this for a few weeks. After much googling I cannot seem to find the answer, so I hope someone (a developer, maybe) can help.

I was successfully using Ganglia 2.5 and 3.0.x. At some point I upgraded to 3.1.x and things went sour. I've even tried to revert to a known working condition, to no avail.

So here's my current setup. gmetad 3.1.4 running under SUSE 11.1 ppc, using a basic gmetad.conf file monitoring itself (localhost) for troubleshooting purposes:

---snip from /etc/gmetad.conf ---
    data_source "my cluster" localhost gpipnim01
    data_source "sap_app" gpiptcpeap02
---snip-

XML on localhost seems fine. I can telnet to localhost 8649 and get proper results. FWIW: <GANGLIA_XML VERSION="3.1.4" SOURCE="gmond">

RRDs are updating properly in /var/lib/ganglia/rrds/.

gmond (on localhost) in debug mode is sending updates (obviously, since RRDs are being created). gmond -m shows modules are loaded.

Web frontend: when I hit the webpage I get "Ganglia cannot find a data source. Is gmond running?" No webpage or data is shown. Currently the web version is ganglia-web-3.1.0-1, but I've tried 3.1.4 and older with the same results.

The debug output from gmetad shows:

    server_thread() received request /?filter=summary from 127.0.0.1
    Found subtree / and filter=summary

It seems that occasionally I can get the webpage to display briefly after initial startup of gmetad and gmond.

PHP memory is set to 1024 in php.ini.

--snip from conf.php--
    # Gmetad-webfrontend version. Used to check for updates.
    #
    include_once "./version.php";
    #
    # The name of the directory in ./templates which contains the
    # templates that you want to use. Templates are like a skin for the
    # site that can alter its look and feel.
    #
    $template_name = "default";
    #
    # If you installed gmetad in a directory other than the default
    # make sure you change it here.
    #
    # Where gmetad stores the rrd archives.
    $gmetad_root = "/var/lib/ganglia";
    $rrds = "$gmetad_root/rrds";
    # Leave this alone if rrdtool is installed in $gmetad_root,
    # otherwise, change it if it is installed elsewhere (like /usr/bin)
    define("RRDTOOL", "/usr/bin/rrdtool");
    # Location for modular-graph files.
    $graphdir='./graph.d';
    #
    # If you want to grab data from a different ganglia source specify it here.
    # Although, it would be strange to alter the IP since the Round-Robin
    # databases need to be local to be read.
    #
    $ganglia_ip = "127.0.0.1";
    $ganglia_port = 8652;
    # Old-style names.
    $gmetad_ip = $ganglia_ip;
    $gmetad_port = $ganglia_port;
--snip-

I'd be happy to post additional config info if it helps or is requested. I was trying not to post TMI.

Thanks,
Ryan
Re: [Ganglia-general] Ganglia cannot find a data source.
On 11/17/2009 at 10:04 AM, Ryan Robertson 89esp...@gmail.com wrote:
[...] XML on localhost seems fine. I can telnet to localhost 8649 and get proper results. [...] Web frontend: when I hit the webpage I get "Ganglia cannot find a data source. Is gmond running?"

When you telnet to 8652, what do you get? Localhost 8649 is the output from gmond on localhost. Localhost 8652 is the interactive port of gmetad, which is the port that the web frontend uses to get the metric data.

Brad
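The distinction between the two ports can be checked without the web frontend. The sketch below sends the same request string that appears in Ryan's gmetad debug output (/?filter=summary) to the interactive port and returns whatever comes back. It is only an illustration with hypothetical function names, and it assumes gmetad closes the connection once the reply is complete.

```python
import socket


def build_request(filter_name: str = "summary", path: str = "/") -> bytes:
    """Build a one-line query for gmetad's interactive port."""
    return f"{path}?filter={filter_name}\n".encode("ascii")


def query_gmetad(host: str = "127.0.0.1", port: int = 8652,
                 filter_name: str = "summary", timeout: float = 5.0) -> bytes:
    """Send the query and collect the reply until the peer closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(filter_name))
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    reply = query_gmetad()
    print(f"received {len(reply)} bytes")
    print(reply[:200].decode("latin-1"))
```

An empty reply or a refused connection here would suggest the problem lies with gmetad rather than with the PHP frontend.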
Re: [Ganglia-general] udp_recv_channel
When I run it with gmetad --debug=5, I get the following:

    [r...@monitor ~]# gmetad --debug=5
    Going to run as user nobody
    Sources are ...
    Source: [Weive cluster, step 15] has 2 sources
      10.1.1.129
      10.1.1.130
    xml listening on port 8651
    interactive xml listening on port 8652
    cleanup thread has been started
    Data thread 3023907728 is monitoring [Weive cluster] data source
      10.1.1.129
      10.1.1.130
    data_thread() got no answer from any [Weive cluster] datasource

If I log into one of the boxes and try to telnet to those ports (10.1.1.25:8651), I get connection refused. I checked to make sure that the firewalls are turned off.

-john

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:
And they are all configured with the same grid name? Another thing to try is to run gmetad in -d mode and see if it's receiving anything.
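The "got no answer from any datasource" line means gmetad could not fetch XML from either listed address. gmetad polls each data_source host over TCP (port 8649 unless another port is given), so a quick way to narrow this down is to test which host and port combinations accept a connection at all. The following Python sketch does only that; the function names are hypothetical and the addresses are the ones quoted in the thread.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(hosts: Iterable[str], ports: Iterable[int]) -> Dict[Tuple[str, int], bool]:
    """Try every host/port pair and record whether it accepted a connection."""
    ports = list(ports)
    return {(h, p): can_connect(h, p) for h in hosts for p in ports}


if __name__ == "__main__":
    # Data sources from the gmetad debug output, plus the monitor itself.
    results = probe(["10.1.1.129", "10.1.1.130", "10.1.1.25"], [8649, 8651, 8652])
    for (host, port), ok in sorted(results.items()):
        print(f"{host}:{port} {'open' if ok else 'no answer'}")
```

Given the earlier message saying that no Ganglia software was installed on the Hadoop machines, one would expect port 8649 on those hosts to show no answer, which would be consistent with the debug output above.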
Re: [Ganglia-general] Ganglia 10th year anniversary get-together
Dear all:

Just a quick update -- I've talked to Matt and a few others and it looks like late January would actually work best for everybody. So right now let's set the date tentatively to the weekend of Jan 18, 2010.

Since I'm still gauging interest, for those of you who haven't responded yet, I'd really appreciate even a quick "sure, i'm interested" :-)

Thanks a lot!

Bernard

On Wed, Nov 11, 2009 at 2:13 PM, Bernard Li bern...@vanhpc.org wrote:
Dear friends of the Ganglia project:

Can you believe that roughly 10 years ago, Matt Massie and others started the Ganglia project at UC Berkeley? Since then we have made 40+ releases, and our project files hosted at SourceForge.net have been downloaded over 299,208 times. Our user base has grown substantially over the years and it is still the de facto standard for monitoring HPC clusters, grids, and even cloud servers.

To commemorate our achievements over the years, I would like to organize a dinner party in San Francisco, USA. Currently the dates being floated around are early December to late January, or later in 2010. If you are interested in attending (or sponsoring) this event, please either reply back to this thread and/or to me privately.

Thanks again for all your support over the years, and let's work together towards a better Ganglia project for the next 10 years to come!

Best Regards,
Bernard - on behalf of the Ganglia Team
Re: [Ganglia-general] Short one.
DOH! Thanks.

Chris Johnson

On Tue, 17 Nov 2009, Bernard Li wrote:
Hi Chris:

On Mon, Nov 16, 2009 at 11:37 AM, Chris Johnson john...@nmr.mgh.harvard.edu wrote:
So I installed php-gd. Still just says Pie Chart though. Anything I should do? Any logs to look at?

Have you tried re-starting apache? ;-)

Cheers,
Bernard
Re: [Ganglia-general] Ganglia cannot find a data source.
Ahh yes, I knew there was one other telnet snippet question. I am able to telnet to localhost 8652 and feed it /?filter=summary, and I get output. The output scrolled off the screen, but you get the idea that it's returning...

--snip-
    </METRICS>
    <METRICS NAME="swap_total" SUM="2019320" NUM="1" TYPE="double" UNITS="KB" SLOPE="zero" SOURCE="gmond">
    <EXTRA_DATA>
    <EXTRA_ELEMENT NAME="GROUP" VAL="memory"/>
    <EXTRA_ELEMENT NAME="DESC" VAL="Total amount of swap space displayed in KBs"/>
    <EXTRA_ELEMENT NAME="TITLE" VAL="Swap Space Total"/>
    </EXTRA_DATA>
    </METRICS>
    <METRICS NAME="part_max_used" SUM="40.2" NUM="1" TYPE="double" UNITS="%" SLOPE="both" SOURCE="gmond">
    <EXTRA_DATA>
    <EXTRA_ELEMENT NAME="GROUP" VAL="disk"/>
    <EXTRA_ELEMENT NAME="DESC" VAL="Maximum percent used for all partitions"/>
    <EXTRA_ELEMENT NAME="TITLE" VAL="Maximum Disk Space Used"/>
    </EXTRA_DATA>
    </METRICS>
    </CLUSTER>
    </GRID>
    </GANGLIA_XML>
--snip-
Re: [Ganglia-general] Ganglia cannot find a data source.
Sounds to me like it could be a file permissions problem then. Is your apache server able to access the rrd files and/or port 8652? On 11/17/2009 at 1:00 PM, in message 0016e64c2536e598710478969...@google.com, 89esp...@gmail.com wrote: Ahh yes, i knew there was one other telnet snippet question. I am able to telnet to localhost 8652 and feed it /?filter=summary I get output. The output scrolled off the screen, but you get the idea that it's returning... --snip--
Re: [Ganglia-general] udp_recv_channel
I'm not sure if this is related to your issue, but it seems possibly related... Last summer, with Ganglia 3.1.0, I found that bind either does not work in a multicast _recv_channel, only in _send_channel ... or the other way 'round. I forget which it was, but it certainly did not work in one of those, on any of my gmond.conf's. When using multicast, I could only use bind in one sort of channel (either recv but not send, or send but not recv), and if I used it on the other kind of channel, it was silently not used. Unfortunately I'm at a different employer now so I can't check those configs. -- Cos
[Ganglia-general] Conditional statements in Ganglia Web templates
I was wondering if it is possible, and if so how, to add conditional statements in Ganglia Web templates. I have some custom consolidated reports like the ones from here: http://vuksan.com/linux/ganglia/#Apache_Traffic_Stats Currently I have modified the template to include the report in host view, which is not ideal since some nodes will not have Apache metrics, e.g. DB servers etc. Is it therefore possible to add a conditional statement that would check for the existence of a file and, based on that, source a particular image? Thanks, Vladimir
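The check being asked for can be sketched as follows. This is only an illustration of the logic in Python, not Ganglia Web template code (the web frontend is PHP). It assumes metrics are stored as rrd_root/cluster/host/metric.rrd, and the report and metric names in the usage note are hypothetical.

```python
import os


def host_has_metric(rrd_root, cluster, host, metric):
    """True if an RRD file for this metric exists for the host.

    Assumption: RRDs live at <rrd_root>/<cluster>/<host>/<metric>.rrd.
    """
    return os.path.isfile(os.path.join(rrd_root, cluster, host, metric + ".rrd"))


def reports_for_host(rrd_root, cluster, host, optional_reports):
    """Return the names of optional reports whose required metric exists.

    optional_reports maps a report name to the metric that must be present
    before the report image is worth including in the host view.
    """
    return [
        report
        for report, metric in optional_reports.items()
        if host_has_metric(rrd_root, cluster, host, metric)
    ]
```

For example, reports_for_host("/var/lib/ganglia/rrds", "unspecified", "web01", {"apache_report": "apache_200"}) would return ["apache_report"] only on hosts that actually report that metric, so a DB server would get an empty list and no broken image.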
[Ganglia-general] Multicast IP Address
So does the multicast IP address need to be a real IP address on my network? It is currently set to 239.2.11.71, which isn't a real IP address on my network. Does it need to be? I tried changing the hadoop-metrics.properties to that value and it did not have any results. gmetad --debug=5 still could not connect. Any ideas would be helpful. Thank you, -john
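On the address question itself: 239.2.11.71 falls in the IPv4 multicast range (224.0.0.0/4), and within it in the administratively scoped block 239.0.0.0/8. A multicast address names a group that hosts join, so it is not expected to be assigned to any machine on the network. A small Python sketch using the standard ipaddress module shows the classification. The helper name describe is made up for this example.

```python
import ipaddress


def describe(addr):
    """Classify an IPv4/IPv6 address string as a multicast group or a host address."""
    ip = ipaddress.ip_address(addr)
    if ip.is_multicast:
        return "multicast group address"
    return "unicast host address"


# The address from the message above.
GROUP = ipaddress.ip_address("239.2.11.71")
# Administratively scoped IPv4 multicast block.
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")
IN_ADMIN_SCOPE = GROUP in ADMIN_SCOPED
```

This only answers whether the address has to exist as a host. It does not say which address hadoop-metrics.properties should point at, which depends on whether the gmond on the Ganglia box is listening on multicast or unicast.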
Re: [Ganglia-general] gmond 3.1.2 becomes deaf in Solaris SPARC
Yes. We would see the traffic on other machines, but we would not see multicast traffic coming into the machine we were using to aggregate metrics. Restarting gmond would get the traffic flowing back in. -- ReC On Nov 17, 2009, at 4:17 AM, River Tarnell wrote: Rick Cobb: We had the same problem with gmond 3.0.4 on Solaris 10 / x86. As far as we were able to debug, it's a bug in Solaris itself, and particularly with the interaction between IGMPv3 support in the kernel and switches that only do IGMPv2. The only workarounds we were able to use were unicast, or restarting gmond quite often on the machines we had gmetad talking to. if this were the case, would a visible symptom be that 'snoop' running on the gmond hosts would not see any traffic from other systems? in our case, we can see the gmond traffic, the running gmond just ignores it for some reason. - river.
Re: [Ganglia-general] Ganglia cannot find a data source.
rrd dir and subdirs are owned by nobody.
ls -ld /var/lib/ganglia/rrds
drwxr-xr-x 7 nobody nobody 4096 May 28 2008 /var/lib/ganglia/rrds
ls -l /var/lib/ganglia/rrds
drwxr-xr-x 7 nobody root 4096 Sep 28 15:36 595
drwxr-xr-x 2 nobody root 4096 Sep 23 10:04 __SummaryInfo__
drwxr-xr-x 33 nobody root 4096 Sep 23 09:44 gpi
drwxr-xr-x 12 nobody root 4096 Aug 20 2008 sap_app
drwxr-xr-x 5 nobody root 4096 Nov 3 10:13 unspecified
Apache is running under user wwwrun:
wwwrun 9649 23799 0 11:48 ? 00:00:00 /usr/sbin/httpd2 -f /etc/apache2/httpd.conf -k start
I don't see any errors in the Apache logs. Chowning /var/lib/ganglia/rrds to wwwrun didn't yield any changes. I would think that even if apache didn't have access to the rrd files it would still show other html from the web frontend. How does apache access port 8652? Is there another way to test that? -Ryan On Nov 17, 2009 2:17pm, Brad Nicholes bnicho...@novell.com wrote: Sounds to me like it could be a file permissions problems then. Is your apache server able to access the rrd files and/or port 8652?
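A note on the listing above: drwxr-xr-x means "other" users, which would include wwwrun, can already list and traverse those directories, so ownership alone is unlikely to be the blocker at that level. Files and deeper subdirectories could still differ. Below is a minimal Python sketch that walks a tree and reports anything lacking the "other" permission bits. It is a conservative check: a path it reports may still be readable through group membership. The function names are made up for this example.

```python
import os
import stat


def dir_open_to_others(path):
    """True if the directory grants read and traverse (r-x) to 'other' users."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)


def file_open_to_others(path):
    """True if the file grants read permission to 'other' users."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def unreadable_entries(root):
    """Yield paths under root that 'other' users cannot read or traverse.

    Only the permission bits are inspected; the walk itself runs as the
    current user, so run it as a user that can see the whole tree.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames):
            path = os.path.join(dirpath, name)
            if not dir_open_to_others(path):
                yield path
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not file_open_to_others(path):
                yield path
```

Running list(unreadable_entries("/var/lib/ganglia/rrds")) and getting an empty list would suggest the rrd files are readable by the Apache user, which leaves the connection to port 8652 as the thing to test next.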
[Ganglia-general] Nutch 0.19.2 and Ganglia 3.1.3
Has anybody else had any trouble running nutch 0.19.2 with Ganglia 3.1.3? I was surfing through Jira and it seems that there were some issues but they have been resolved. Any thoughts would be helpful. Thank you, -John John Martyniak President/CEO Before Dawn Solutions, Inc. 9457 S. University Blvd #266 Highlands Ranch, CO 80126 o: 877-499-1562 c: 303-522-1756 e: j...@beforedawnsoutions.com w: http://www.beforedawnsolutions.com
Re: [Ganglia-general] Monitoring
try this command #gstat --all -i a_hostname_in_cluster Chifeng On Tue, Nov 17, 2009 at 11:02 PM, John Martyniak j...@beforedawnsolutions.com wrote: Ok. I just ran a 'gstat --all' And only one host comes up, just the localhost. So there is something missing. any ideas? -John On Nov 17, 2009, at 9:22 AM, John Martyniak wrote: Hi everyone, Ok I got my Ganglia monitor up and working, and it was pulling results from the localhost. So I enable the hadoop-metrics.properties and made the appropriate changes so that it pointed at me ganglia box. I made a data_source in the gmetad.conf file, and attached the two test nodes to it. I restart gmond, gemtad and the ganglia-web for good measure. But I am not seeing any results, and I am not seeing my data source, it says unspecified. Any ideas? Thanks in advance. -John -- regards. chifeng