Re: [Ganglia-general] gmond 3.1.2 becomes deaf in Solaris SPARC

2009-11-17 Thread River Tarnell

Rick Cobb:
 We had the same problem with gmond 3.0.4 on Solaris 10 / x86.  As far  
 as we were able to debug, it's a bug in Solaris itself, and  
 particularly with the interaction between IGMPv3 support in the kernel  
 and switches that only do IGMPv2. The only workarounds we were able to  
 use were unicast, or restarting gmond quite often on the machines we  
 had gmetad talking to.

if this were the case, would a visible symptom be that 'snoop' running
on the gmond hosts would not see any traffic from other systems?  in our
case, we can see the gmond traffic, the running gmond just ignores it
for some reason.

- river.

--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


[Ganglia-general] Monitoring

2009-11-17 Thread John Martyniak

Hi everyone,

Ok I got my Ganglia monitor up and working, and it was pulling results  
from the localhost.

So I enabled hadoop-metrics.properties and made the appropriate
changes so that it pointed at my Ganglia box.

I made a data_source in the gmetad.conf file, and attached the two  
test nodes to it.

I restarted gmond, gmetad, and ganglia-web for good measure.

But I am not seeing any results, and I am not seeing my data source,  
it says unspecified.  Any ideas?

Thanks in advance.

-John




Re: [Ganglia-general] Monitoring

2009-11-17 Thread John Martyniak
Ok.

I just ran a 'gstat --all'

And only one host comes up, just the localhost.

So there is something missing.

any ideas?

-John

On Nov 17, 2009, at 9:22 AM, John Martyniak wrote:


 Hi everyone,

 Ok I got my Ganglia monitor up and working, and it was pulling  
 results from the localhost.

 So I enable the hadoop-metrics.properties and made the appropriate  
 changes so that it pointed at me ganglia box.

 I made a data_source in the gmetad.conf file, and attached the two  
 test nodes to it.

 I restart gmond, gemtad and the ganglia-web for good measure.

 But I am not seeing any results, and I am not seeing my data source,  
 it says unspecified.  Any ideas?

 Thanks in advance.

 -John





[Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak

So the udp_recv_channel in the gmond.conf file is as follows:

  udp_recv_channel {
    mcast_join = 239.2.11.71
    bind       = 239.2.11.71
    port       = 8649
  }

if I change that to the ip address of the monitoring master machine, I  
get an error that it can't join the cast or something like that.


Any ideas?

-John
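
A note on the join error: mcast_join asks the kernel to join a multicast
group, and only addresses in 224.0.0.0/4 (such as 239.2.11.71) are groups,
so putting a host address like 10.1.1.25 there is expected to fail.  A
minimal Python sketch of that check follows; the helper name is made up for
illustration and is not part of Ganglia.

```python
import ipaddress


def is_valid_mcast_join(address: str) -> bool:
    """Return True if `address` is an IPv4 multicast group (224.0.0.0/4)."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and ip.is_multicast


if __name__ == "__main__":
    # Hypothetical check of the two values discussed in this message.
    for candidate in ("239.2.11.71", "10.1.1.25"):
        if is_valid_mcast_join(candidate):
            print(f"{candidate}: usable as mcast_join")
        else:
            print(f"{candidate}: not a multicast group")
```

For a unicast setup, the usual approach (as far as I know) is to leave
mcast_join and bind out of udp_recv_channel and point udp_send_channel at
the collector with a host entry instead.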



Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread Chris Johnson
On Tue, 17 Nov 2009, John Martyniak wrote:

  It should pretty much work out of the box John.  Does your network not
allow multicasting?

 So the udp_recv_channel in the gmond.conf file is as follows:

  udp_recv_channel {
mcast_join = 239.2.11.71
bind   = 239.2.11.71
port   = 8649
  }

 if I change that to the ip address of the monitoring master machine, I get an 
 error that it can't join the cast or something like that.

 Any ideas?

 -John



Chris Johnson   |Internet: john...@nmr.mgh.harvard.edu
Systems Administrator   |Web:  http://www.nmr.mgh.harvard.edu/~johnson
NMR Center  |Voice:617.726.0949
Mass. General Hospital  |FAX:  617.726.7422
149 (2301) 13th Street  |God must love stupid people.  She keeps making
Charlestown, MA., 02129 USA |them in such horrifyingly large numbers.  Me




Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak
It should, I don't restrict anything, and I have the firewalls turned  
off on those two machines.

It is on a private network that I use NAT through my router to get to
the outside world.  But that shouldn't matter because all of the
machines can get out to the internet.



-John

On Nov 17, 2009, at 10:17 AM, Chris Johnson wrote:

 On Tue, 17 Nov 2009, John Martyniak wrote:

 It should pretty much work out of the box John.  Does your network not
 allow multicasting?





Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread Chris Johnson
On Tue, 17 Nov 2009, John Martyniak wrote:

  Are the monitored nodes on the same side as the monitoring node?  If
not you might have to explicitly turn on multicasting in the router.
Depends on the router.

 It should, I don't restrict anything, and I have the firewalls turned off on 
 those two machines.

 It is on a private network that I use NAT through my router to get to the 
 outside world.  But that shouldn't matter because all of the machine can get 
 out to the internet.



 -John



---
Chris Johnson   |Internet: john...@nmr.mgh.harvard.edu
Systems Administrator   |Web:  http://www.nmr.mgh.harvard.edu/~johnson
NMR Center  |Voice:617.726.0949
Mass. General Hospital  |FAX:  617.726.7422
149 (2301) 13th Street  |Man's a kind of missing link
Charlestown, MA., 02129 USA |fondly thinking he can think.  Piet Hein
---



Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak
Yes they are all in the same subnet, all attached to the same switch.

monitor is: 10.1.1.25

the two devices are 10.1.1.128, 10.1.1.129

I tried the telnet test also:  from each of the machines that are  
monitored, I ran telnet 10.1.1.25 8649, and received the XML file.

-John

On Nov 17, 2009, at 10:41 AM, Chris Johnson wrote:

 On Tue, 17 Nov 2009, John Martyniak wrote:

 Are the monitored nodes on the same side as the monitoring node?  If
 not you might have to explicitly turn on multicasting in the router.
 Depends on the router.




Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak
Beginner question:

How do I run gmetad in -d mode?  I have been using
/etc/rc.d/init.d/gmetad start|stop|restart

-John

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:

 On Tue, 17 Nov 2009, John Martyniak wrote:

 And they all are configured with the same grid name?  Another thing to
 try is to run gmetad in -d mode and see if it's receiving anything.




Re: [Ganglia-general] Short one.

2009-11-17 Thread Bernard Li
Hi Chris:

On Mon, Nov 16, 2009 at 11:37 AM, Chris Johnson
john...@nmr.mgh.harvard.edu wrote:

     So I installed php-gd.  Still just says Pie Chart though.
 Anything I should do?  Any logs to look at?

Have you tried re-starting apache? ;-)

Cheers,

Bernard



Re: [Ganglia-general] Ganglia install instructions wiki link broken

2009-11-17 Thread Bernard Li
Hi Brad:

On Mon, Nov 16, 2009 at 1:06 PM, Brad Nicholes bnicho...@novell.com wrote:

 I think I have all of the wiki page links fixed up.  Especially on the 
 installation and configuration page.  I also fixed up some links to the misc. 
 documents about Ganglia and monitoring.  If anyone discovers any other broken 
 links on the wiki, please let me know.

Thanks a lot for doing this -- appreciate it!

Cheers,

Bernard



Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak

How do I set the grid name?

These are Hadoop machines, so I used the following
configuration parameters in my hadoop-metrics.properties files:

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.serve...@ganglia@:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.serve...@ganglia@:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.serve...@ganglia@:8649

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.period=10
rpc.serve...@ganglia@:8649

I then replaced @GANGLIA@ with 10.1.1.25.

I didn't install any Ganglia software on either of the machines.  I just
ran Hadoop as I normally do, and configured the above.


-John

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:


On Tue, 17 Nov 2009, John Martyniak wrote:

And they all are configured with the same grid name?  Another thing to
try is to run gmetad in -d mode and see if it's receiving anything.





[Ganglia-general] Ganglia cannot find a data source.

2009-11-17 Thread Ryan Robertson
I too have been bangin my head on this for a few weeks.  After much googling
i cannot seem to find the answer, so i hope someone (developer maybe) can
help.


I was successfully using ganglia 2.5 and 3.0.x.  At some point i upgraded to
3.1.x and things went sour.  I've even tried to revert back to a known
working condition to no avail.  So here's my current setup.

GMETAD 3.1.4 running under suse 11.1 ppc.  Using a basic gmetad.conf file
monitoring itself (localhost) for troubleshooting purposes:
---snip from /etc/gmetad.conf ---
data_source "my cluster" localhost gpipnim01
data_source "sap_app" gpiptcpeap02
---snip-
XML on localhost seems fine.  I can telnet to localhost 8649 and get proper
results.  FWIW :
 <GANGLIA_XML VERSION="3.1.4" SOURCE="gmond">

RRD's are updating properly in /var/lib/ganglia/rrds/

gmond (on localhost) in debug mode is sending updates (obviously since RRD's
are being created).  gmond -m shows modules are loaded.

Web frontend:
When I hit the webpage i get 
Ganglia cannot find a data source. Is gmond running?

No webpage or data is shown. Currently the web version is
ganglia-web-3.1.0-1, but i've tried 3.1.4 and older with the same results.
The debug output from gmetad shows:

server_thread() received request /?filter=summary from 127.0.0.1
Found subtree / and filter=summary

It seems that occasionally i can get the webpage to display briefly after
initial startup of gmetad and gmond.

PHP memory is set to 1024 in php.ini
--snip from conf.php--
# Gmetad-webfrontend version. Used to check for updates.
#
include_once "./version.php";

#
# The name of the directory in ./templates which contains the
# templates that you want to use. Templates are like a skin for the
# site that can alter its look and feel.
#
$template_name = "default";

#
# If you installed gmetad in a directory other than the default
# make sure you change it here.
#

# Where gmetad stores the rrd archives.
$gmetad_root = "/var/lib/ganglia";
$rrds = "$gmetad_root/rrds";

# Leave this alone if rrdtool is installed in $gmetad_root,
# otherwise, change it if it is installed elsewhere (like /usr/bin)
define("RRDTOOL", "/usr/bin/rrdtool");

# Location for modular-graph files.
$graphdir='./graph.d';
#
# If you want to grab data from a different ganglia source specify it here.
# Although, it would be strange to alter the IP since the Round-Robin
# databases need to be local to be read.
#
$ganglia_ip = "127.0.0.1";
$ganglia_port = 8652;

# Old-style names.
$gmetad_ip = $ganglia_ip;
$gmetad_port = $ganglia_port;

--snip-
I'd be happy to post additional config info if it helps or is requested.  I was
trying not to post TMI.

Thanks,
Ryan


Re: [Ganglia-general] Ganglia cannot find a data source.

2009-11-17 Thread Brad Nicholes
 On 11/17/2009 at 10:04 AM, in message
b1eec58d0911170904r2f2613ads9244341a82b85...@mail.gmail.com, Ryan Robertson
89esp...@gmail.com wrote:
 I too have been bangin my head on this for a few weeks.  After much googling
 i cannot seem to find the answer, so i hope someone (developer maybe) can
 help.
 
 
 I was successfully using ganglia 2.5 and 3.0.x.  At some point i upgraded to
 3.1.x and things went sour.  I've even tried to revert back to a known
 working condition to no avail.  So here's my current setup.
 
 GMETAD 3.1.4 running under suse 11.1 ppc.  Using a basic gmetad.conf file
 monitoring itself (localhost) for troubleshooting purposes:
 ---snip from /etc/gmetad.conf ---
 data_source my cluster localhost gpipnim01
 data_source sap_app gpiptcpeap02
 ---snip-
 XML on localhost seems fine.  I can telnet to localhost 8469 and get proper
 results.  FWIW :
  GANGLIA_XML VERSION=3.1.4 SOURCE=gmond
 
 RRD's are updating properly in /var/lib/ganglia/rrds/
 
 gmond (on localhost) in debug mode is sending updates (obviously since RRD's
 are being created).  gmond -m shows modules are loaded.
 
 Web frontend:
 When I hit the webpage i get 
 Ganglia cannot find a data source. Is gmond running?
 

When you telnet to 8652 what do you get?  Localhost 8649 is the output from 
gmond on localhost.  Localhost 8652 is the interactive port from gmetad which 
is the port that the web frontend uses to get the metric data.

Brad
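
The telnet test can also be scripted.  Below is a rough Python sketch,
assuming the default ports mentioned above (8649 for gmond, 8652 for
gmetad's interactive port); fetch_xml is a hypothetical helper name, not a
Ganglia API.  It reads until the peer closes the connection and, for the
interactive port, sends a request line such as /?filter=summary first.

```python
import socket
from typing import Optional


def fetch_xml(host: str, port: int, request: Optional[str] = None,
              timeout: float = 5.0) -> str:
    """Read everything a Ganglia XML port sends until the peer closes.

    With request=None the function only reads (dump-style ports).
    With a request string it sends that line first (interactive port).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if request is not None:
            sock.sendall(request.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling fetch_xml("localhost", 8649) would mirror the gmond test, and
fetch_xml("localhost", 8652, "/?filter=summary") the request the web
frontend makes.  An empty or refused response on 8652 would point at gmetad
rather than gmond.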




Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread John Martyniak
When I run gmetad --debug=5, I get the following:
[r...@monitor ~]# gmetad --debug=5
Going to run as user nobody
Sources are ...
Source: [Weive cluster, step 15] has 2 sources
10.1.1.129
10.1.1.130
xml listening on port 8651
interactive xml listening on port 8652
cleanup thread has been started
Data thread 3023907728 is monitoring [Weive cluster] data source
10.1.1.129
10.1.1.130
data_thread() got no answer from any [Weive cluster] datasource


If I log into one of the boxes and try to telnet to those ports
(10.1.1.25:8651), I get connection refused.  I checked to make sure
that the firewalls are turned off.

-john
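
One way to narrow this down: gmetad polls each host listed in a data_source
over TCP, on port 8649 unless the entry gives another port, so "got no
answer" suggests nothing is accepting connections on those hosts.  (The
debug output lists 10.1.1.129 and 10.1.1.130, while the nodes named earlier
in the thread were 10.1.1.128 and 10.1.1.129.)  The sketch below, with a
hypothetical helper name, simply tries a TCP connect to each source; it is a
quick check under those assumptions, not a Ganglia tool.

```python
import socket
from typing import Dict, Iterable, Tuple


def probe_sources(sources: Iterable[Tuple[str, int]],
                  timeout: float = 2.0) -> Dict[str, str]:
    """Try a TCP connect to each (host, port) and report the outcome."""
    results = {}
    for host, port in sources:
        key = f"{host}:{port}"
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[key] = "open"
        except OSError as exc:  # refused, timed out, unreachable, ...
            results[key] = f"unreachable ({exc.__class__.__name__})"
    return results
```

For the configuration shown here, the call would be something like
probe_sources([("10.1.1.129", 8649), ("10.1.1.130", 8649)]).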

On Nov 17, 2009, at 11:07 AM, Chris Johnson wrote:

 On Tue, 17 Nov 2009, John Martyniak wrote:

 And they all are configured with the same grid name?  Another thing to
 try is to run gmetad in -d mode and see if it's receiving anything.




Re: [Ganglia-general] Ganglia 10th year anniversary get-together

2009-11-17 Thread Bernard Li
Dear all:

Just a quick update -- I've talked to Matt and a few others and it
looks like late January would actually work best for everybody.  So
right now let's set the date tentatively to the weekend of Jan 18,
2010.

Since I'm still gauging interest, for those of you who haven't
responded yet, I'd really appreciate even a quick "sure, i'm
interested" :-)

Thanks a lot!

Bernard

On Wed, Nov 11, 2009 at 2:13 PM, Bernard Li bern...@vanhpc.org wrote:
 Dear friends of the Ganglia project:

 Can you believe that roughly 10 years ago, Matt Massie and others
 started the Ganglia project at UC Berkeley?  Since then we have made
 40+ releases, and our project files hosted at SourceForge.net have
 been downloaded over 299,208 times.

 Our user base has grown substantially over the years and it is still
 the de facto standard for monitoring HPC clusters, grids, and even
 cloud servers.

 To commemorate our achievements over the years, I would like to
 organize a dinner party in San Francisco, USA.  Currently the dates
 being floated around are early December to late January, or later in
 2010.

 If you are interested in attending (or sponsoring) this event, please
 either reply back to this thread and/or to me privately.

 Thanks again for all your support over the years, and let's work
 together towards a better Ganglia project for the next 10 years to
 come!

 Best Regards,

 Bernard
 - on behalf of the Ganglia Team




Re: [Ganglia-general] Short one.

2009-11-17 Thread Chris Johnson

On Tue, 17 Nov 2009, Bernard Li wrote:

 DOH!  Thanks.


Hi Chris:

On Mon, Nov 16, 2009 at 11:37 AM, Chris Johnson
john...@nmr.mgh.harvard.edu wrote:


    So I installed php-gd.  Still just says Pie Chart though.
Anything I should do?  Any logs to look at?


Have you tried re-starting apache? ;-)

Cheers,

Bernard





---
Chris Johnson   |Internet: john...@nmr.mgh.harvard.edu
Systems Administrator   |Web:  http://www.nmr.mgh.harvard.edu/~johnson
NMR Center  |Voice:617.726.0949
Mass. General Hospital  |FAX:  617.726.7422
149 (2301) 13th Street  |Prediction is difficult, especially when it comes
Charlestown, MA., 02129 USA |to the future.  Yogi Berra.
-


Re: [Ganglia-general] Ganglia cannot find a data source.

2009-11-17 Thread 89esprit
Ahh yes, I knew there was one other telnet snippet question. I am able to
telnet to localhost 8652 and feed it

/?filter=summary

I get output... the output scrolled off the screen, but you get the idea
that it's returning...


--snip--
</METRICS>
<METRICS NAME="swap_total" SUM="2019320" NUM="1" TYPE="double" UNITS="KB" SLOPE="zero" SOURCE="gmond">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="memory"/>
<EXTRA_ELEMENT NAME="DESC" VAL="Total amount of swap space displayed in KBs"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="Swap Space Total"/>
</EXTRA_DATA>
</METRICS>
<METRICS NAME="part_max_used" SUM="40.2" NUM="1" TYPE="double" UNITS="%" SLOPE="both" SOURCE="gmond">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="disk"/>
<EXTRA_ELEMENT NAME="DESC" VAL="Maximum percent used for all partitions"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="Maximum Disk Space Used"/>
</EXTRA_DATA>
</METRICS>
</CLUSTER>
</GRID>
</GANGLIA_XML>
--snip--
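
For anyone who wants to repeat the telnet test without the output scrolling
off the screen, here is a minimal Python sketch of the same query. The
function name query_gmetad is made up for illustration, and the sketch
assumes gmetad closes the connection once it has sent its reply, so reading
until end-of-stream collects the whole XML document.

```python
import socket


def query_gmetad(host="localhost", port=8652, query="/?filter=summary", timeout=5.0):
    """Send one query line to gmetad's interactive port and return the reply text.

    query_gmetad is a hypothetical helper name; host, port and query mirror
    the telnet session described in the message above.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The interactive port expects the query followed by a newline.
        sock.sendall(query.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                # Assumption: the server closes the socket when the reply is complete.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling print(query_gmetad()) on the gmetad host, or redirecting the result
to a file, should show the same summary XML that telnet returned.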


Re: [Ganglia-general] Ganglia cannot find a data source.

2009-11-17 Thread Brad Nicholes
Sounds to me like it could be a file permissions problem then.  Is your Apache
server able to access the rrd files and/or port 8652?









Re: [Ganglia-general] udp_recv_channel

2009-11-17 Thread Ofer Inbar
I'm not sure if this is related to your issue, but it seems possibly
related...  Last summer, with Ganglia 3.1.0, I found that bind
either does not work in a multicast _recv_channel, only in
_send_channel ... or the other way 'round.  I forget which it was, but
it certainly did not work in one of those, on any of my gmond.conf's.
When using multicast, I could only use bind in one sort of channel
(either recv but not send, or send but not recv), and if I used it on
the other kind of channel, it was silently not used.  Unfortunately
I'm at a different employer now so I can't check those configs.
  -- Cos



[Ganglia-general] Conditional statements in Ganglia Web templates

2009-11-17 Thread Vladimir Vuksan
I was wondering if it is possible and if so how to add conditional 
statements in Ganglia Web templates. What I am after is that I have some 
custom consolidated reports like the ones from here

http://vuksan.com/linux/ganglia/#Apache_Traffic_Stats

Currently I have modified the template to include the report in the host view,
which is not ideal since some nodes (e.g. DB servers) will not have Apache
metrics. Is it therefore possible to add a conditional statement that would
check for the existence of a file and, based on that, source a particular image?

Thanks,
Vladimir



[Ganglia-general] Multicast IP Address

2009-11-17 Thread John Martyniak
So does the multicast IP address need to be a real IP address? It is
currently set to 239.2.11.71, which isn't a real IP address on my
network; does it need to be?

I tried changing the hadoop-metrics.properties to that value and it
had no effect.  gmetad --debug=5 still could not connect.
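
For what it's worth, 239.2.11.71 falls in the 239.0.0.0/8 administratively
scoped multicast range, so it is a group address that hosts join rather
than an address assigned to any one machine. A small Python sketch to check
which kind of address a setting holds (describe_address is a hypothetical
helper name, and 192.168.1.10 in the comment is only a sample host address):

```python
import ipaddress


def describe_address(addr):
    """Classify an IP address string as a multicast group or a unicast host address."""
    ip = ipaddress.ip_address(addr)
    if ip.is_multicast:
        # Multicast groups are joined by listeners; no interface "owns" them.
        return "multicast group"
    return "unicast host address"


# Example: describe_address("239.2.11.71") vs. describe_address("192.168.1.10")
```

If the first call reports a multicast group, the address does not need to
exist on the network; what matters is that sender and receivers agree on
the same group and port, and that multicast traffic can pass between them.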

Any ideas would be helpful.

thank you,

-john



Re: [Ganglia-general] gmond 3.1.2 becomes deaf in Solaris SPARC

2009-11-17 Thread Rick Cobb
Yes. We would see the traffic on other machines, but we would not see  
multicast traffic coming into the machine we were using to aggregate  
metrics.  Restarting gmond would get the traffic flowing back in.
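
One way to separate "packets reach the wire" from "packets reach a socket"
is to join the group from a separate process and see whether datagrams
arrive there. The following is only a sketch under assumptions: it uses
239.2.11.71 and port 8649 (the values shipped in the stock gmond.conf,
which may not match your setup), the function names are made up, and it
relies on SO_REUSEADDR letting a second listener share the port with a
running gmond, which I have not verified on Solaris.

```python
import socket
import struct


def make_membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used with IP_ADD_MEMBERSHIP.

    0.0.0.0 (INADDR_ANY) leaves the choice of interface to the kernel.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def listen(group="239.2.11.71", port=8649, count=5):
    """Join the multicast group and print the size and sender of a few datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow sharing the port with another listener such as gmond (platform dependent).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_membership_request(group))
    try:
        for _ in range(count):
            data, sender = sock.recvfrom(65535)
            print(f"{len(data)} bytes from {sender[0]}")
    finally:
        sock.close()
```

If snoop shows the traffic but listen() blocks without printing anything,
that would point at group membership or delivery to the socket rather than
at gmond's own packet handling.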

-- ReC
On Nov 17, 2009, at 4:17 AM, River Tarnell wrote:


 Rick Cobb:
 We had the same problem with gmond 3.0.4 on Solaris 10 / x86.  As far
 as we were able to debug, it's a bug in Solaris itself, and
 particularly with the interaction between IGMPv3 support in the kernel
 and switches that only do IGMPv2. The only workarounds we were able to
 use were unicast, or restarting gmond quite often on the machines we
 had gmetad talking to.

 if this were the case, would a visible symptom be that 'snoop' running
 on the gmond hosts would not see any traffic from other systems?  in our
 case, we can see the gmond traffic, the running gmond just ignores it
 for some reason.

   - river.




Re: [Ganglia-general] Ganglia cannot find a data source.

2009-11-17 Thread 89esprit

rrd dir and subdirs are owned by nobody.
ls -ld /var/lib/ganglia/rrds

drwxr-xr-x 7 nobody nobody 4096 May 28 2008 /var/lib/ganglia/rrds

ls -l /var/lib/ganglia/rrds
drwxr-xr-x 7 nobody root 4096 Sep 28 15:36 595
drwxr-xr-x 2 nobody root 4096 Sep 23 10:04 __SummaryInfo__
drwxr-xr-x 33 nobody root 4096 Sep 23 09:44 gpi
drwxr-xr-x 12 nobody root 4096 Aug 20 2008 sap_app
drwxr-xr-x 5 nobody root 4096 Nov 3 10:13 unspecified


Apache is running under user wwwrun:

wwwrun 9649 23799 0 11:48 ? 00:00:00 /usr/sbin/httpd2 -f  
/etc/apache2/httpd.conf -k start


I don't see any errors in the Apache logs.

Chowning /var/lib/ganglia/rrds to wwwrun didn't yield any changes.
I would think that even if Apache didn't have access to the rrd files it
would still show other HTML from the web frontend. How does Apache access
port 8652? Is there another way to test that?
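
As far as I know, the web frontend simply opens a TCP connection to gmetad
on that port, so one way to test both things at once is to run a small
script as the web server user. This is a rough sketch only: the function
names are hypothetical, and the defaults (/var/lib/ganglia/rrds, localhost,
8652) are taken from this thread rather than from your conf.php.

```python
import os
import socket


def check_rrd_access(rrd_root="/var/lib/ganglia/rrds"):
    """Return the paths under rrd_root that the current user cannot read."""
    unreadable = []
    # onerror collects directories that could not be listed at all.
    for dirpath, _dirnames, filenames in os.walk(
            rrd_root, onerror=lambda err: unreadable.append(err.filename)):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                unreadable.append(path)
    return unreadable


def can_connect(host="localhost", port=8652, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Saved as, say, check_ganglia.py with two print calls added at the bottom,
it could be run with something like "sudo -u wwwrun python3 check_ganglia.py"
so that the checks happen with the same permissions Apache has.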


-Ryan



On Nov 17, 2009 2:17pm, Brad Nicholes bnicho...@novell.com wrote:
Sounds to me like it could be a file permissions problems then. Is your  
apache server able to access the rrd files and/or port 8652?


[Ganglia-general] Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread John Martyniak
Has anybody else had any trouble running nutch 0.19.2 with Ganglia  
3.1.3?

I was surfing through Jira and it seems that there were some issues,
but they have been resolved.

Any thoughts would be helpful.

Thank you,

-John



John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: j...@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com




Re: [Ganglia-general] Monitoring

2009-11-17 Thread chifeng
try this command
#gstat --all -i a_hostname_in_cluster

Chifeng

On Tue, Nov 17, 2009 at 11:02 PM, John Martyniak 
j...@beforedawnsolutions.com wrote:

 Ok.

 I just ran a 'gstat --all'

 And only one host comes up, just the localhost.

 So there is something missing.

 any ideas?

 -John

 On Nov 17, 2009, at 9:22 AM, John Martyniak wrote:

 
  Hi everyone,
 
  Ok I got my Ganglia monitor up and working, and it was pulling
  results from the localhost.
 
  So I enable the hadoop-metrics.properties and made the appropriate
  changes so that it pointed at me ganglia box.
 
  I made a data_source in the gmetad.conf file, and attached the two
  test nodes to it.
 
  I restart gmond, gemtad and the ganglia-web for good measure.
 
  But I am not seeing any results, and I am not seeing my data source,
  it says unspecified.  Any ideas?
 
  Thanks in advance.
 
  -John
 







-- 
regards.
chifeng