Re: [Ganglia-general] newbie install of 3.1.7

2010-06-26 Thread Carlo Marcelo Arenas Belon
On Tue, Jun 22, 2010 at 11:16:54AM -0700, Deb Heller-Evans wrote:
 
 In our set up, I am configuring gmond for unicast communication, and have set 
 up gmond.conf on the nodes to have the following:

  52 udp_recv_channel {
 --  53   host = 198.129.76.131
  54   port = 8649
  55 }
  56 
 
 BUT, when starting gmond on the node, gmond complains:
 
 [108#] service gmond start
 Starting GANGLIA gmond: /etc/ganglia/gmond.conf:53: no such option 'host'
 Parse error for '/etc/ganglia/gmond.conf'
 [FAILED]
 
 I'm a little puzzled by this.  Could someone point me in the right direction?

man gmond.conf shows there is no host option for udp_recv_channel; the
option you are probably looking for is bind, which tells gmond to bind
the unicast listener to a specific IP.
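
for illustration, a minimal sketch of the receive channel using bind instead
of host. this assumes 198.129.76.131 is an address configured on the node
that runs this gmond; if it is actually the collector's address, it belongs
in host of udp_send_channel on the sending nodes instead:

```
udp_recv_channel {
  bind = 198.129.76.131   # assumption: a local address of this node
  port = 8649
}
```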

Carlo

--
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


Re: [Ganglia-general] Ganglia + Windows - Compilation Problems?

2010-06-26 Thread Carlo Marcelo Arenas Belon
On Wed, Jun 23, 2010 at 01:40:52PM -0500, Douglas Wagner wrote:
 
 So I build libconfuse on Cygwin on my local XP development box and it gets
 stuck into /usr/local/* (lib, include, etc.).

is it libconfuse 2.7 compiled as a static library with no nls support, as
suggested in README.WIN? is this cygwin 1.5 on 32bit windows, or are
you using 1.7?

 Come back around (according to the README.WIN) and tell ganglia to compile
 --with-libconfuse=/usr/local and it blows up telling me it can't find
 libconfuse.

config.log would explain why, but I hope you are not trying to build it
for 64bit windows.

 Linking everything into /usr/lib doesn't help either.  I've
 seen docs on this but assumed it was supposed to be fixed in 3.1.2.

not sure what you are referring to here, but are you trying to build 3.1.7?
I noticed README.WIN does not mention the need to override sysconfdir
(which is irrelevant for cygwin anyway) and was not completely updated
when the libpcre dependency was added for that release (the package also
changed name recently in cygwin). the instructions used to work at least
with 3.1.4 from what I remember, and therefore probably also with 3.1.2.

the following seemed to work for me on an updated windows vista laptop I
had access to, with the latest cygwin (mostly following the instructions
from README.WIN, but against its recommendation of sticking with 1.5,
which therefore requires some patching):

  $ tar -xvzf confuse-2.7.tar.gz
  $ cd confuse-2.7
  $ ./configure --disable-nls
  $ make
  $ make install
  $ cd ..
  $ tar -xvzf ganglia-3.1.7.tar.gz
  $ cd ganglia-3.1.7
  $ find . -type f -name "*.h" -a ! -name config.h -exec fgrep -l \
      "<rpc/rpc.h>" {} \; | xargs -n1 perl -pi -e \
      's;#include <rpc/rpc.h>;#include <cygwin/in.h>\n#include <rpc/rpc.h>;g'
  $ ./configure GANGLIA_ACK_SYSCONFDIR=1 --with-libconfuse=/usr/local \
      --enable-static-build
  $ make
  $ cd ..
  $ mkdir dist
  $ cp -a ganglia-3.1.7/gmond/gmond.exe dist/
  $ cp -a ganglia-3.1.7/gmetric/gmetric.exe dist/
  $ cp -a ganglia-3.1.7/gstat/gstat.exe dist/
  $ cd confuse-2.7
  $ make uninstall
  $ cd ..
  $ rm -rf confuse* ganglia*

the binaries in dist will need to be installed on the other nodes, probably
together with the cygwin DLLs they were built against if cygwin won't be
installed there independently (cygwin1.dll, cygapr-1-0.dll, cygexpat-1.dll,
cygpcre-0.dll, and libpython2.6.dll).

the following dependencies were installed as prerequisites on the system 
that was used for building this package (listed with `cygcheck.exe -c -d`) :

  diffutils        2.9-1
  expat            2.0.1-1
  libexpat1        2.0.1-1
  libexpat1-devel  2.0.1-1
  gcc              3.4.4-999
  gcc-core         3.4.4-999
  gcc-g++          3.4.4-999
  gcc-mingw-core   20050522-1
  gcc-mingw-g++    20050522-1
  libgcc1          4.3.4-3
  libapr1          1.4.2-1
  libapr1-devel    1.4.2-1
  make             3.81-2
  libpcre-devel    8.02-1
  libpcre0         8.02-1
  python           2.6.5-2
  sharutils        4.8-1
  sunrpc           4.0-3

Carlo



Re: [Ganglia-general] ganglia monitors does not work for some of the clusters

2010-06-26 Thread Carlo Marcelo Arenas Belon
On Thu, Jun 24, 2010 at 07:37:25AM +0200, Raimund Eimann wrote:
 
 I have exactly the same issue with version 3.1.7. When I restart gmond on
 the affected nodes, their graphs work for some time (1-2 days typically). I
 use CentOS 5.{4,5} on my nodes. Usually the problem does not affect a
 cluster as a whole, but only a large number of nodes in the cluster (for
 instance, for 14 out of 17 nodes nothing gets displayed).

are you using multicast or unicast? does setting send_metadata_interval to
60 or some other non zero value help?
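
in case it helps, a sketch of where that setting lives in gmond.conf (it
goes in the globals section; 60 is just the value suggested above, and the
rest of the section stays as you already have it):

```
globals {
  # ... existing settings unchanged ...
  send_metadata_interval = 60   # seconds; 0 means metadata is only sent at startup
}
```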

Carlo



Re: [Ganglia-general] Gmond udp_send_channel using the wrong network (seems hostname related)

2010-06-26 Thread Carlo Marcelo Arenas Belon
On Thu, Jun 24, 2010 at 10:21:53AM +, Ronny wrote:
 
 I am facing the problem, that my gmond udp_send_channels sends via the wrong 
 network interface on a multi homed linux machine.

there is some information on multihomed setups in the README which could
help.

 The machines have a front NIC and a backend NIC. Both IPs from the NICs get 
 resolved by the name service, but the primary IP's DNS name is the system's 
 hostname (with an IP address out of 62.48.x.x)
 
 In my client's gmond.conf I have set:
 
 udp_send_channel {
   bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname.  Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
   host = 10.0.11.16
   port = 8649
   ttl = 1
 
 }
 
 whereby 10.0.11.16 is the backend network.
 
 But this gmond seems to ignore 10.0.11.16 and sends via the
 primary IP address 62.48.x.x to the udp_recv_channel located on
 another host. A firewall between the send channel and receive channel
 machines using 62.48.x.x is blocking that traffic. I can't currently
 open the firewall.

that is what bind_hostname is meant to do AFAIK; maybe you would like to use
bind = 10.0.11.16 instead (host should point to your collector if using
unicast, so unlike in this example host and bind will most of the time be
different IPs in 10.0.11.x)
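
a sketch of what I mean, assuming the collector listens on 10.0.11.1 (a made
up address, replace it with yours) and 10.0.11.16 is this node's backend IP.
as far as I know bind and bind_hostname are mutually exclusive, so
bind_hostname is dropped here:

```
udp_send_channel {
  bind = 10.0.11.16   # source address: this node's backend IP
  host = 10.0.11.1    # hypothetical collector address
  port = 8649
  ttl = 1
}
```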

Carlo



Re: [Ganglia-general] Gmond udp_send_channel using the wrong network (seems hostname related)

2010-06-26 Thread Vladimir Vuksan
Sounds to me like your routing is not properly set, although apparently
that can depend on the OS. More than 4 years ago I reported a bug
regarding gmond not honoring the mcast_if setting:

http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=94

We resolved it by adding a route. It would seem that in unicast mode
this should require no changes. Can you send us what your routing table
looks like ?

On Thu, 24. 06. 2010, at 10:21 +, Ronny wrote:

 I am facing the problem, that my gmond udp_send_channels sends via the wrong 
 network interface on a multi homed linux machine.
 
 The machines have a front NIC and a backend NIC. Both IPs from the NICs get 
 resolved by the name service, but the primary IP's DNS name is the system's 
 hostname (with an IP address out of 62.48.x.x)
 
 In my client's gmond.conf I have set:
 
 udp_send_channel {
   bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname.  Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
   host = 10.0.11.16
   port = 8649
   ttl = 1
 
 }
 
 whereby 10.0.11.16 is the backend network.
 
 But this gmond seems to ignore 10.0.11.16 and sends via the primary IP 
 address 62.48.x.x to the udp_recv_channel located on another host. A 
 firewall between the send channel and receive channel machines using 
 62.48.x.x is blocking that traffic. I can't currently open the firewall.
 
 What should I do to let gmond communicate exclusively via the 10.0.11.x 
 network?
 
 I am running ganglia 3.1.7.




Re: [Ganglia-general] Gmond udp_send_channel using the wrong network (seems hostname related)

2010-06-26 Thread Carlo Marcelo Arenas Belon
On Sat, Jun 26, 2010 at 03:29:17PM -0400, Vladimir Vuksan wrote:

 More than 4 years ago I reported a bug regarding gmond not honoring
 the mcast_if setting
 
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=94

mcast_if should be working fine in 3.0 since 3.0.5, could you confirm
that? you should now be able to force multicast traffic to go through
a specific interface by adding mcast_if to the corresponding
udp_send_channel section.
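
something along these lines, where the multicast group is the usual ganglia
default and eth1 is only a placeholder for whichever interface you want the
traffic to leave through:

```
udp_send_channel {
  mcast_join = 239.2.11.71   # default ganglia multicast group
  mcast_if = eth1            # placeholder interface name
  port = 8649
  ttl = 1
}
```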
 
it was broken again in 3.1 though, and while it was fixed for 3.1.2 as
shown by BUG140, you would need 3.1.7 for a full fix and for the set of
directives meant to control every part of this, including the IP used
as the source (which is what bind and bind_hostname are for),
independently of the interface or IPv4 routing.

 We resolved it by adding a route. It would seem that in unicast mode
 this should require no changes. Can you send us what your routing table
 looks like ?

unicast could use a different IP as the source if instructed to do so
by explicitly binding to it or to the resolvable hostname, as the
originally reported configuration seemed to do.

agreed though, documentation is a little thin around all of it (there is
also some complementary explanation in the README), especially with 3.1.7,
which now has several overriding settings that affect this (routing,
mcast_if, and bind/bind_hostname)

Carlo
