Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-12-08 Thread Chris Jones

Wow... I didn't know there was an O'Reilly book on Ganglia!  I will look 
into that for sure.  Thanks for mentioning it.

To answer your questions:

1. deaf and mute are both set to 'no'.  That must be the default setting, 
since I've never touched those settings in all my years of working with 
ganglia.

2. Based on my answer to #1, I suppose that yes, all my gmond hosts are 
aggregators.  The total number of gmonds is 150 right now, and I've got 
about another 20 that I'm trying to bring online (most of these systems 
have recently been rebuilt as RHEL 6.5 systems, and installing gmond on 
them is where my problem started).

3. You didn't say *where* to issue the telnet from.  I know that I can 
be logged into one of my gmonds (one that is working), run 'telnet 
localhost port', and see a 'HOST NAME' line for itself and for the other 
gmonds that share the same port.  I see the same thing for the other 
ports that my gmonds are grouped on.  On the gmonds that I'm having 
problems with, I only see one 'HOST NAME' line per gmond: they're not 
seeing their gmond buddies.  Picking one of those, the size of the XML 
content is 13 KB.  When I tried this on a different gmond that *does* 
see its fellow gmonds on the same port (41 hosts in total), the size of 
the XML content was 252 KB.

4. We don't make use of ACL's in anything ganglia related.  So no... 
none set.

5. That definitely was some kind of typo with that extra space since 
I can't even find that.  :)
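
The check in item 3 can also be scripted instead of done by hand over telnet. What follows is only a sketch under stated assumptions: the function names (fetch_gmond_xml, host_names, summarize) are made up for illustration, and it assumes gmond writes its XML dump on the tcp_accept_channel port and then closes the connection, which is what the telnet session shows.

```python
import re
import socket


def fetch_gmond_xml(host, port, timeout=5.0):
    """Read the whole XML dump that gmond writes on its tcp_accept_channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # an empty read means the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def host_names(xml_text):
    """Return the NAME attribute of every <HOST ...> element, in order."""
    return re.findall(r'<HOST\s+NAME="([^"]*)"', xml_text)


def summarize(xml_text):
    """Return (size in bytes, number of hosts) for one XML dump."""
    return len(xml_text.encode("utf-8")), len(host_names(xml_text))


# Example use (hypothetical host name):
#   size, count = summarize(fetch_gmond_xml("gmond_hostA", 8204))
```

Comparing the (size, count) pair from a working gmond against one from a problem gmond gives the same picture as the 252 KB versus 13 KB comparison above.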

On 12/4/14, 3:44 PM, Maciej Lasyk wrote:
 Are you afraid that we could see performance data of the Curiosity? :D

 First of all I would really suggest you read the Monitoring with
 Ganglia book (2012). It answers many questions and solves major problems.

 About your issue:

 1. How do you set deaf and mute in gmond nodes?
 2. How many listening gmonds (aggregators, hosts with deaf=no) do you
 have? (If using multicast, then probably by default all gmond hosts are
 aggregators.)
 3. What is the size of the downloaded XML (telnet to a gmond aggregator on
 the port set in tcp_accept_channel)? Does it contain all the hosts you
 monitor? (Write the XML content to a file and grep for 'HOST NAME' or
 something like that.)
 4. Do you have any ACLs set in gmond configuration?
 5.

 Btw - in the config section you shared you have a white-space in port
 number 8 204:

   /* You can specify as many udp_recv_channels as you like as well. */
  udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8 204
  bind = 239.2.11.71
  }

 Cheers,
 Maciej Lasyk

 GPG key ID: 4FED49C5
 GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc


Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-12-04 Thread Chris Jones

I'm still racking my brain with this problem I'm having.  I've even run 
'tcpdump -i any port 8204' on my gmetad server and watched the traffic.  
When I've got two gmond clients sending out multicast packets on port 
8204, I can see handshaking between my server and *one* client.  The 
other client's tcpdump just shows packets being sent out, with no 
replies.  On the server GUI, I see only the one client showing up.

I then stop gmond on the client that's 'working' and immediately on my 
other client, the tcpdump output changes to handshaking between the 
client and server - and the server's tcpdump also then changes to show 
the new client (the old one stops).  Then eventually on the server gui I 
stop seeing the old client updating (the icon for the host turns that 
block of red... 'host down') and my new client shows up like nothing 
ever happened.
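
One way to cross-check the tcpdump picture without gmond in the loop is to join the multicast group directly and tally which source addresses are actually heard. The following is a rough sketch rather than a tested diagnostic: the function names are made up, the group and port passed in would be the ones from the gmond.conf in this thread (239.2.11.71 and 8204), and it assumes it runs on a host where nothing else is holding the port exclusively (stopping gmond there first is the safe option).

```python
import socket
import struct
import time
from collections import Counter


def open_multicast_listener(group, port, interface_ip="0.0.0.0"):
    """Open a UDP socket on `port` and join multicast `group` on one interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # ip_mreq: multicast group address followed by the local interface address
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def tally_sources(sock, seconds=30.0):
    """Count datagrams per source IP received on `sock` within `seconds`."""
    counts = Counter()
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            _data, (src_ip, _src_port) = sock.recvfrom(65535)
        except socket.timeout:
            break
        counts[src_ip] += 1
    return counts


# Example use:
#   sock = open_multicast_listener("239.2.11.71", 8204)
#   print(tally_sources(sock, seconds=60))
```

If both clients are really multicasting onto the same network, both source addresses should appear in the tally; if only one ever does, the problem sits below ganglia.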

This makes no sense.  I don't believe I've oversubscribed the number of 
gmonds on my server (around 150 maybe?).  The gmetad server is running 
RHEL 6.2, and my two gmond clients are running RHEL 6.5.  The strange 
thing is, it appears that only my RHEL 6.5 clients are having this 
problem.  Every other gmond client is either RHEL 5.x or SuSE 11.1 or 
11.2.

I've googled this problem til I'm blue in the face, gone back through 
the last few years of the ganglia-general mailing list archives as best 
I could with keyword searches, consulted many of my system admin. 
co-workers, and even tried using unicast instead of multicast (that 
didn't make a difference either).  Nothing seems to matter.

There's got to be somebody out there reading this mailing list who's got 
RHEL6.5 gmond clients.  Anybody?  Please?  :)

Thanks,
-chris

On 9/4/14, 12:46 PM, Karol Korytkowski wrote:
 I'm curious as to what the correct answer would be, but..

 We have a similar problem (forgive me if not, I just scanned through your
 email), and some kind of solution was to use a different data_source
 (@gmetad) for each such issue and give them the same cluster { name =
 "" }  (@gmond).

 I think this has something to do with multicast between switches, but
 so far no one has looked into this..

 KK



Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-12-04 Thread Chris Jones

Being that I work at NASA, I'd rather not put entire files out there 
with names of hosts and ports and the like.  :)  My initial post included 
part of the gmond configs.

My data_source line in my gmetad.conf file (for this one port) is simply 
something like this:

data_source my_name gmond_hostA:8204 gmond_hostB:8204

If there's anything else specifically, just ask and I'll give it (with 
names changed to protect the innocent).

-chris

On 12/4/14, 3:15 PM, Maciej Lasyk wrote:
 Please share your configs via pastebin

 Cheers,


Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-12-04 Thread Maciej Lasyk
Are you afraid that we could see performance data of the Curiosity? :D

First of all I would really suggest you read the Monitoring with Ganglia
book (2012). It answers many questions and solves major problems.

About your issue:

1. How do you set deaf and mute in gmond nodes?
2. How many listening gmonds (aggregators, hosts with deaf=no) do you
have? (If using multicast, then probably by default all gmond hosts are
aggregators.)
3. What is the size of the downloaded XML (telnet to a gmond aggregator on
the port set in tcp_accept_channel)? Does it contain all the hosts you
monitor? (Write the XML content to a file and grep for 'HOST NAME' or
something like that.)
4. Do you have any ACLs set in gmond configuration?
5.

Btw - in the config section you shared you have a white-space in port
number 8 204:

 /* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8 204
bind = 239.2.11.71
}
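
A stray space like this is easy to scan for mechanically. The snippet below is a minimal sketch, assuming the config has been read into a string: bad_port_lines is a made-up helper name, and the check is deliberately naive (a trailing comment on a port line would also be reported).

```python
import re

# Matches lines of the form "port = <value>" and captures the raw value.
PORT_LINE = re.compile(r"^\s*port\s*=\s*(.*?)\s*$")


def bad_port_lines(conf_text):
    """Return (line_number, raw_value) for port values that are not a plain integer in 1..65535."""
    problems = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = PORT_LINE.match(line)
        if not match:
            continue
        value = match.group(1)
        if not (value.isdigit() and 1 <= int(value) <= 65535):
            problems.append((lineno, value))
    return problems


# Example use:
#   with open("/etc/ganglia/gmond.conf") as f:   # path is an assumption
#       print(bad_port_lines(f.read()))
```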

Cheers,
Maciej Lasyk

GPG key ID: 4FED49C5
GPG public key: http://maciek.lasyk.info/gpg_maciej_lasyk.asc

On Thu, Dec 4, 2014 at 9:20 PM, Chris Jones christopher.r.jo...@nasa.gov
wrote:


 Being that I work at NASA, I'd rather not put entire files out there with
 names of hosts and ports and the like.  :)  My initial post had in it part
 of the gmond config's.

 My datasource line in my gmetad.conf file (for this one port) is simply
 something like this:

 data_source my_name gmond_hostA:8204 gmond_hostB:8204

 If there's anything else specifically, just ask and I'll give it (with
 names changed to protect the innocent).

 -chris



Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-12-04 Thread Seth T Graham

 On Dec 4, 2014, at 2:06 PM, Chris Jones christopher.r.jo...@nasa.gov wrote:
 
 This makes no sense.  I don't believe I've oversubscribed the number of 
 gmond's on my server (around 150 maybe?).  The gmetad server is running 
 RHEL 6.2, and my two gmond clients are running RHEL 6.5.  The strange 
 thing is, it appears that only my RHEL 6.5 clients are having this 
 problem. every other gmond client is either RHEL 5.x or SuSE 11.1 or 
 11.2.
 
 I've googled this problem til I'm blue in the face, gone back through 
 the last few years of the ganglia-general mailing list archives as best 
 I could with keyword searches, consulted many of my system admin. 
 co-workers, and even tried using unicast instead of multicast (that 
 didn't make a difference either).  Nothing seems to matter.
 
 There's got to be somebody out there reading this mailing list who's got 
 RHEL6.5 gmond clients.  Anybody?  Please?  :)

We have a random array of systems falling somewhere between RHEL 5.1 and RHEL 6.6 
and we don’t see any issues like you’re describing. We're running gmond 3.6.0, which 
is a little old, but only one dot release behind the latest and greatest.

I am using unicast, but you said you tried that and saw the same issue, so I 
don’t really have any suggestions on what to try next from ganglia’s 
perspective.

150 clients is not oversubscribing ganglia.. we have clusters with 300+ nodes 
in them.

The fact that you can only see one host communicating with the gmetad server at 
a time is pretty suspicious, it points to some kind of network health issue. Do 
netmasks check out? Switch supports multicast properly? Jumbo frames enabled on 
some ports but not others? Is the switch saturated? 
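
The netmask question is quick to check once the address and netmask of each host are written down. A small sketch using Python's standard ipaddress module; same_subnet is a made-up helper and the addresses in the example are placeholders, not anything from this thread.

```python
import ipaddress


def same_subnet(*interfaces):
    """True if every 'address/prefix' (or 'address/netmask') string names the same network."""
    networks = {ipaddress.ip_interface(i).network for i in interfaces}
    return len(networks) == 1


# Example use with placeholder addresses:
#   same_subnet("10.1.2.10/24", "10.1.2.200/24")   # hosts agree on the network
#   same_subnet("10.1.2.10/24", "10.1.2.200/25")   # one host has a narrower mask
```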




--
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


[Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-09-04 Thread Chris Jones

Here's my scenario.  I've got some systems that were happily reporting 
in ganglia and they had to have their OSes rebuilt.  They're now 
running RHEL 6.5.

I can be on my gmetad server, run tcpdump looking for packets from host1 
and host2, and only see one.  Both host1 and host2 are running with the 
exact same gmond.conf configuration... same port.  They both appear to 
be running correctly.  But one shows more activity than the other when I 
run 'netstat -an | grep 8204' (8204 is the port they run on).  When 
I run 'telnet localhost 8204' on them both, they show me all the XML 
data that they're sending out.  Both gmond clients are also sending their 
multicast traffic across the same network.

But the server only seems to want to pick up one at a time.  In my 
gmetad.conf file, the data_source line for this port only has two 
entries... host1:8204 host2:8204 (and these hosts are the fully 
qualified domain names... on the same network that the two hosts are 
sending their multicast across on).   I can have both gmond's running 
but only one seems to generate all the tcp  connections (like you see 
via 'netstat -an | grep 8204') where the other one doesn't.  The one 
that does is the one I see on my gmetad server.
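
For context on why only one of the two hosts would see TCP connections: as the gmetad.conf comments describe it, the hosts listed on a data_source line are redundant sources for one cluster, so gmetad polls one of them and only falls back to the next if that one does not answer. The sketch below illustrates that reading and is not gmetad's actual code; poll_data_source and the fetch callable are hypothetical names.

```python
def poll_data_source(hosts, fetch):
    """Try each host in order; return (host, xml) from the first one that answers.

    `fetch` is a hypothetical callable that returns the XML for one host
    or raises OSError if the host cannot be reached.
    """
    for host in hosts:
        try:
            return host, fetch(host)
        except OSError:
            continue  # this source is down, fall back to the next one
    return None, None


# Example use:
#   poll_data_source(["host1:8204", "host2:8204"], fetch)
```

Under that reading, the polled gmond is expected to already hold the state of every other host in the cluster (learned over multicast), so what matters is whether host1 and host2 hear each other.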

On the gmetad server, I can run tcpdump on the appropriate network 
interface and look for traffic coming from my host1 and host2.  I can 
only see one at a time.  I should see both my hosts.  I make that 
assumption because I can run that same type of command on another port 
for the other hosts that are on it and get back results with lots of 
different hosts showing up, because I have lots of hosts on that 
particular port.

Here's what I'm guessing are the relevant entries from the gmond.conf 
file on my two hosts in question:

/* The host section describes attributes of the host, like the location */
host {
   location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
   #bind_hostname = yes # Highly recommended, soon to be default.
                        # This option tells gmond to use a source address
                        # that resolves to the machine's hostname.  Without
                        # this, the metrics may appear to come from any
                        # interface and the DNS names associated with
                        # those IPs will be used to create the RRDs.
   mcast_join = 239.2.11.71
   port = 8204
   ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
   mcast_join = 239.2.11.71
   port = 8204
   bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
   port = 8204
}
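
Since both hosts are supposed to carry identical settings, one sanity check is to pull the channel settings out of each gmond.conf and confirm that the send and receive channels agree on group and port. This is a simplified sketch, not a real gmond.conf parser: the helper names are made up, only /* ... */ comments and simple "key = value" lines are handled, and nested or unusual syntax is ignored.

```python
import re

BLOCK = re.compile(
    r"(udp_send_channel|udp_recv_channel|tcp_accept_channel)\s*\{(.*?)\}",
    re.DOTALL,
)
# One "key = value" setting per line; lines starting with '#' do not match.
SETTING = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def channels(conf_text):
    """Return a list of (block_name, settings_dict) for each channel block found."""
    # Drop /* ... */ comments first so braces inside them are ignored.
    stripped = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.DOTALL)
    return [(name, dict(SETTING.findall(body))) for name, body in BLOCK.findall(stripped)]


def multicast_mismatches(conf_text):
    """Return the (mcast_join, port) pairs that are sent to but not listened on."""
    send, recv = set(), set()
    for name, settings in channels(conf_text):
        pair = (settings.get("mcast_join", ""), settings.get("port", ""))
        if name == "udp_send_channel":
            send.add(pair)
        elif name == "udp_recv_channel":
            recv.add(pair)
    return sorted(send - recv)
```

An empty list from multicast_mismatches means every group and port the host sends to is also one it listens on; running channels() on both hosts' files and comparing the results covers the "exact same configuration" claim.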


Any insight would be appreciated.  :)

Thanks,
-chris

-- 
Chris Jones
SSAI - ASDC Senior Systems Administrator

Note to self: Insert cool signature here.



Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-09-04 Thread Karol Korytkowski
I'm curious as to what the correct answer would be, but..

We have a similar problem (forgive me if not, I just scanned through your
email), and some kind of solution was to use a different data_source
(@gmetad) for each such issue and give them the same cluster { name =
"" }  (@gmond).

I think this has something to do with multicast between switches, but so
far no one has looked into this..

KK


On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones christopher.r.jo...@nasa.gov
wrote:


 Here's my scenario.  I've got some systems that were happily reporting
 in ganglia and they had to have their OS'es rebuilt.  They're now
 running RHEL 6.5.

 I can be on my gmetad server, and tcpdump looking for packets from host1
 and host2 and only see one.  Both host1 and host2 are running with the
 exact same gmond.conf configuration... same port.   They both appear to
 be running correctly.  But one shows more activity than the other when I
 run a 'netstat -an | grep 8204'  (8204 is the port they run on).   When
 I run 'telnet localhost 8204' on them both, they show me all the xml
 data that they're sending out.  Both gmond clients are sending their
 multicast traffic across the same network also.

 But the server only seems to want to pick up one at a time.  In my
 gmetad.conf file, the data_source line for this port only has two
 entries... host1:8204 host2:8204 (and these hosts are the fully
 qualified domain names... on the same network that the two hosts are
 sending their multicast across on).   I can have both gmond's running
 but only one seems to generate all the tcp connections (like you see
 via 'netstat -an | grep 8204') where the other one doesn't.  The one
 that does is the one I see on my gmetad server.

 On the gmetad server, I can run tcpdump on the appropriate network
 interface and look for traffic coming from my host1 and host2.  I can
 only see one at a time.  I should see both my hosts.  I make that
 assumption because I can run that same type of command on another port
 for other hosts that are on it and get back results with lots of
 different hosts showing up, because I have lots of hosts on that
 particular port.

 Here's what I'm guessing are the relevant entries from the gmond.conf
 file on my two hosts in question:

 /* The host section describes attributes of the host, like the location */
 host {
    location = unspecified
 }

 /* Feel free to specify as many udp_send_channels as you like.  Gmond
 used to only support having a single channel */
 udp_send_channel {
    #bind_hostname = yes # Highly recommended, soon to be default.
                         # This option tells gmond to use a source address
                         # that resolves to the machine's hostname.  Without
                         # this, the metrics may appear to come from any
                         # interface and the DNS names associated with
                         # those IPs will be used to create the RRDs.
    mcast_join = 239.2.11.71
    port = 8204
    ttl = 1
 }

 /* You can specify as many udp_recv_channels as you like as well. */
 udp_recv_channel {
    mcast_join = 239.2.11.71
    port = 8204
    bind = 239.2.11.71
 }

 /* You can specify as many tcp_accept_channels as you like to share
 an xml description of the state of the cluster */
 tcp_accept_channel {
    port = 8204
 }


 Any insight would be appreciated.  :)

 Thanks,
 -chris

 --
 Chris Jones
 SSAI - ASDC Senior Systems Administrator
 
 Note to self: Insert cool signature here.





[Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-08-01 Thread Chris Jones

  
  

Here's my scenario. I've got some systems that were happily reporting
in ganglia and they had to have their OS'es rebuilt. They're now
running RHEL 6.5.

I can be on my gmetad server, and tcpdump looking for packets from host1
and host2 and only see one. Both host1 and host2 are running with the
exact same gmond.conf configuration... same port. They both appear to
be running correctly. But one shows more activity than the other when I
run a 'netstat -an | grep 8204' (8204 is the port they run on). When
I run 'telnet localhost 8204' on them both, they show me all the xml
data that they're sending out. Both gmond clients are sending their
multicast traffic across the same network also.

But the server only seems to want to pick up one at a time. In my
gmetad.conf file, the data_source line for this port only has two
entries... host1:8204 host2:8204 (and these hosts are the fully
qualified domain names... on the same network that the two hosts are
sending their multicast across on). I can have both gmond's running
but only one seems to generate all the tcp connections (like you see
via 'netstat -an | grep 8204') where the other one doesn't. The one
that does is the one I see on my gmetad server.
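
One possible explanation for the lopsided netstat output is how gmetad
treats a data_source line: the hosts listed on it are redundant sources
for the same cluster, and gmetad tries them in order, moving on to the
next only when the current one does not answer. Under that reading,
only the first reachable host would see the polling connections. The
sketch below illustrates that in-order failover. It is not gmetad's
code, and parse_sources, poll_data_source and the fetch callable are
hypothetical names.

```python
def parse_sources(spec, default_port=8649):
    """Turn 'host1:8204 host2:8204' into [('host1', 8204), ('host2', 8204)].

    8649 is the default gmond port, used when an entry has no ':port'.
    """
    hosts = []
    for item in spec.split():
        name, _, port = item.partition(":")
        hosts.append((name, int(port) if port else default_port))
    return hosts


def poll_data_source(hosts, fetch):
    """Return ((host, port), xml) from the first host that answers.

    `fetch` is any callable taking (host, port) and returning the XML
    bytes, raising OSError when the host cannot be reached.
    """
    errors = {}
    for host, port in hosts:
        try:
            return (host, port), fetch(host, port)
        except OSError as exc:
            errors[(host, port)] = exc      # remember why this one failed
    raise ConnectionError("no host answered: %r" % (errors,))
```

With 'host1:8204 host2:8204', host2 is contacted only when host1 fails,
which would match seeing TCP connections on just one of the two hosts.
It would not by itself explain a gmond that only lists itself in its
XML.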
  
  
On the gmetad server, I can run tcpdump on the appropriate network
interface and look for traffic coming from my host1 and host2. I can
only see one at a time. I should see both my hosts. I make that
assumption because I can run that same type of command on another port
for other hosts that are on it and get back results with lots of
different hosts showing up, because I have lots of hosts on that
particular port.
  
Here's what I'm guessing are the relevant entries from the gmond.conf
file on my two hosts in question:

/* The host section describes attributes of the host, like the location */
host {
   location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
used to only support having a single channel */
udp_send_channel {
   #bind_hostname = yes # Highly recommended, soon to be default.
                        # This option tells gmond to use a source address
                        # that resolves to the machine's hostname.  Without
                        # this, the metrics may appear to come from any
                        # interface and the DNS names associated with
                        # those IPs will be used to create the RRDs.
   mcast_join = 239.2.11.71
   port = 8204
   ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
   mcast_join = 239.2.11.71
   port = 8204
   bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
   port = 8204
}


Any insight would be appreciated. :)

Thanks,
-chris
  

-- 
Chris Jones
SSAI - ASDC Senior Systems Administrator

Note to self: Insert cool signature here.

  

