[Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-09-04 Thread Chris Jones

  
  

Here's my scenario. I've got some systems that were happily reporting in
ganglia and they had to have their OSes rebuilt. They're now running
RHEL 6.5.
  
From my gmetad server, I can run tcpdump looking for packets from host1
and host2, and I only see one of them. Both host1 and host2 are running
with the exact same gmond.conf configuration, including the same port.
They both appear to be running correctly, but one shows more activity
than the other when I run 'netstat -an | grep 8204' (8204 is the port
they run on). When I run 'telnet localhost 8204' on each of them, they
show me all the XML data that they're sending out. Both gmond clients
are also sending their multicast traffic across the same network.
  
But the server only seems to want to pick up one at a time. In my
gmetad.conf file, the data_source line for this port has only two
entries: host1:8204 host2:8204 (given as fully qualified domain names,
on the same network that the two hosts send their multicast across). I
can have both gmonds running, but only one seems to generate all the
TCP connections (as seen via 'netstat -an | grep 8204'); the other one
doesn't. The one that does is the one I see on my gmetad server.
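One way to compare what each gmond is actually serving is to script the
'telnet localhost 8204' check and list the hosts each XML dump contains.
The sketch below is only an illustration: the function names are made up
for this example, and it assumes the dump uses the usual CLUSTER and HOST
elements with NAME attributes. If multicast is passing between the two
hosts, each gmond's dump would be expected to list both of them. It may
also be worth knowing that gmetad, as far as I understand it, treats
several hosts on one data_source line as redundant sources and polls only
one of them at a time, which would be consistent with the lopsided netstat
output.

```python
import socket
import sys
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port, timeout=5.0):
    """Read the XML dump that a gmond tcp_accept_channel sends on connect."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # the peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return {cluster name: sorted host names} found in an XML dump."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


if __name__ == "__main__":
    # Hosts to query are taken from the command line; port 8204 matches
    # the tcp_accept_channel in the gmond.conf quoted in this message.
    for target in sys.argv[1:]:
        try:
            print(target, hosts_in_dump(fetch_gmond_xml(target, 8204)))
        except (OSError, ET.ParseError) as exc:
            print(target, "failed:", exc)
```

Saved under a name of your choosing and run with the two FQDNs as
arguments, it prints one line per gmond. A dump that lists only its own
host would suggest that gmond is not receiving the other's multicast.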
  
  
On the gmetad server, I can run tcpdump on the appropriate network
interface and look for traffic coming from host1 and host2, but I only
see one at a time. I should see both hosts. I make that assumption
because when I run the same type of command on another port, I get back
lots of different hosts, since I have lots of hosts on that particular
port.
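As a cross-check on the tcpdump observation, a small listener can join
the multicast group itself and count packets per source address. This is
a rough sketch, not part of Ganglia: the function names are invented
here, and the group and port are taken from the gmond.conf in this
message. With ttl = 1 on the send channel the packets should stay on the
local subnet, so it only makes sense to run this on a machine attached to
that subnet. Binding may fail if another process already holds the port.

```python
import socket
import struct
import sys
import time
from collections import Counter


def open_mcast_socket(group, port, iface_ip="0.0.0.0"):
    """Open a UDP socket joined to `group`; iface_ip selects the interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # struct ip_mreq: multicast group address followed by local interface
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def tally_sources(sock, seconds):
    """Count datagrams per source IP received on `sock` within `seconds`."""
    counts = Counter()
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            _payload, (src_ip, _src_port) = sock.recvfrom(65535)
        except socket.timeout:
            break
        counts[src_ip] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the number of seconds to listen, e.g. 30.
    listener = open_mcast_socket("239.2.11.71", 8204)
    for ip, n in tally_sources(listener, float(sys.argv[1])).most_common():
        print(ip, n)
```

If only one source address shows up over a window longer than the gmond
send interval, the other host's multicast is probably not reaching this
machine at all, which points at the network rather than at gmetad.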
  
Here's what I'm guessing are the relevant entries from the gmond.conf
file on my two hosts in question:
  
/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname.  Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 239.2.11.71
  port = 8204
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8204
  bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8204
}
  
  
Any insight would be appreciated. :)

Thanks,
-chris
-- 
Chris Jones
SSAI - ASDC Senior Systems Administrator

Note to self: Insert cool signature here.

  


--
Slashdot TV.
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


Re: [Ganglia-general] gmond's on same multicast port won't communicate at same time

2014-09-04 Thread Karol Korytkowski
I'm curious as to what the correct answer would be, but...

We have a similar problem (forgive me if not, I only skimmed your
email), and a workaround of sorts was to use a separate data_source
(in gmetad) for each such case and give them the same cluster { name =
 }  (in gmond).

I think this has something to do with multicast between switches, but so
far no one has looked into this.

KK


On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones christopher.r.jo...@nasa.gov
wrote:


 [...]
