Please share your configs via Pastebin.
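
Alongside the configs, it may help to compare what each gmond reports on its tcp_accept_channel, which is the same XML dump the 'telnet localhost 8204' check in the quoted message shows. Below is a rough Python sketch of that comparison, not a definitive tool. It assumes port 8204 from the quoted gmond.conf, and the function names fetch_gmond_xml and hosts_by_cluster are made up for illustration. If each gmond lists only itself, the nodes are probably not hearing each other's multicast traffic.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8204, timeout=5.0):
    """Read the XML dump gmond writes to a TCP client before closing.

    Port 8204 is an assumption taken from the quoted gmond.conf.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the peer closed the connection: dump complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER NAME to the sorted HOST NAMEs listed under it."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }
```

Calling hosts_by_cluster(fetch_gmond_xml("host1")) and the same for host2 (placeholder names from the quoted message) should give matching host lists if both gmonds see each other.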

Cheers, 

On December 4, 2014 9:06:08 PM CET, Chris Jones <christopher.r.jo...@nasa.gov> 
wrote:
>
>I'm still racking my brain with this problem I'm having.  I've even run
>'tcpdump -i any port 8204' on my gmetad server and watched the
>traffic.  When I've got two gmond clients sending out multicast
>packets on port 8204, I can see handshaking between my server and *one*
>client.  For the other client, tcpdump just shows packets being sent
>out and no replies.  On the server GUI, I see only the one client
>showing up.
>
>I then stop gmond on the client that's 'working' and immediately on my
>other client, the tcpdump output changes to handshaking between the
>client and server, and the server's tcpdump also changes to show
>the new client (the old one stops).  Eventually, on the server GUI, I
>stop seeing the old client updating (the icon for the host turns that
>block red... 'host down') and my new client shows up like nothing
>ever happened.
>
>This makes no sense.  I don't believe I've oversubscribed the number of
>gmonds on my server (around 150 maybe?).  The gmetad server is running
>RHEL 6.2, and my two gmond clients are running RHEL 6.5.  The strange
>thing is, it appears that only my RHEL 6.5 clients are having this
>problem.  Every other gmond client is either RHEL 5.x or SuSE 11.1 or
>11.2.
>
>I've googled this problem till I'm blue in the face, gone back through
>the last few years of the ganglia-general mailing list archives as best
>I could with keyword searches, consulted many of my system admin
>co-workers, and even tried using unicast instead of multicast (that
>didn't make a difference either).  Nothing seems to matter.
>
>There's got to be somebody out there reading this mailing list who's got
>RHEL 6.5 gmond clients.  Anybody?  Please?  :)
>
>Thanks,
>-chris
>
>On 9/4/14, 12:46 PM, Karol Korytkowski wrote:
>> I'm curious as to what the correct answer would be, but..
>>
>> We have a similar problem (forgive me if not, I just scanned through your
>> email), and a workaround of sorts was to use a different data_source
>> (@gmetad) for each such case and give them the same cluster { name =
>> "xxxx" }  (@gmond).
>>
>> I think this has something to do with multicast between switches, but
>> so far no one has looked into this..
>>
>> KK
>>
>>
>> On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones
>> <christopher.r.jo...@nasa.gov <mailto:christopher.r.jo...@nasa.gov>>
>> wrote:
>>
>>
>>     Here's my scenario.  I've got some systems that were happily reporting
>>     in ganglia and they had to have their OSes rebuilt.  They're now
>>     running RHEL 6.5.
>>
>>     I can be on my gmetad server, and tcpdump looking for packets from host1
>>     and host2 and only see one.  Both host1 & host2 are running with the
>>     exact same gmond.conf configuration... same port.   They both appear to
>>     be running correctly.  But one shows more activity than the other when I
>>     run a 'netstat -an | grep 8204'  (8204 is the port they run on).  When
>>     I run 'telnet localhost 8204' on them both, they show me all the xml
>>     data that they're sending out.  Both gmond clients are sending their
>>     multicast traffic across the same network also.
>>
>>     But the server only seems to want to pick up one at a time.  In my
>>     gmetad.conf file, the data_source line for this port only has two
>>     entries... host1:8204 host2:8204 (and these hosts are the fully
>>     qualified domain names... on the same network that the two hosts are
>>     sending their multicast across).   I can have both gmonds running
>>     but only one seems to generate all the tcp connections (like you see
>>     via 'netstat -an | grep 8204') where the other one doesn't.  The one
>>     that does is the one I see on my gmetad server.
>>
>>     On the gmetad server, I can run tcpdump on the appropriate network
>>     interface and look for traffic coming from my host1 and host2.  I can
>>     only see one at a time.  I should see both my hosts.  I make that
>>     assumption because I can run that same type of command on another port
>>     for other hosts that are on it and get back results.... lots of
>>     different hosts showing up because I have lots of hosts on that
>>     particular port.
>>
>>     Here's what I'm guessing are the relevant entries from the gmond.conf
>>     file on my two hosts in question:
>>
>>     /* The host section describes attributes of the host, like the
>>     location */
>>     host {
>>         location = "unspecified"
>>     }
>>
>>     /* Feel free to specify as many udp_send_channels as you like.  Gmond
>>          used to only support having a single channel */
>>     udp_send_channel {
>>         #bind_hostname = yes # Highly recommended, soon to be default.
>>                              # This option tells gmond to use a source address
>>                              # that resolves to the machine's hostname.  Without
>>                              # this, the metrics may appear to come from any
>>                              # interface and the DNS names associated with
>>                              # those IPs will be used to create the RRDs.
>>         mcast_join = 239.2.11.71
>>         port = 8204
>>         ttl = 1
>>     }
>>
>>     /* You can specify as many udp_recv_channels as you like as well. */
>>     udp_recv_channel {
>>         mcast_join = 239.2.11.71
>>         port = 8204
>>         bind = 239.2.11.71
>>     }
>>
>>     /* You can specify as many tcp_accept_channels as you like to share
>>          an xml description of the state of the cluster */
>>     tcp_accept_channel {
>>         port = 8204
>>     }
>>
>>
>>     Any insight would be appreciated.  :)
>>
>>     Thanks,
>>     -chris
>>
>>     --
>>     Chris Jones
>>     SSAI - ASDC Senior Systems Administrator
>>     ----------------------------------------
>>     Note to self: Insert cool signature here.
>>
>>     _______________________________________________
>>     Ganglia-general mailing list
>>     Ganglia-general@lists.sourceforge.net
>>     <mailto:Ganglia-general@lists.sourceforge.net>
>>     https://lists.sourceforge.net/lists/listinfo/ganglia-general
>>
>>
>
>-- 
>Chris Jones
>SSAI - ASDC Senior Systems Administrator
>----------------------------------------
>Note to self: Insert cool signature here.
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.