Please share your configs via pastebin.
Cheers,
On December 4, 2014 9:06:08 PM CET, Chris Jones <christopher.r.jo...@nasa.gov>
wrote:
>
>I'm still racking my brain over this problem. I've even run
>'tcpdump -i any port 8204' on my gmetad server and watched the
>traffic. When I've got two gmond clients sending out multicast
>packets on port 8204, I can see handshaking between my server and *one*
>client. For the other client, tcpdump just shows packets being sent
>out, with no replies. On the server GUI, I see only the one client
>showing up.
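
As a cross-check on the tcpdump output, here is a minimal sketch that joins the multicast group and counts datagrams per sender. It assumes the group and port from the quoted gmond.conf (239.2.11.71, 8204) and the default interface; the function names are made up for illustration. If gmond already holds the port on the machine where this runs, the bind may fail, so stopping gmond first is the safer assumption.

```python
import socket
import struct
import time
from collections import Counter

GROUP = "239.2.11.71"  # mcast_join value from the quoted gmond.conf
PORT = 8204            # port from the quoted gmond.conf


def membership_request(group, iface="0.0.0.0"):
    """Build the ip_mreq bytes (group address + local interface address)."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def tally_senders(group=GROUP, port=PORT, seconds=30.0):
    """Join the multicast group and count received datagrams per source IP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(1.0)  # wake up regularly so the deadline is honoured
    seen = Counter()
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            try:
                _data, (src_ip, _src_port) = sock.recvfrom(65535)
            except socket.timeout:
                continue
            seen[src_ip] += 1
    finally:
        sock.close()
    return seen
```

Calling `tally_senders(seconds=30)` on each host in turn and printing the result should show whether every machine actually hears both senders. If one of the RHEL 6.5 hosts only ever sees itself, the multicast path (or a local packet filter) is a more likely suspect than gmetad.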
>
>I then stop gmond on the client that's 'working', and immediately on my
>other client the tcpdump output changes to handshaking between the
>client and server. The server's tcpdump also changes to show
>the new client (the old one stops). Eventually, on the server GUI, I
>stop seeing the old client updating (the icon for the host turns that
>block of red... 'host down') and my new client shows up like nothing
>ever happened.
>
>This makes no sense. I don't believe I've oversubscribed the number of
>gmonds on my server (around 150, maybe?). The gmetad server is running
>RHEL 6.2, and my two gmond clients are running RHEL 6.5. The strange
>thing is, it appears that only my RHEL 6.5 clients are having this
>problem. Every other gmond client is either RHEL 5.x or SuSE 11.1 or
>11.2.
>
>I've googled this problem till I'm blue in the face, gone back through
>the last few years of the ganglia-general mailing list archives as best
>I could with keyword searches, consulted many of my sysadmin
>co-workers, and even tried using unicast instead of multicast (that
>didn't make a difference either). Nothing seems to matter.
>
>There's got to be somebody out there reading this mailing list who's
>got RHEL 6.5 gmond clients. Anybody? Please? :)
>
>Thanks,
>-chris
>
>On 9/4/14, 12:46 PM, Karol Korytkowski wrote:
>> I'm curious as to what the correct answer would be, but...
>>
>> We have a similar problem (forgive me if not, I just skimmed your
>> email), and a workaround of sorts was to use a different data_source
>> (@gmetad) for each such case and give them the same cluster { name =
>> "xxxx" } (@gmond).
>>
>> I think this has something to do with multicast between switches, but
>> so far no one has looked into it.
>>
>> KK
>>
>>
>> On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones
>> <christopher.r.jo...@nasa.gov <mailto:christopher.r.jo...@nasa.gov>>
>> wrote:
>>
>>
>> Here's my scenario. I've got some systems that were happily reporting
>> in ganglia, and they had to have their OSes rebuilt. They're now
>> running RHEL 6.5.
>>
>> I can be on my gmetad server, running tcpdump to look for packets from
>> host1 and host2, and only see one. Both host1 and host2 are running with
>> the exact same gmond.conf configuration... same port. They both appear
>> to be running correctly. But one shows more activity than the other when
>> I run 'netstat -an | grep 8204' (8204 is the port they run on). When
>> I run 'telnet localhost 8204' on them both, they show me all the XML
>> data that they're sending out. Both gmond clients are also sending their
>> multicast traffic across the same network.
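
The 'telnet localhost 8204' check can be scripted so the two dumps are easy to compare. The sketch below reads the XML from a gmond tcp_accept_channel and lists the HOST elements it contains. The function names are hypothetical, and it assumes the usual gmond XML layout in which each host appears as a HOST element with NAME and IP attributes.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port=8204, timeout=5.0):
    """Read the whole XML dump that gmond writes to a tcp_accept_channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # gmond closes the connection once the dump is sent
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return {host name: IP} for every HOST element in a gmond XML dump."""
    root = ET.fromstring(xml_bytes)
    return {h.get("NAME"): h.get("IP") for h in root.iter("HOST")}
```

Running `hosts_in_dump(fetch_xml("host1"))` and the same for host2 shows what each gmond believes the cluster contains. In a working multicast cluster I would expect both dumps to list both hosts; if each one lists only itself, the two gmonds are probably not receiving each other's packets.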
>>
>> But the server only seems to want to pick up one at a time. In my
>> gmetad.conf file, the data_source line for this port only has two
>> entries... host1:8204 host2:8204 (and these hosts are the fully
>> qualified domain names... on the same network that the two hosts are
>> sending their multicast across). I can have both gmonds running,
>> but only one seems to generate all the TCP connections (like you see
>> via 'netstat -an | grep 8204'); the other one doesn't. The one
>> that does is the one I see on my gmetad server.
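
One thing worth keeping in mind here: as far as I know, the hosts listed on a single data_source line are redundant sources for the same cluster, so gmetad polls one of them and only falls back to the next if that one fails. That alone would explain why only one host shows the TCP connections. The sketch below just parses a data_source line into its parts to make that structure explicit. The helper name and the cluster name "example" are made up; the defaults (15 seconds, port 8649) are the documented gmetad/gmond defaults, and IPv6 literals are not handled.

```python
import shlex

DEFAULT_PORT = 8649     # gmond's stock port, used when an entry gives none
DEFAULT_INTERVAL = 15   # seconds; gmetad's default polling interval


def parse_data_source(line):
    """Split a gmetad data_source line into (name, interval, [(host, port)])."""
    tokens = shlex.split(line)  # honours the quotes around the cluster name
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = DEFAULT_INTERVAL
    if rest[0].isdigit():       # optional polling interval before the hosts
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("data_source line lists no hosts")
    sources = []
    for item in rest:
        host, _, port = item.partition(":")
        sources.append((host, int(port) if port else DEFAULT_PORT))
    return name, interval, sources
```

For a line such as `data_source "example" host1:8204 host2:8204`, the returned list has two entries, but they are alternatives rather than two hosts to be polled in parallel. Both hosts should still appear in the GUI as long as whichever gmond answers has heard from the other.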
>>
>> On the gmetad server, I can run tcpdump on the appropriate network
>> interface and look for traffic coming from my host1 and host2. I can
>> only see one at a time. I should see both my hosts. I make that
>> assumption because I can run that same type of command on another port
>> for other hosts that are on it and get back results... lots of
>> different hosts showing up, because I have lots of hosts on that
>> particular port.
>>
>> Here's what I'm guessing are the relevant entries from the gmond.conf
>> file on my two hosts in question:
>>
>> /* The host section describes attributes of the host, like the
>>    location */
>> host {
>>   location = "unspecified"
>> }
>>
>> /* Feel free to specify as many udp_send_channels as you like. Gmond
>>    used to only support having a single channel */
>> udp_send_channel {
>>   #bind_hostname = yes # Highly recommended, soon to be default.
>>                        # This option tells gmond to use a source address
>>                        # that resolves to the machine's hostname. Without
>>                        # this, the metrics may appear to come from any
>>                        # interface and the DNS names associated with
>>                        # those IPs will be used to create the RRDs.
>>   mcast_join = 239.2.11.71
>>   port = 8204
>>   ttl = 1
>> }
>>
>> /* You can specify as many udp_recv_channels as you like as well. */
>> udp_recv_channel {
>>   mcast_join = 239.2.11.71
>>   port = 8204
>>   bind = 239.2.11.71
>> }
>>
>> /* You can specify as many tcp_accept_channels as you like to share
>>    an xml description of the state of the cluster */
>> tcp_accept_channel {
>>   port = 8204
>> }
>>
>>
>> Any insight would be appreciated. :)
>>
>> Thanks,
>> -chris
>>
>> --
>> Chris Jones
>> SSAI - ASDC Senior Systems Administrator
>> ----------------------------------------
>> Note to self: Insert cool signature here.
>>
>>
>> _______________________________________________
>> Ganglia-general mailing list
>> Ganglia-general@lists.sourceforge.net
>> <mailto:Ganglia-general@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>>
>>
>
>--
>Chris Jones
>SSAI - ASDC Senior Systems Administrator
>----------------------------------------
>Note to self: Insert cool signature here.
>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.