Being that I work at NASA, I'd rather not put entire files out there with names of hosts and ports and the like. :) My initial post included part of the gmond configs.
My datasource line in my gmetad.conf file (for this one port) is simply
something like this:

data_source "my_name" gmond_hostA:8204 gmond_hostB:8204

If there's anything else specifically, just ask and I'll give it (with
names changed to protect the innocent).

-chris

On 12/4/14, 3:15 PM, Maciej Lasyk wrote:
> Plz share your configs via pastebin
>
> Cheers,
>
> On December 4, 2014 9:06:08 PM CET, Chris Jones
> <christopher.r.jo...@nasa.gov> wrote:
>
> I'm still racking my brain with this problem I'm having. I've even run
> 'tcpdump -i any port 8204' on my gmetad server and watched the
> traffic.... when I've got two gmond clients sending out multicast
> packets on port 8204 I can see handshaking between my server and *one*
> client. The other client via the tcpdump just shows packets being sent
> out - and no replying. On the server gui, I see only the one client
> showing up.
>
> I then stop gmond on the client that's 'working' and immediately on my
> other client, the tcpdump output changes to handshaking between the
> client and server - and the server's tcpdump also then changes to show
> the new client (the old one stops). Then eventually on the server gui I
> stop seeing the old client updating (the icon for the host turns that
> block of red... 'host down') and my new client shows up like nothing
> ever happened.
>
> This makes no sense. I don't believe I've oversubscribed the number of
> gmond's on my server (around 150 maybe?). The gmetad server is running
> RHEL 6.2, and my two gmond clients are running RHEL 6.5. The strange
> thing is, it appears that only my RHEL 6.5 clients are having this
> problem..... every other gmond client is either RHEL 5.x or SuSE 11.1 or
> 11.2.
>
> I've googled this problem til I'm blue in the face, gone back through
> the last few years of the ganglia-general mailing list archives as best
> I could with keyword searches, consulted many of my system admin.
> co-workers, and even tried using unicast instead of multicast (that
> didn't make a difference either). Nothing seems to matter.
>
> There's got to be somebody out there reading this mailing list who's got
> RHEL6.5 gmond clients. Anybody? Please? :)
>
> Thanks,
> -chris
>
> On 9/4/14, 12:46 PM, Karol Korytkowski wrote:
>
> I'm curious as to what the correct answer would be, but..
>
> We have a similar problem (forgive if not, I just scanned through your
> email), and some kind of solution was to use different data_source
> (@gmetad) for each of such issues and give them same cluster { name =
> "xxxx" } (@gmond).
>
> I think this has something to do with multicasts between switches, but
> so far no one has looked into this..
>
> KK
>
> On Thu, Sep 4, 2014 at 4:59 PM, Chris Jones
> <christopher.r.jo...@nasa.gov> wrote:
>
> Here's my scenario. I've got some systems that were happily reporting
> in ganglia and they had to have their OS'es rebuilt. They're now
> running RHEL 6.5.
>
> I can be on my gmetad server, and tcpdump looking for packets from host1
> and host2 and only see one. Both host1 & host2 are running with the
> exact same gmond.conf configuration... same port. They both appear to
> be running correctly. But one shows more activity than the other when I
> run a 'netstat -an | grep 8204' (8204 is the port they run on). When
> I run 'telnet localhost 8204' on them both, they show me all the xml
> data that they're sending out. Both gmond clients are sending their
> multicast traffic across the same network also.
>
> But the server only seems to want to pick up one at a time. In my
> gmetad.conf file, the data_source line for this port only has two
> entries... host1:8204 host2:8204 (and these hosts are the fully
> qualified domain names... on the same network that the two hosts are
> sending their multicast across on).
> I can have both gmond's running but only one seems to generate all the
> tcp connections (like you see via 'netstat -an | grep 8204') where the
> other one doesn't. The one that does is the one I see on my gmetad
> server.
>
> On the gmetad server, I can run tcpdump on the appropriate network
> interface and look for traffic coming from my host1 and host2. I can
> only see one at a time. I should see both my hosts. I make that
> assumption because I can run that same type of command on another port
> for other hosts that are on it and get back results.... lots of
> different hosts showing up because I have lots of hosts on that
> particular port.
>
> Here's what I'm guessing are the relevant entries from the gmond.conf
> file on my two hosts in question:
>
> /* The host section describes attributes of the host, like the location */
> host {
>   location = "unspecified"
> }
>
> /* Feel free to specify as many udp_send_channels as you like. Gmond
>    used to only support having a single channel */
> udp_send_channel {
>   #bind_hostname = yes # Highly recommended, soon to be default.
>                        # This option tells gmond to use a source address
>                        # that resolves to the machine's hostname. Without
>                        # this, the metrics may appear to come from any
>                        # interface and the DNS names associated with
>                        # those IPs will be used to create the RRDs.
>   mcast_join = 239.2.11.71
>   port = 8204
>   ttl = 1
> }
>
> /* You can specify as many udp_recv_channels as you like as well. */
> udp_recv_channel {
>   mcast_join = 239.2.11.71
>   port = 8204
>   bind = 239.2.11.71
> }
>
> /* You can specify as many tcp_accept_channels as you like to share
>    an xml description of the state of the cluster */
> tcp_accept_channel {
>   port = 8204
> }
>
> Any insight would be appreciated. :)
>
> Thanks,
> -chris
>
> --
> Chris Jones
> SSAI - ASDC Senior Systems Administrator
>
> Note to self: Insert cool signature here.
>
> Ganglia-general mailing list
> Ganglia-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

--
Chris Jones
SSAI - ASDC Senior Systems Administrator
----------------------------------------
Note to self: Insert cool signature here.
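
For anyone wanting to script the 'telnet localhost 8204' check mentioned in the quoted message, here is a minimal sketch (not from the thread). It reads the XML that a gmond writes on its tcp_accept_channel and lists the HOST entries under each CLUSTER, so the two clients' views can be compared side by side. The helper names (fetch_gmond_xml, hosts_by_cluster, missing_hosts), the trimmed SAMPLE document, and the example.com host names are all assumptions made for illustration. The element and attribute names (GANGLIA_XML, CLUSTER, HOST, NAME) are what a gmond dump is expected to contain, so check them against real output before relying on this.

```python
import socket
import xml.etree.ElementTree as ET

# Trimmed, made-up stand-in for a gmond XML dump (real dumps carry many more
# attributes and METRIC elements). Host names and IPs are placeholders.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<GANGLIA_XML SOURCE="gmond">
<CLUSTER NAME="my_cluster">
<HOST NAME="hostB.example.com" IP="192.0.2.2"/>
<HOST NAME="hostA.example.com" IP="192.0.2.1"/>
</CLUSTER>
</GANGLIA_XML>"""


def fetch_gmond_xml(host, port=8204, timeout=5.0):
    """Read everything gmond sends on its tcp_accept_channel until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER name to the sorted HOST names listed under it."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


def missing_hosts(views):
    """views maps a queried gmond to its hosts_by_cluster() result.

    Returns, for each queried gmond, the hosts that some other gmond
    reported but this one did not.
    """
    seen = {
        name: {h for hosts in view.values() for h in hosts}
        for name, view in views.items()
    }
    everyone = set().union(*seen.values())
    return {name: sorted(everyone - hosts) for name, hosts in seen.items()}
```

Against live daemons this could be driven with something like views = {h: hosts_by_cluster(fetch_gmond_xml(h, 8204)) for h in ("gmond_hostA", "gmond_hostB")} and then missing_hosts(views). A non-empty list for one client would suggest that client is not receiving the other's multicast packets, which is worth ruling out before looking at gmetad itself.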