Hi Martin,
I guess MC stands for Multicast, and UC for unicast.
a) Each node has only one network interface:
NODE09:/home/admmarc# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
172.16.33.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
0.0.0.0         172.16.33.1     0.0.0.0         UG        0 0          0 eth0
I've solved my situation by disabling MC on the switch. This way each
MC stream becomes a broadcast and reaches the ganglia server. It's not
nice, but it works and I'm in a hurry.
In the current setup I have two different gmond.conf files, one shared
by node01-10 and another for nodegpu01: node01-10 form one cluster and
nodegpu01 is a cluster of one ^_^.
I know it's not the best way to make it work, because it floods the
whole network with useless broadcasts, but since I haven't managed to
get UC config files working I will leave the system as is. I'm
following the rule "it works, don't touch" :-)
Anyway, if someone would like to share their gmond.conf and gmetad.conf
for a single cluster on a single network working in UC mode, that
would be very nice.
Thanks to all.
Marc
On 10/25/2010 04:02 PM, Martin Knoblauch wrote:
Hi Marc,
the output of telnet seems to indicate that your "gmond"s indeed only
see their own data. Kind of strange. I have to admit that I have not
used MC configurations for quite some time. UC is so much cleaner in my
opinion. Questions:
a) how many network interfaces do the "node"s have?
b) if more than one, to which interface is the MC address bound? If not
the first, you may want to play with "mcast_if".
Output of "ifconfig -a" and "netstat -rn" would be useful.
Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
From: Joan Marc Riera <[email protected]>
To: Martin Knoblauch <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Sat, October 23, 2010 7:17:08 PM
Subject: Re: [Ganglia-general] gmetad only reads from one node of each data_source
Hi,
I have restarted everything, for sure.
These are the outputs from the telnet:
node01: http://paste.ubuntu.com/518811/
node02: http://paste.ubuntu.com/518812/
I've done the following to get some output.
On node01, launch:
(/usr/sbin/gmond --debug=10 2>&1 ) > /hpcdrive/homemarc.riera/node01.gmond.debug
This is the complete output:
http://paste.ubuntu.com/518824/
On node02, launch:
(/usr/sbin/gmond --debug=10 2>&1 ) > /hpcdrive/homemarc.riera/node02.gmond.debug
This is the complete output:
http://paste.ubuntu.com/518825/
Then restart gmetad on the ganglia server.
Ctrl-C on node01.
Ctrl-C on node02.
I've looked at both logs and still don't get what's wrong. Shame on me.
Meanwhile Ron, another user on the list, suggested that I change
something in my gmond.conf:
udp_recv_channel {
  family = inet4
  port = 8649
}
I've tried it, without success. Maybe something else should be changed.
On 10/22/2010 02:27 PM, Martin Knoblauch wrote:
Hi Marc,
on first sight, the configs for node01 and node02 look identical and
correct. Have the "gmond"s on all nodes been restarted after the
changes (just to be sure :-)? What do you get from "telnet node01
8649" and "telnet node02 8649"?
Oh, which version of gmetad/gmond are you running?
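If comparing the two telnet dumps by eye gets tedious, the same check can be sketched in Python. This is only a rough sketch: it assumes gmond answers on its TCP accept channel (port 8649, as in the telnet commands above) with an XML dump that contains one HOST element per node it knows about. The helper names read_gmond_xml and hosts_in_dump are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond writes to a TCP client (like telnet)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return the sorted NAME attributes of all HOST elements in a dump."""
    root = ET.fromstring(xml_bytes)
    return sorted(h.get("NAME") for h in root.iter("HOST"))
```

If hosts_in_dump(read_gmond_xml("node01")) returns only node01 itself, that gmond is not receiving metrics from its peers, which would match the symptom described in this thread.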
Cheers
Martin
From: Joan Marc Riera <[email protected]>
To: Martin Knoblauch <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Fri, October 22, 2010 12:51:55 PM
Subject: Re: [Ganglia-general] gmetad only reads from one node of each data_source
Sorry, I think my response has been discarded because of the
attachments. I'm sending it again with my conf files on pastebin. Sorry
to bother.
My gmond conf has only minor changes. I'm happy to share them.
I link (pastebin) to 3 files: gmond from node01, node02 and nodegpu01.
node01: http://pastebin.com/wa9mmT3h
node02: http://pastebin.com/ZtwsqnNp
nodegpu01: http://pastebin.com/3ztHULwd
As far as I remember, the only changes I made are the name and owner,
depending on the cluster group, and the udp send and recv channels,
which are different for each cluster group.
Thanks.
On 10/22/2010 12:30 PM, Martin Knoblauch wrote:
Hi Joan,
what you describe sounds fine with regard to "gmetad". "gmetad" will
only talk to one node per data_source. If that node fails and you have
more than one node listed, it will [try to] fail over to the next
available node. So far, everything is working as expected.
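The failover behaviour just described can be illustrated with a small Python sketch. It is not gmetad's actual code, only a model of "use the first node on the data_source line that answers". The function name first_reachable and its injectable connect parameter are hypothetical and exist only for this example.

```python
import socket


def first_reachable(nodes, port=8649, timeout=2.0,
                    connect=socket.create_connection):
    """Return the first node that accepts a TCP connection, or None."""
    for node in nodes:
        try:
            conn = connect((node, port), timeout=timeout)
        except OSError:
            continue  # node is down or unreachable, so try the next one
        conn.close()
        return node
    return None
```

With the nodes of a data_source line passed in order, this returns node01 while it is up and moves on to node02 once node01 stops answering, which is the pattern seen when gmond is stopped on one node after another.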
Your problem is that apparently each of node01..10 only "knows" its
own metrics. Nodes listed on the data_source line need to know the
metrics of all nodes in the respective cluster. So it is more a problem
with the configuration of your "gmond" services. Care to share the
configuration of one of the nodes?
Cheers
Martin
From: Joan Marc Riera <[email protected]>
To: [email protected]
Sent: Fri, October 22, 2010 11:50:05 AM
Subject: [Ganglia-general] gmetad only reads from one node of each data_source
Hello,
I have gmetad running with the following configuration:
r...@fbmsgga01:/var/lib/ganglia# cat /etc/ganglia/gmetad.conf |grep -v ^# |grep -v ^$
data_source "CPU cluster" node01 node02 node03 node04 node05 node06 node07 node08 node09 node10
data_source "GPU cluster" nodegpu01
gridname "FBM"
r...@fbmsgga01:/var/lib/ganglia#
All nodes and the gmetad server are on the same VLAN.
I only receive nodegpu01 and node01 info, but if I stop gmond on node01
I start receiving from node02. If I stop node02 I start receiving from
node03, and so on.
I do not understand what is happening; everything was working fine
until yesterday, when I restarted the gmetad host.
Data from nodegpu01 is being received and plotted fine.
What is going on here?
Thanks.
Marc
--
Joan Marc Riera Duocastella
Barcelona Media - Centre d'Innovació
Av. Diagonal, 177, planta 9 08018 - BARCELONA
Telèfon +34 93 238 14 00 Fax +34 93 309 31 88
www.barcelonamedia.org