Hi Martin,

I guess MC stands for Multicast, and UC for unicast.

a) Each node has only one network interface:

NODE09:/home/admmarc# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
172.16.33.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
0.0.0.0         172.16.33.1     0.0.0.0         UG        0 0          0 eth0


I've worked around my problem by disabling MC on the switch; this way each MC stream becomes broadcast and reaches the ganglia server. It's not nice, but it works and I'm in a hurry.

Currently I have two different gmond.conf files, one for node01-10 and another for nodegpu01, with node01-10 forming one cluster and nodegpu01 a cluster of one ^_^.

I know it's not the best way to make it work because it floods the whole network with useless broadcasts, but since I haven't managed to set up UC config files I will leave the system as is. I'm following the rule "it works, don't touch" :-)

Anyway, if someone would like to share their gmond.conf and gmetad.conf for a single cluster on a single network working in UC mode, that would be very nice.


Thanks to all.


Marc








On 10/25/2010 04:02 PM, Martin Knoblauch wrote:
Hi Marc,

 the output of telnet seems to indicate that your "gmond"s indeed only see their own data. Kind of strange. I have to admit that I have not used MC configurations for quite some time. UC is so much cleaner in my opinion. Questions:

a) how many network interfaces do the "node"s have?
b) if more than one, to which interface is the MC address bound? If not the first, you may want to play with "mcast_if".

 Output of "ifconfig -a" and "netstat -rn" would be useful.
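
To see whether MC traffic actually arrives on a given interface, one can join the group by hand and list the senders. A minimal sketch, assuming the stock gmond multicast group 239.2.11.71 and port 8649 (neither is confirmed in this thread, so check your own gmond.conf); `build_mreq` and `listen_for_senders` are illustrative names:

```python
import socket
import struct
import time

# Assumptions, not taken from the thread: the stock gmond.conf multicast
# group and port. Replace them with the values from your own config.
DEFAULT_GROUP = "239.2.11.71"
DEFAULT_PORT = 8649


def build_mreq(group, iface_ip="0.0.0.0"):
    """Pack an ip_mreq: group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))


def listen_for_senders(group=DEFAULT_GROUP, port=DEFAULT_PORT,
                       iface_ip="0.0.0.0", seconds=30.0):
    """Join the group on one interface and return the sender IPs seen."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_ip))
    seen = set()
    deadline = time.monotonic() + seconds
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                _, (sender, _port) = sock.recvfrom(65535)
            except socket.timeout:
                break
            seen.add(sender)
    finally:
        sock.close()
    return sorted(seen)
```

Calling `listen_for_senders(iface_ip=...)` with the address of eth0 on one node should list every other node if multicast is delivered. The bind may fail while gmond holds the port, so it is probably easiest to stop gmond on that one node for the test.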
 
Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de


From: Joan Marc Riera <[email protected]>
To: Martin Knoblauch <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Sat, October 23, 2010 7:17:08 PM
Subject: Re: [Ganglia-general] gmetad only reads from one node of each data_source

Hi,

I have restarted all, for sure.

These are the outputs from telnet:
node01:  http://paste.ubuntu.com/518811/
node02: http://paste.ubuntu.com/518812/


I've done the following to get some output:
On node01, launch: (/usr/sbin/gmond --debug=10 2>&1 ) > /hpcdrive/homemarc.riera/node01.gmond.debug
This is the complete output: http://paste.ubuntu.com/518824/
On node02, launch: (/usr/sbin/gmond --debug=10 2>&1 ) > /hpcdrive/homemarc.riera/node02.gmond.debug
This is the complete output: http://paste.ubuntu.com/518825/
Restart gmetad on the ganglia server.
Ctrl-C on node01.
Ctrl-C on node02.







I've looked at both logs and still don't get what's wrong. Shame on me.

Meanwhile, Ron, another user on the list, suggested that I change the following in my gmond.conf:
udp_recv_channel { 
  family = inet4
  port = 8649 
}

I've tried it, without success. Maybe something else should be changed.





    




On 10/22/2010 02:27 PM, Martin Knoblauch wrote:
Hi Marc,

 At first sight, the configs for node01 and node02 look identical and correct. Have the "gmond"s on all nodes been restarted after the changes (just to be sure :-)? What do you get from: "telnet node01 8649" and "telnet node02 8649"?

 Oh, which version of gmetad/gmond are you running?
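
The telnet check can also be scripted. A minimal sketch, assuming gmond answers a TCP connection on port 8649 with its XML dump and then closes the socket, and that hosts appear as HOST elements with a NAME attribute; the function names are illustrative:

```python
import socket
import xml.etree.ElementTree as ET
from typing import List


def fetch_gmond_xml(host: str, port: int = 8649, timeout: float = 5.0) -> bytes:
    """Read everything gmond sends on its TCP port until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes: bytes) -> List[str]:
    """Return the NAME of every HOST element found in a gmond XML dump."""
    root = ET.fromstring(xml_bytes)
    return sorted(h.get("NAME", "?") for h in root.iter("HOST"))
```

Running `hosts_in_dump(fetch_gmond_xml("node01"))` and the same for node02 should give the full node list on both. If each one only lists itself, the gmonds are not hearing each other.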

Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de


From: Joan Marc Riera <[email protected]>
To: Martin Knoblauch <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Fri, October 22, 2010 12:51:55 PM
Subject: Re: [Ganglia-general] gmetad only reads from one node of each data_source

Sorry, I think my response was discarded because of the attachments. I'm sending it again with my conf files on pastebin. Sorry to bother you.

My gmond conf files have only minor changes. I'm happy to share them.

Here are pastebin links to 3 files: gmond.conf from node01, node02 and nodegpu01.
node01: http://pastebin.com/wa9mmT3h
node02: http://pastebin.com/ZtwsqnNp
nodegpu01: http://pastebin.com/3ztHULwd


As far as I remember, the only changes I made are the name and owner, depending on the Cluster group, and the udp send and recv channels, which are different for each Cluster group.


Thanks.

On 10/22/2010 12:30 PM, Martin Knoblauch wrote:
Hi Joan,

 What you describe sounds fine with regard to "gmetad". "gmetad" will only talk to one node per data_source. If that node fails and you have more than one node listed, it will [try to] fail over to the next available node. So far, everything is working as expected.

 Your problem is that apparently each of node01..10 only "knows" its own metrics. Nodes listed on the data_source line need to know the metrics of all nodes in the respective cluster. So it is more a problem with the configuration of your "gmond" services. Care to share the configuration of one of the nodes?
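
To make the two points above concrete, here is a small simulation. It is only an illustration of the behaviour as described, not gmetad's actual code; `poll_data_source` and the `healthy`/`isolated` tables are made-up names:

```python
from typing import Dict, Iterable, List, Optional, Set


def poll_data_source(hosts: Iterable[str], running: Set[str],
                     known: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the hosts reported by the first listed node that is running.

    One node per data_source is asked; the next one is tried only when the
    previous one does not answer.
    """
    for node in hosts:
        if node in running:
            return sorted(known[node])
    return None  # no listed node answered


nodes = ["node%02d" % i for i in range(1, 11)]

# Healthy cluster: every gmond has heard every other gmond.
healthy = {n: set(nodes) for n in nodes}
# gmonds that do not hear each other: each one only knows itself.
isolated = {n: {n} for n in nodes}

print(poll_data_source(nodes, set(nodes), healthy))                # all ten nodes
print(poll_data_source(nodes, set(nodes), isolated))               # ['node01']
print(poll_data_source(nodes, set(nodes) - {"node01"}, isolated))  # ['node02']
```

The last two lines reproduce the symptom: only node01 shows up, and stopping it makes node02 appear instead.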

Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de


From: Joan Marc Riera <[email protected]>
To: [email protected]
Sent: Fri, October 22, 2010 11:50:05 AM
Subject: [Ganglia-general] gmetad only reads from one node of each data_source

Hello,

I have gmetad running with the following conf:
r...@fbmsgga01:/var/lib/ganglia# cat /etc/ganglia/gmetad.conf |grep -v ^# |grep -v ^$
data_source "CPU cluster" node01 node02 node03 node04 node05 node06 node07 node08 node09 node10
data_source "GPU cluster" nodegpu01
 gridname "FBM"
r...@fbmsgga01:/var/lib/ganglia#

    

All nodes and the gmetad server are on the same VLAN.

I only receive nodegpu01 and node01 info, but if I stop gmond on node01 I start receiving from node02. If I stop node02 I start receiving from node03, and so on.

I do not understand what is happening; everything was working fine until yesterday, when I restarted the gmetad host.

Data from nodegpu01 is being received and plotted fine.


What is going on here?


Thanks.

Marc

--
Fundació Barcelona Media
Joan Marc Riera Duocastella
Barcelona Media - Centre d'Innovació
Av. Diagonal, 177, planta 9 08018 - BARCELONA
Telèfon +34 93 238 14 00 Fax +34 93 309 31 88
www.barcelonamedia.org
