I'm trying to revamp an existing Ganglia setup so that it is better organized
and easier to expand in the future, but I'm running into an issue.
Here's the setup (all nodes running 3.0.7):
"webby" is the machine with a gmetad and the web frontend.
"headnode1" is the headnode of cluster #1, also with gmetad.
"n01" ... "n03" are nodes in cluster #1, on a private network, each with
gmond.
(Looking ahead, I have another cluster I'd like to add to this grid, and
possibly more.)
I set up the "gmond.conf" files on the nodes by creating a default conf file
and then editing it, setting the cluster name.
Started 'gmond' on the three nodes, and 'telnet localhost 8649' produces the
XML that I expect to see: stats from all three nodes, wrapped in a <CLUSTER>
tag with the name I set.
So far, so good.
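
For anyone who wants to repeat that telnet check in a scriptable way, here is a minimal Python sketch. It assumes gmond dumps its XML on TCP port 8649 and then closes the connection, as the telnet session suggests; the function names `fetch_xml` and `summarize` are made up for this example and are not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port=8649, timeout=5.0):
    """Read everything the daemon sends until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # empty read means the peer closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)


def summarize(xml_bytes):
    """Map each CLUSTER name to the sorted list of HOST names inside it."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }
```

Calling `summarize(fetch_xml("localhost"))` on one of the nodes should list the cluster name with its three hosts if the gmond configuration is doing what is described above.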
"headnode1" gets "gmetad" with a "data_source" line like this:
data_source "grAPHics" n01.sub.domain.edu
and one more modification from the default, adding the FQDN of "webby" in the
"trusted_hosts" section:
trusted_hosts webby.domain.edu
If I type "telnet n01 8649" from the "headnode1" command line, I see the XML
stream I expect, the same one I saw when I ran the check on "n01" itself.
I'm also seeing the collected stats in /var/lib/ganglia/rrds.
Issue #1: running "gmetad -d10" on "headnode1" gives me this message:
Going to run as user ganglia
Sources are ...
Source: [grAPHics, step 15] has 1 sources
10.0.25.1
tcp_listen() on xml_port failed: Address already in use
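
The "Address already in use" text is the standard EADDRINUSE error, which a bind or listen call returns when another socket already holds the port. A rough way to test for that condition outside of gmetad is sketched below in Python. The helper name `port_in_use` is hypothetical, and the port to pass in is an assumption: 8651 is commonly the default "xml_port", but the value in gmetad.conf is what matters.

```python
import errno
import socket


def port_in_use(port, host=""):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # any other failure (e.g. permissions) is reported as-is
    finally:
        sock.close()
    return False
```

If `port_in_use(8651)` returns True while no debug-mode gmetad is running, some other process (possibly an already-started gmetad) is holding the port.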
On to "webby," which I'm *hoping* will serve up the graphed statistics for not
only this cluster but other clusters as well.
Its "gmetad.conf" file has a similar "data_source" line:
data_source "grAPHics" headnode1.domain.edu
The web frontend "conf.php" has the "$ganglia_ip" and "$ganglia_port" values
set, but that's definitely not right.
Here, we have bigger problems / issues:
- running "gmetad -d10" yields the following:
Going to run as user ganglia
Sources are ...
Source: [grAPHics, step 15] has 1 sources
aaa.bbb.ccc.ddd <-- the correct IP address for "headnode1"
tcp_listen() on xml_port failed: Address already in use
- can't telnet to "headnode1" on 8649 / 8650 / 8651 to retrieve XML
- /var/log/messages contains lots of these:
<date> <time> webby /usr/sbin/gmetad[29411]: data_thread() got no answer
from any [grAPHics] datasource
- the web frontend, such as it is, displays no graphs.
<http://fusion.cs.washington.edu/ganglia/?m=&r=hour&s=descending&hc=4>
('fusion.cs' is "webby")
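
The failed telnet attempts in the list above can be reproduced with a short Python sketch that tries a TCP connect to each port and records the error. The helper name `probe` and the timeout value are assumptions for illustration; the host name and ports are the ones mentioned above.

```python
import socket


def probe(host, ports, timeout=3.0):
    """Return {port: None if the TCP connect succeeded, else the error text}."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = None
        except OSError as exc:  # covers refused, timed out, and DNS failures
            results[port] = str(exc)
    return results
```

Run from "webby" as `probe("headnode1.domain.edu", [8649, 8650, 8651])`. A "connection refused" error usually means nothing is listening on that port, while a timeout more often points at a firewall or routing problem between the two machines.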
So.
I've got questions:
- do I need to run a webserver on 'headnode1' so that 'webby' can retrieve
XML from it?
- do I have the 'gmond.conf' files set up correctly on the nodes?
- do I have the 'gmetad.conf' files set up correctly on the head node and on
the web frontend?
- how do the 'gmetad' processes communicate with each other?
Can someone please help me?
Thank you.
--
Stephen Spencer | [email protected] | 206-616-3281
Graphics System Engineer, UW Computer Science Department
Chair, ACM SIGGRAPH Publications Committee