i was just thinking of this as i drove into work today. this is the problem. if you look at your xml, you will see that there is no <CLUSTER> tag. that is intentional but a problem for the current gmetad.

if you add the following to your gmond.conf, you will see the problem disappear.

cluster {
   name = "my cluster"
}

the 2.6.0 gmond will not wrap all the <HOST> tags inside of a <CLUSTER> tag unless you tell it to (with the directive above). the reason i'm doing that is that it gives you much more flexibility... unfortunately, gmetad currently expects and requires a <CLUSTER> tag.

i'll add the cluster attribute as part of the default and update gmetad to handle xml without a cluster tag.
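
to make the two xml layouts concrete, here is a minimal python sketch (not gmetad code) of a reader that accepts <HOST> tags either wrapped in a <CLUSTER> tag or sitting directly under the root. the function name iter_hosts, the "unspecified" fallback name and the two tiny sample documents are assumptions made up for illustration; real gmond output carries many more attributes and metric elements.

```python
import xml.etree.ElementTree as ET


def iter_hosts(xml_text, default_cluster="unspecified"):
    """Yield (cluster_name, host_name) pairs for wrapped and unwrapped HOST tags."""
    root = ET.fromstring(xml_text)
    # Case 1: HOST elements wrapped in a CLUSTER element (at any depth).
    for cluster in root.iter("CLUSTER"):
        for host in cluster.findall("HOST"):
            yield cluster.get("NAME", default_cluster), host.get("NAME")
    # Case 2: HOST elements directly under a non-CLUSTER root (no wrapper).
    if root.tag != "CLUSTER":
        for host in root.findall("HOST"):
            yield default_cluster, host.get("NAME")


# Simplified, hypothetical sample documents (not real gmond output).
WRAPPED = (
    '<GANGLIA_XML><CLUSTER NAME="my cluster">'
    '<HOST NAME="node1"/></CLUSTER></GANGLIA_XML>'
)
BARE = '<GANGLIA_XML><HOST NAME="node1"/></GANGLIA_XML>'

if __name__ == "__main__":
    print(list(iter_hosts(WRAPPED)))
    print(list(iter_hosts(BARE)))
```

a reader written this way does not care whether the cluster directive is present in gmond.conf; hosts without a wrapper simply land under the fallback name.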

as far as the gmetric side of things... it's not implemented yet. the 2.6.0 gmetric will read your gmond.conf and send data to all udp_send_channels specified.
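
since that part is not written yet, the following is only a rough python sketch of the fan-out idea: take one already-encoded message and send it as a udp datagram to every configured channel. the function send_to_channels and the (host, port) list are hypothetical; parsing gmond.conf and the real message encoding are deliberately left out.

```python
import socket


def send_to_channels(payload, channels):
    """Send one UDP datagram to each (host, port) pair; return the channel count."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for host, port in channels:
            # payload is treated as opaque bytes; the wire format is not modeled here.
            sock.sendto(payload, (host, port))
            sent += 1
    return sent
```

a caller would pass something like send_to_channels(message_bytes, [("127.0.0.1", 8649)]), where the address and port are placeholders for whatever the configuration lists.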

-matt

Joshua Durham wrote:
I was just playing with this snapshot.. Is gmetric supposed to be working with this? I had some issues.. Here's the config:
/usr/local/ganglia jdurham$ grep -v "^#" etc/gmetad.conf

data_source "Test Cluster" 60 localhost

server_threads 10
rrd_rootdir "/usr/local/ganglia/rrds"

And when run:
/usr/local/ganglia jdurham$ sudo sbin/gmetad -c etc/gmetad.conf -d 10
Going to run as user nobody
Sources are ...
Source: [Test Cluster, step 60] has 1 sources
        127.0.0.1
xml listening on port 8651
interactive xml listening on port 8652
cleanup thread has been started
Data thread 25186304 is monitoring [Test Cluster] data source
        127.0.0.1
[Test Cluster] is an OLD version
Bus error



On Jan 19, 2005, at 12:36 PM, Matt Massie wrote:


i just uploaded a new snapshot of ganglia 2.6.0 to
http://matt-massie.com/ganglia/ganglia-2.6.0.200501191706.tar.gz

the only things left to do are
   - process gmetric messages
   - cleanup old hosts and metrics

i compiled and installed this snapshot on linux and windows (and each
host exchanged information via unicast udp).

i ran a test of this snapshot over night last night by starting gmond
with a full metric configuration and made ~13 million requests for data
on the xml port.  there were no xml errors.  no memory leaks (actually
2.6.0 will use considerably less memory than 2.5.x) and gmond only used
about 0.07% cpu to handle the requests.

i will update the documentation for gmond.conf soon (i didn't have time
today).  if you want to see the default gmond configuration for your
particular platform ... just run...

% ./gmond -t

with 2.6.0. every aspect of data collection and message sending is
tweakable.

if you want to see a list of all the metrics supported by gmond run...

% ./gmond -m
load_one
mem_total
os_release
proc_run
load_five
gexec
...
cpu_num
cpu_speed
pkts_out
swap_free

i also added a new feature that was very simple to add but i think you
might find useful.  to see the total minimum bandwidth that a specific
configuration will generate run

% ./gmond --conf ./test.conf -b
7.545789 bytes/sec

it would be pretty easy to make this more elaborate in the future (such
as building an algorithm for handling the value_thresholds as well).
currently, the value is just a summation of all the metric message sizes
divided by the time_threshold (i assume any string metrics are maximum
size).
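
as a rough illustration of that arithmetic (a python sketch, not the code behind -b), the estimate below assumes each metric has its own message size in bytes and its own time_threshold in seconds, and adds up size / time_threshold over all metrics. the function name min_bandwidth and the sample numbers are made up.

```python
def min_bandwidth(metrics):
    """Return the summed bytes/sec for (message_size_bytes, time_threshold_seconds) pairs."""
    total = 0.0
    for size, threshold in metrics:
        if threshold <= 0:
            raise ValueError("time_threshold must be positive")
        # Each metric is sent at least once per time_threshold seconds.
        total += size / threshold
    return total


if __name__ == "__main__":
    # Hypothetical metrics: 20 bytes every 10 s and 60 bytes every 120 s.
    print(min_bandwidth([(20, 10), (60, 120)]))  # 2.0 + 0.5 = 2.5 bytes/sec
```

value_threshold is ignored here, so the result is a floor on traffic rather than a prediction.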

a future feature that would be nice is to have a fixed bandwidth
restriction.

i also added a patch for linux that was submitted to
bugzilla.ganglia.info by Marcelo Matus that "gmond needs a small
modification to treat GFS as another network file system, ie, such as
NFS, SAMBA, etc.:"

-------

about the code for the new gmond.  i want to explain how things have
been simplified (i hope).

at line 1162 of gmond.c you'll see the setup_metric_callbacks()
function.  i've added the registration of all metrics here.  it's
important to note that registering the metric here doesn't mean it is
collected but rather that it can be collected if the user asks for it in
the configuration file.

if you look inside ./gmond/conf.c you'll find the function
build_default_gmond_configuration() which builds the default
configuration for gmond based on the platform.  this is the
configuration that will be used if no configuration file is specified
(with the --conf flag).  only the metrics supported on a platform are
added to the default configuration.

last but not least, if you look in ./lib/protocol.x you'll see the
function for sending and receiving the metrics.  if you register a
metric for collection in setup_metric_callbacks() which isn't in
protocol.x you'll get an error message that "gmond doesn't know how to
send metric 'foo'".  the ./lib/protocol.x is NOT platform specific.
even if a metric is not implemented on a specific platform, it can still
be stored and reported... SOOOO... a linux box that received a solaris
"wcache" metric will not be confused at all by the message.  mixing
hosts from different platforms will not cause us any more problems.
mixing 2.5.x and 2.6.x is okay on all platforms but solaris (which will
be 80% right).
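
the split between "registered", "configured" and "sendable" can be sketched in a few lines of python. this is an illustration of the idea only, not the gmond.c code: the class MetricRegistry, its methods and the sample values are hypothetical, and the set of sendable names stands in for what protocol.x defines.

```python
class MetricRegistry:
    """Toy model: registering a metric makes it available, config decides collection."""

    def __init__(self, sendable):
        self.sendable = set(sendable)  # names the protocol layer can encode
        self.callbacks = {}            # name -> function returning the value

    def register(self, name, callback):
        # Registering a metric the protocol cannot send is an error.
        if name not in self.sendable:
            raise LookupError(f"gmond doesn't know how to send metric '{name}'")
        self.callbacks[name] = callback

    def collect(self, configured):
        # Only metrics that are both registered and named in the config are run.
        return {
            name: self.callbacks[name]()
            for name in configured
            if name in self.callbacks
        }


if __name__ == "__main__":
    registry = MetricRegistry(sendable={"load_one", "cpu_num", "wcache"})
    registry.register("load_one", lambda: 0.5)  # placeholder values
    registry.register("cpu_num", lambda: 4)
    print(registry.collect(["cpu_num"]))        # load_one is registered but not collected
```

note that "wcache" is sendable in the example even though nothing registers it, which mirrors the point that the protocol is not platform specific.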

-matt
--
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'

    They that can give up essential liberty to obtain a little
       temporary safety deserve neither liberty nor safety.
   --Benjamin Franklin, Historical Review of Pennsylvania, 1759




_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

