I like your comments. Here are some counter-questions :)
On Friday, August 30, 2002, at 02:34 PM, Steven Wagner wrote:
It seems to me this would also make the "DSO-ification" of the
monitoring core a smoother process, not to mention a cleaner one from
the standpoint of those developing the DSO's. :)
Good point.
I was thinking of "yet another hash" that has a hashed-up number based
on the name or hierarchy position of the metric as a key. The idea
being, this number is shorter than using the fully-qualified name of
the metric all the time.
So instead of encoding "cpu.idle" we encode 0x03FA450A, and that field
is 50% shorter (even better if we get to
"processes.top.1.cpu_percentage"), and only have to multicast the real
string name once. The hierarchical information is stored (as a
pointer, at the very least) in this hash.
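A short numeric ID like that could come from hashing the metric name.
As a sketch (the function name and the use of FNV-1a are my illustration,
not anything in Ganglia; the 0x03FA450A value above is likewise just an
example):

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of a metric name - one way to derive a short
 * numeric ID in place of the full string.  Purely a sketch; collisions
 * would still have to be handled, e.g. by multicasting the real string
 * name once (as described above) and resolving conflicts then. */
static uint32_t metric_id(const char *name)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        h ^= *p;
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}
```

The ID is deterministic, so every node computes the same 4-byte key from
the same name without coordination.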
What's really going to be key here is not so much the idea of making the
statically-#define'd metric hash dynamic, but keeping it up to date...
If we go far enough in this it'll look like SNMP, only more
collaborative. :)
So I am thinking that sending the fully-qualified metric name (as shown
above) is a better idea now - it handles failures more effectively. When
a node comes up it would receive metrics that look like
"host1/cpu/cache/size" (fully-qualified with all the metric's ancestors)
instead of "cache/size" (relative as I had suggested previously). This
fits in with Steven's idea of hosts being authoritative for branches
they created - each metric specifies its branches explicitly. It also
reduces reliance on an elder node for the branch hierarchy.
This way a node can easily create branches as needed for any metric it
receives.
About the "hash for storing fully qualified metric names (FQMN :)". How
would we populate such a hash? At some level, the metric must specify
its fully-qualified name, so we know where to put it. A hash value is no
good if we don't already have the name stored. How would you handle new
metrics? I think we could run-length encode the name strings to save
space if we need to, but having each metric carry its full name seems
clearer to me.
I imagine a hash_find(node, "cpu", "cache") function that takes a
variable number of arguments and locates the hash table in which to
insert a given metric (here: host1/cpu/cache/size). The 'node' argument
specifies the root of the metric tree - the node hash table for host1.
Note that each branch would get its own hash table, so that hash_foreach()
will work correctly and printing the XML will be easy.
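A variadic lookup along those lines might look like this - a sketch only,
with a toy fixed-size child array standing in for the real per-branch
hash tables (branch_t and this hash_find signature are my inventions):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the per-branch hash tables: each branch keeps a
 * NULL-terminated list of named children. */
typedef struct branch {
    const char    *name;
    struct branch *children[8];   /* real code would use a hash table */
} branch_t;

/* Walk from 'node' down the named branches; the argument list is
 * NULL-terminated.  hash_find(node, "cpu", "cache", NULL) returns the
 * table where "size" belongs, or NULL if a branch is missing. */
static branch_t *hash_find(branch_t *node, ...)
{
    va_list ap;
    va_start(ap, node);
    for (const char *name;
         node != NULL && (name = va_arg(ap, const char *)) != NULL; ) {
        branch_t *next = NULL;
        for (int i = 0; node->children[i] != NULL; i++)
            if (strcmp(node->children[i]->name, name) == 0) {
                next = node->children[i];
                break;
            }
        node = next;
    }
    va_end(ap);
    return node;
}
```

A NULL sentinel terminates the argument list, which keeps the call sites
readable without a separate count parameter.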
To make this work, we simply add a 'hash_t *branch' member to the
metric_data_t structure. If branch==NULL then we are a leaf (actual
metric), else this is a branch that points to another hash table. I can
visualize the XML output code now...
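For the record, here is roughly how I picture it - a sketch where a
NULL-terminated array again stands in for the branch's hash table, and
the XML element names are placeholders rather than Ganglia's actual
output format:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The branch-or-leaf idea: if 'branch' is NULL the entry is a real
 * metric; otherwise it points at the child entries of a branch. */
typedef struct metric_data {
    const char          *name;
    const char          *value;    /* leaf value, unused for branches */
    struct metric_data **branch;   /* NULL => leaf, else child table */
} metric_data_t;

/* Recursively render XML into 'out': leaves become <METRIC/> elements,
 * branches open a <BRANCH> element and recurse over their own table -
 * exactly the per-branch hash_foreach() pattern described above.
 * Returns the number of characters produced. */
static int print_xml(const metric_data_t *m, char *out, size_t len)
{
    if (m->branch == NULL)
        return snprintf(out, len, "<METRIC NAME=\"%s\" VAL=\"%s\"/>",
                        m->name, m->value);
    int n = snprintf(out, len, "<BRANCH NAME=\"%s\">", m->name);
    for (int i = 0; m->branch[i] != NULL; i++)
        n += print_xml(m->branch[i], out + n,
                       len > (size_t)n ? len - n : 0);
    n += snprintf(out + n, len > (size_t)n ? len - n : 0, "</BRANCH>");
    return n;
}
```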
Dense, yes, but the area of metrics is just about the only one in the
Ganglia design that *doesn't* scale well (kudos, Matt & co.). I'm sure
that we can work this out if we just keep banging those rocks
together. :)
Clever ;)
Do people like the Java-like dot notation for hierarchical names, like
"host1.cpu.cache.size", or the Unix filesystem forward-slash notation:
"host1/cpu/cache/size"? I like the slashes because it's easy to tell if
you're talking about a leaf or a branch: "host1/cpu/" is clearly a
branch, while "host1.cpu." is a little harder to read. But either way
would work.
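With the slash notation, the branch/leaf test is a one-liner (a sketch;
is_branch is not an actual API):

```c
#include <assert.h>
#include <string.h>

/* A trailing '/' marks a branch: "host1/cpu/" is a branch, while
 * "host1/cpu/cache/size" is a leaf metric. */
static int is_branch(const char *name)
{
    size_t len = strlen(name);
    return len > 0 && name[len - 1] == '/';
}
```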
I'm psyched about this change, and I am ready to dive in right after the
2.5.0 release.
-Federico
Rocks Cluster Group, Camp X-Ray, SDSC, San Diego
GPG Fingerprint: 3C5E 47E7 BDF8 C14E ED92 92BB BA86 B2E6 0390 8845