matt massie wrote:
Today, Steven Wagner wrote forth saying...


So, looking at this here example plugin (and writing one of my own, the aforementioned universal uname plugin), I'm wondering if it's part of The Plan for the plugin to determine when to publish its metrics. I notice that g3_job_t has a reference to a collect function *and* a publish function, but I don't remember seeing any place in the scheduler where it decides when to *publish* metrics/jobs.


the "collect" function has the responsibility of calling the "publish" function whenever it deems necessary. if you look inside ./plugins/test-plugin.c you'll see how the "collect" function checks whether it should "publish" the metric. i separated the "collect" side of things from the "publish" side of things for two reasons: 1. both functions operate on the same data (job->data), and 2. i needed a way for a job to specify what to do when data is "pulled". when a client pulls data from a host, the queue is walked (with read locking) and the "publish" function is called on each queue item (job). the "publish" function takes an arg which will most likely be a handle to a multicast channel, database, client socket, etc.

make sense?

Well, from a plugin developer's point of view it's probably good to know exactly when the publish function's being called. So I was trying to determine if it was *ONLY* being called from within collect() or whether it could be called by a front-end under other circumstances (a multicast threshold being reached, for example).

I've already torn test-plugin asunder, it's how that fits into the rest of things that I'm wondering about now (I thought that it would be helpful for me to approach this from the standpoint of a plug-in developer and offer suggestions from that perspective).

btw, there is a rwlock on the queue (for queue inserts, extracts, etc). and each job has a rwlock (to protect the data). this makes it possible to have multiple backends and frontends operate on the same queue without tripping over each other too much.

Meaning non-reentrant function calls from within collect() or publish() don't have to have any additional locking? ... neat.

[seriously, I would again like to suggest what has probably already been considered, which is to make an XML_import plug-in and a RRD front-end ... which would obviate the need for a separate/revised gmetad and hopefully cut down on configuration questions ... ]

[ that also reminds me, has any consideration been given to end-user configuration of a plug-in? is there any API-level way of passing run-time configuration directives to a plug-in, or invoking multiple instances of a plug-in with different configurations? or is this left to the individual plug-in? ]

(since it seems one job may handle multiple metrics, we might want to start making that differentiation...)


yep. one job can handle multiple metrics (e.g. 1,5,15-min load). the data for the job (job->data) in this case will be an array of values and when "publish" is called it will likely publish all metrics in the array.

Gee Davey, guess I should have thought of that. :)



Speaking of which, I am having some trouble writing a plugin that does exactly that, because I am a chimp.


yeh. and you have so much rich documentation to work with.. i don't see why you can't get it. :P

Nah, I got it to work with one metric right away. Getting it to work with three was tough, see above.

the position in the metric tree will be determined by how the "publish" function publishes the data. it will format the message so that the listening module (not built yet) will know how to interpret the message. i've thought about having a "serialize" and "deserialize" function in the job structure (serialize would replace "publish"). actually .. i think that's the right way to do it. serialize will take the job->data structure and format it for the wire. deserialize will take a wire message and convert it into a job->data structure.

How is it the listening function's responsibility to classify a plugin? It seems more reasonable to do it with some sort of combination of plugin metadata and some fields in the job structure that are set when the plugin initializes - determining its instance, the number of metrics it's polling and what it'll be naming them, etc.

Concrete example time:

You're writing a df-type plugin called "storage" (maybe it runs df and parses the output, maybe it's fancy. We don't care!), and want to support several types of data per partition. The concept should support something as complicated as this:

<HOST NAME="zim">
    <CLASS NAME="storage">
        <GROUP NAME="hda0">
            <METRIC NAME="storage-avail" VALUE="..." UNITS="..."/>
            <METRIC NAME="storage-total" VALUE="..." ... />
            <METRIC NAME="percent-used" VALUE="..." ... />
        </GROUP>
        <GROUP NAME="hda1">
            ...
        </GROUP>
        [other groups]
    </CLASS>
    [other classes]
</HOST>

Between allowing multiple (differently-configured) plugin instances and a two-dimensional metric space per publish(), people should be able to get the results they need in most custom cases.

i think so too.  we're close.

Well, I'm just talking about a uname plugin. :)

