matt massie wrote:
Today, Steven Wagner wrote forth saying...
So, looking at this here example plugin (and writing one of my own, the
aforementioned universal uname plugin), I'm wondering if it's part of The
Plan for the plugin to determine when to publish its metrics. I notice
that g3_job_t has a reference to a collect function *and* a publish
function, but I don't remember seeing anyplace in the scheduler where it
decides on when to *publish* metrics/jobs.
the "collect" function has the responsibility of calling the "publish"
function whenever it deems necessary. if you look inside
./plugins/test-plugin.c you'll see how the "collect" function checks to see
if it should "publish" the metric.
i separated the "collect" side of things from the "publish" side of things
for two reasons: 1. both functions operate on the same data (job->data)
and 2. i needed a way for a job to specify what to do when data is
"pulled". when a client pulls data from a host, the queue is walked (with
read locking) and the "publish" function is called on each queue item
(job). the "publish" function takes an arg which most likely will be a
handle to a multicast channel, database, client socket, etc.
make sense?
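a rough sketch of that collect/publish split, with hypothetical field and
function names (the real g3_job_t surely differs; this only shows the shape
being described):

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical sketch of a job -- field names are illustrative, not the
 * real g3_job_t. collect() owns the decision of when to publish. */
typedef struct g3_job {
    void  *data;                /* metric data shared by collect and publish */
    time_t last_published;
    int    publish_interval;    /* seconds between publishes */
    int  (*collect)(struct g3_job *job);
    int  (*publish)(struct g3_job *job, void *channel); /* channel: mcast, db, socket... */
} g3_job_t;

/* the timing check collect() would make before calling publish() */
static int should_publish(time_t now, time_t last, int interval)
{
    return (now - last) >= interval;
}

static int my_collect(g3_job_t *job)
{
    /* ... refresh job->data here ... */
    time_t now = time(NULL);
    if (should_publish(now, job->last_published, job->publish_interval)) {
        job->last_published = now;
        return job->publish(job, NULL); /* NULL: whatever the default channel is */
    }
    return 0;
}
```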
Well, from a plugin developer's point of view it's probably good to know
exactly when the publish function's being called. So I was trying to
determine if it was *ONLY* being called from within the collect() or
whether it could be called by a front-end under other circumstances (a
multicast threshold being reached, for example).
I've already torn test-plugin asunder, it's how that fits into the rest of
things that I'm wondering about now (I thought that it would be helpful for
me to approach this from the standpoint of a plug-in developer and offer
suggestions from that perspective).
btw, there is a rwlock on the queue (for queue inserts, extracts, etc).
and each job has a rwlock (to protect the data). this makes it possible
to have multiple backends and frontends operate on the same queue without
tripping over each other too much.
Meaning non-reentrant function calls from within collect() or publish()
don't need any additional locking? ... neat.
[seriously, I would again like to suggest what has probably already been
considered, which is to make an XML_import plug-in and a RRD front-end ...
which would obviate the need for a separate/revised gmetad and hopefully
cut down on configuration questions ... ]
[ that also reminds me, has any consideration been given to end-user
configuration of a plug-in? is there any API-level way of passing run-time
configuration directives to a plug-in, or invoking multiple instances of a
plug-in with different configurations? or is this left to the individual
plug-in? ]
(since it seems one job may handle multiple metrics, we might want to start
making that differentiation...)
yep. one job can handle multiple metrics (e.g. 1,5,15-min load). the
data for the job (job->data) in this case will be an array of values and
when "publish" is called it will likely publish all metrics in the array.
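a sketch of what job->data might look like for that load case (all names
here are made up for illustration):

```c
#include <assert.h>

/* Hypothetical multi-metric job data: one job carrying three load metrics. */
typedef struct {
    const char *name;           /* e.g. "load_one" */
    double      value;
} metric_t;

typedef struct {
    metric_t metrics[3];
    int      nmetrics;
} load_data_t;

/* a publish() for this job would walk the array and emit every entry */
static int publish_all(const load_data_t *d)
{
    int sent = 0;
    for (int i = 0; i < d->nmetrics; i++) {
        /* format and send d->metrics[i] on the wire here */
        sent++;
    }
    return sent;
}

static int demo_load_publish(void)
{
    load_data_t d = {
        { { "load_one", 0.10 }, { "load_five", 0.20 }, { "load_fifteen", 0.30 } },
        3
    };
    return publish_all(&d);
}
```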
Gee Davey, guess I should have thought of that. :)
Speaking of which, I am having some trouble writing a plugin that does
exactly that, because I am a chimp.
yeh. and you have so much rich documentation to work with.. i don't see
why you can't get it. :P
Nah, I got it to work with one metric right away. Getting it to work with
three was tough, see above.
the position in the metric tree will be determined by how the "publish"
function publishes the data. it will format the message so that the
listening module (not built yet) will know how to interpret the message.
i've thought about having a "serialize" and "deserialize" function in the
job structure (serialize would replace "publish"). actually .. i think
that's the right way to do it. serialize will take the job->data structure
and structure it for the wire. deserialize will take a wire message and
convert it into a job->data structure.
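that serialize/deserialize pair could look something like this (the wire
format here -- one "name:value" string per metric -- is invented purely for
illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wire message: a single metric. Real job->data will be richer. */
typedef struct {
    char   name[32];
    double value;
} metric_msg_t;

/* serialize: structure the data for the wire */
static int msg_serialize(const metric_msg_t *m, char *buf, size_t len)
{
    return snprintf(buf, len, "%s:%f", m->name, m->value);
}

/* deserialize: turn a wire message back into the structure */
static int msg_deserialize(const char *buf, metric_msg_t *m)
{
    return sscanf(buf, "%31[^:]:%lf", m->name, &m->value) == 2 ? 0 : -1;
}
```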
How is it the listening function's responsibility to classify a plugin? It
seems more reasonable to do it with some sort of combination of plugin
metadata and some fields in the job structure that are set when the plugin
initializes - determining its instance, the number of metrics it's polling
and what it'll be naming them, etc.
Concrete example time:
You're writing a df-type plugin called "storage" (maybe it runs df and
parses the output, maybe it's fancy. We don't care!), and want to support
several types of data per partition. The concept should support something
as complicated as this:
<HOST NAME="zim">
  <CLASS NAME="storage">
    <GROUP NAME="hda0">
      <METRIC NAME="storage-avail" VALUE="..." UNITS="..."/>
      <METRIC NAME="storage-total" VALUE="..." ... />
      <METRIC NAME="percent-used" VALUE="..." ... />
    </GROUP>
    <GROUP NAME="hda1">
      ...
    </GROUP>
    [other groups]
  </CLASS>
  [other classes]
</HOST>
Between allowing multiple (differently-configured) plugin instances and a
two-dimensional metric space per publish(), people should be able to get
the results they need in most custom cases.
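the metadata idea could be as simple as a struct the plugin fills in when it
initializes, so the core (rather than the listener) can classify it -- every
name below is hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical init-time plugin metadata: class, instance, and the
 * metric names the plugin will be publishing. Illustrative only. */
typedef struct {
    const char  *class_name;    /* e.g. "storage" */
    const char  *instance;      /* distinguishes differently-configured copies */
    int          nmetrics;
    const char **metric_names;
} plugin_info_t;

static const char *storage_metrics[] = {
    "storage-avail", "storage-total", "percent-used"
};

static const plugin_info_t storage_info = {
    "storage", "default", 3, storage_metrics
};
```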
i think so too. we're close.
Well, I'm just talking about a uname plugin. :)