Inline :)

Brad

>>> On 1/14/2008 at 12:16 PM, in message
<[EMAIL PROTECTED]>, Matthias
Blankenhaus <[EMAIL PROTECTED]> wrote:
> Brad,
> 
> first of all, thank you very much for your elaborate reply!
> 
> Please have a look at my comments inline below.
> 
> Matthias
> 
> On Mon, 14 Jan 2008, Brad Nicholes wrote:
> 
>> >>> On 1/12/2008 at 11:00 AM, in message
>> <[EMAIL PROTECTED]>, Matthias
>> Blankenhaus <[EMAIL PROTECTED]> wrote:
>> > Hi Brad !
>> > 
>> > I started looking into the impl of an IPMI Python DSO.  Since one of the 
>> > big advantages of IPMI is out-of-band monitoring, I would need a function 
>> > that returns the list of nodes reporting to a gmond instance.  I am 
>> > willing to extend the DSO stuff, if you could please give me a pointer.
>> > 
>> 

[...]

>> 
>> Anyway, you should be able to get the information you need by walking the 
>> 'hosts' apr_hash_t global variable. 
> 
> What exactly is this APR stuff?  I do understand your statements about the 
> hash table.
> 

Apache Portable Runtime library.  The purpose of APR is to make binaries like 
Gmond easier to move from one platform to another by implementing the most 
common platform differences in a common way.  Gmond is built on APR and I wish 
Gmetad was as well (but that is a topic for another day).  There are several 
APIs like apr_hash_XXX() that provide a common way to do hash tables.  You just 
need to walk the hash tables using the APR APIs.  It is very straightforward 
and a lot simpler to do with APR than other hashing APIs that I have seen.

>> This hash table stores a collection of 'Ganglia_host' structures; the 
>> structure is defined at the top of gmond.c.  There are a couple of examples 
>> of how to walk the hash table in the process_tcp_accept_channel() and 
>> cleanup_data() functions. 
>  
>> The easiest thing to do would probably be to create a function that 
>> walks the 'hosts' hash table and creates an apr_array_t.  If a pointer to 
>> the apr_array_t that contains the hosts is stuffed into the mmodule_struct 
>> when a module is loaded, then whenever the apr_array_t is updated, all 
>> modules will have access to the information.
> 
> From below I understand that, for the moment, this excludes Python 
> modules, right?  Does this mean that C modules can access it?  If so, 
> should I then rather implement this in C?  Fine with me :-)
> 

For now C would probably be easier.  If the hosts were exposed to a C module, 
it could be either through an API or through the mmodule_struct like I 
described.  However, I am guessing from comments that you made before that you 
might need to take advantage of the spoofing functionality here because it 
sounds like one gmond instance will be gathering IPMI information for all of 
the nodes.  If that is the case, then spoofing will probably have to be 
extended through modules as well.  Right now, gmetric is the only thing that 
knows how to create a spoof packet.  It wouldn't be hard to extend spoofing to 
a module, it just hasn't been done.  

In addition, the host hash table and associated struct should probably be moved 
from gmond to libganglia as well.  That way the API to get the hosts could be 
made external without having to start exposing external APIs directly from 
gmond.

>>  I don't necessarily like changing the 
>> mmodule_struct structure whenever things like this come up because that 
>> structure will have to be locked down when we ship 3.1.0.  But since we 
>> haven't shipped 3.1.0 yet, the structure can still be considered 
>> flexible.
> 
> I share your concern of changing that structure.  However, as you've 
> mentioned if there is a good time to do this it's now.
> 
>> 
>> The only problem from this point is how to expose the same host data 
>> to a python module.  Except for a return value, the communication from 
>> gmond through mod_python to a python module is one way.  
> 
> Yeah, that's what I figured.
> 
>> Python modules don't have any way of calling back into gmond for additional 
>> information.  The mmodule_struct is not exposed directly to a python module 
>> either, so the same trick of passing a common pointer through a structure 
>> won't work.  The only way to do it would be to add another parameter to the 
>> handler call when mod_python calls into the python module.  But this 
>> seems kind of messy because 99% of the time for other python modules, 
>> that parameter will be completely unnecessary. 
> 
> You've mentioned above the one-way communication going from gmond to a 
> Python module.  How about we introduce another optional callback that 
> takes as a parameter the host list, thus picking up your idea.  Then 
> whenever gmond updates the host list in mmodule_struct it calls this 
> optional method on all python modules?  Of course, if we some
> day end up with > 100 modules that might become a performance problem.
> However, the callbacks could be invoked asynchronously from a dedicated
> gmond thread.
> 

I would like this idea better if the optional callback could be made more 
generic.  In other words, there would be a set of data that a python module 
could get from gmond, and the module could simply register for what it wanted.  
Then the extra data could be added as optional parameters to the metric_handler 
callback rather than an additional callback.  I'll have to think about this 
some more.
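
A rough sketch of how that registration idea might look from the python 
side.  This is not existing gmond or mod_python code: the "wants" key, the 
AVAILABLE_DATA table and the call_handler() dispatcher are all hypothetical 
names made up for illustration, and the host list is faked.

```python
# Hypothetical sketch: a module registers for optional data ("hosts") and
# receives it as an optional keyword argument to its metric_handler.

def metric_handler(name, hosts=None):
    """Metric callback; 'hosts' is only passed if the module asked for it."""
    return len(hosts or [])


def metric_init(params):
    """Describe one metric and register interest in the host list.

    The "wants" key is an assumption, not part of any real descriptor format.
    """
    return [{
        "name": "ipmi_node_count",
        "call_back": metric_handler,
        "wants": ["hosts"],
    }]


# Stand-in for the data gmond could offer; the host names are made up.
AVAILABLE_DATA = {
    "hosts": lambda: ["node1", "node2"],
}


def call_handler(descriptor, available=AVAILABLE_DATA):
    """Hypothetical dispatcher: pass only the extras the module registered for."""
    extras = {
        key: available[key]()
        for key in descriptor.get("wants", [])
        if key in available
    }
    return descriptor["call_back"](descriptor["name"], **extras)
```

Modules that register for nothing keep the plain one-argument handler, so 
the extra data costs nothing for the common case.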

>> Bottom line is, I'm not sure 
>> what a good solution is for this whole thing.
>> 
>> 
>> > Also, I think it would be useful if the DSO module could unload itself.
>> > As I understand all modules that are under 
>> > /usr/lib/ganglia/python_modules/ are automatically loaded.  However, if 
>> > e.g. the IPMI module determines that it can't function because of missing
>> > or unfitting (wrong version) SW pieces, then it should unload itself.
>> > 
>> > What say you ?
>> > 
>> 
>> I think that providing a python module with the ability to unload itself is 
>> a great idea.  Basically the metric_init function would need to return some 
>> kind of indication that it wants to be unloaded.  Since the metric_init function 
>> returns a list of dictionaries that contain the metric definitions that 
>> the module provides, probably the best way to indicate that it wants to 
>> be unloaded would be to return an empty list.  If mod_python detects an 
>> empty list, then it would just unload the module and continue on.  I 
>> could see this being useful by allowing the python module to detect if 
>> any of the metrics that it provides have been referenced in the 
>> configuration file.  If not then just unload itself. 
> 
> I like this implementation.  Do we need to allocate a NULL parameter set 
> for modules that do not expose metrics and only implement a side effect ?
> Is this even conceivable?
> 

Maybe, I haven't really thought about that.
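
For reference, a minimal sketch of the empty-list convention discussed 
above.  It assumes the module depends on an external "ipmitool" binary; the 
descriptor keys, the probe parameter and the load_module() function are 
illustrative only, with load_module() standing in for whatever check 
mod_python would actually perform.

```python
import shutil

# Assumption: the module needs this external binary to do anything useful.
REQUIRED_TOOL = "ipmitool"


def ipmi_available(tool=REQUIRED_TOOL):
    """Return True if the required external tool is found on the PATH."""
    return shutil.which(tool) is not None


def read_temperature(name):
    """Placeholder metric callback; a real module would query IPMI here."""
    return 0.0


def metric_init(params, probe=ipmi_available):
    """Return metric descriptors, or an empty list to ask to be unloaded."""
    if not probe():
        return []
    return [{
        "name": "ipmi_temp",
        "call_back": read_temperature,
        "value_type": "float",
        "units": "C",
    }]


def load_module(init, params=None):
    """Hypothetical loader-side check: an empty list means 'unload me'."""
    descriptors = init(params or {})
    if not descriptors:
        return None  # the caller would unload the module and continue on
    return descriptors
```

Note that under this convention a module that defines no metrics and only 
implements a side effect would also be unloaded, which is exactly the open 
question above.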

_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers