Re: [Ganglia-developers] Ganglia Python DSO extensions

Brad Nicholes Mon, 14 Jan 2008 08:09:06 -0800

>>> On 1/12/2008 at 11:00 AM, in message
<[EMAIL PROTECTED]>, Matthias
Blankenhaus <[EMAIL PROTECTED]> wrote:
> Hi Brad !
> 
> I started looking into the impl of an IPMI Python DSO.  Since one of the 
> big advantages of IPMI is out-of-band monitoring, I would need a function 
> that returns the list of nodes reporting to a gmond instance.  I am 
> willing to extend the DSO stuff, if you could please give me a pointer.
>


If I understand correctly, you need a function that returns a list of all of 
the other gmond nodes that have reported metrics.  There are two problems with 
that.  First, such a function will only give a complete list if all of the 
gmond nodes report their metrics over multicast.  If the nodes report 
hierarchically, with every node sending to a single controlling node over 
unicast, then only that controlling node knows about all of the other nodes.  
Second, the hosts data is declared and stored within the gmond binary itself, 
so the only way to get at it is through a gmond external API.  Currently gmond 
doesn't have any external APIs that can be called directly from a module or 
anything else.  All external calls are in libganglia.  Gmond only makes calls 
out to modules; nothing calls directly back into gmond.

One way to get around this would be to add another pointer to the 
mmodule_struct in metric.h that holds a read-only list of hosts.  Gmond would 
have to maintain the list whenever a new host is detected.  Since a module's 
main interface with gmond is the mmodule_struct, the module would also hold the 
pointer to the hosts list and could just read it whenever it needs to.  Module 
access to the configuration file data was done in a similar way: gmond stuffed 
the pointer to the configuration file handle into the mmodule_struct so that 
modules could read the configuration file directly.

Anyway, you should be able to get the information you need by walking the 
'hosts' apr_hash_t global variable.  This hash table stores a collection of 
'Ganglia_host' structures, which are defined at the top of gmond.c.  There are 
a couple of examples of how to walk the hash table in the 
process_tcp_accept_channel() and cleanup_data() functions.  The easiest thing 
to do would probably be to create a function that walks the 'hosts' hash table 
and builds an APR array (apr_array_header_t) from it.  If a pointer to that 
array is stuffed into the mmodule_struct when a module is loaded, then whenever 
the array is updated, all modules will have access to the information.  I don't 
necessarily like changing the mmodule_struct structure whenever things like 
this come up, because that structure will have to be locked down when we ship 
3.1.0.  But since we haven't shipped 3.1.0 yet, the structure can still be 
considered flexible.
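
To make the shared-pointer idea concrete, here is a rough conceptual model in 
Python.  It is not gmond code: gmond is C and would use the APR containers, and 
every name below (ModuleStruct, host_array, host_detected, and so on) is made 
up for illustration.  It only shows that a module handed a reference at load 
time sees later updates, provided the list is refreshed in place.

```python
# Conceptual model only: gmond itself is C and uses APR containers.
# Every name below is hypothetical.

hosts = {}        # stands in for gmond's global 'hosts' hash table
host_array = []   # stands in for the array built by walking the hash


class ModuleStruct:
    """Stands in for mmodule_struct: holds a reference to the host array."""

    def __init__(self, name, host_array_ref):
        self.name = name
        self.hosts = host_array_ref  # same object, not a copy


def rebuild_host_array():
    """Walk the hash and refresh the array in place so references stay valid."""
    host_array[:] = sorted(hosts)


def host_detected(hostname, ip):
    """What gmond would do when a new host reports in."""
    hosts[hostname] = {"ip": ip}
    rebuild_host_array()


ipmi_module = ModuleStruct("ipmi", host_array)  # "module load"
host_detected("node02", "10.0.0.2")
host_detected("node01", "10.0.0.1")
print(ipmi_module.hosts)  # the module sees both hosts without being told
```

The in-place refresh is the important part.  If the list were replaced by a 
brand new object, modules would keep looking at the stale one, which is the 
same concern a C implementation would have with a reallocated array.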

The remaining problem is how to expose the same host data to a Python module.  
Except for a return value, the communication from gmond through mod_python to a 
Python module is one way.  Python modules don't have any way of calling back 
into gmond for additional information.  The mmodule_struct is not exposed 
directly to a Python module either, so the same trick of passing a common 
pointer through a structure won't work.  The only way to do it would be to add 
another parameter to the handler call when mod_python calls into the Python 
module.  But this seems kind of messy, because 99% of the time, for other 
Python modules, that parameter will be completely unnecessary.  The bottom line 
is that I'm not sure what a good solution is for this whole thing.
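
For what it's worth, here is a small Python sketch of what an extra handler 
parameter might look like from the Python side.  It is purely hypothetical: 
call_handler() stands in for whatever mod_python would do, and the idea of 
passing the hosts only to handlers that declare a second parameter is my 
assumption, not something that exists today.

```python
import inspect


def call_handler(handler, metric_name, hosts):
    """Hypothetical dispatcher: pass 'hosts' only to handlers that accept it."""
    params = inspect.signature(handler).parameters
    if len(params) >= 2:
        # Hand over an immutable copy so the handler cannot modify the list.
        return handler(metric_name, tuple(hosts))
    return handler(metric_name)


def load_handler(name):
    """An ordinary handler that does not care about other hosts."""
    return 0.42


def ipmi_node_count_handler(name, hosts):
    """A handler that needs the host list, e.g. for out-of-band polling."""
    return len(hosts)


known_hosts = ["node01", "node02"]
print(call_handler(load_handler, "load_one", known_hosts))
print(call_handler(ipmi_node_count_handler, "ipmi_nodes", known_hosts))
```

Existing one-argument handlers would keep working unchanged under this 
approach, at the cost of the dispatcher having to inspect each handler.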


> Also, I think it would be useful if the DSO module could unload itself.
> As I understand all modules that are under 
> /usr/lib/ganglia/python_modules/ are automatically loaded.  However, if 
> e.g. the IPMI module determines that it can't function because of missing
> or unfitting (wrong version) SW pieces, then it should unload itself.
> 
> What say you ?
> 

I think that giving a Python module the ability to unload itself is a great 
idea.  Basically, the metric_init function would need to return some kind of 
indication that it wants to be unloaded.  Since metric_init returns a list of 
dictionaries containing the definitions of the metrics that the module 
provides, probably the best way to signal this would be to return an empty 
list.  If mod_python detects an empty list, it would just unload the module and 
continue on.  I could also see this being useful for letting a Python module 
detect whether any of the metrics it provides have been referenced in the 
configuration file and, if not, unload itself.
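
A minimal sketch of how a module could use this, assuming the empty-list 
convention gets implemented as described.  The "ipmi_tool" parameter, the 
metric name, and the check for an external tool are all made up for the 
example, and the descriptor keys are my assumption of what a metric definition 
dictionary looks like.

```python
import shutil


def ipmi_temperature_handler(name):
    """Placeholder callback; a real module would query the BMC here."""
    return 0.0


def build_descriptors():
    """Metric definitions this (hypothetical) module would provide."""
    return [{
        "name": "ipmi_temperature",
        "call_back": ipmi_temperature_handler,
        "time_max": 90,
        "value_type": "float",
        "units": "C",
        "slope": "both",
        "format": "%f",
        "description": "Temperature read over IPMI",
        "groups": "ipmi",
    }]


def metric_init(params):
    """Return the metric definitions, or [] to ask to be unloaded."""
    tool = params.get("ipmi_tool", "ipmitool")  # hypothetical parameter name
    if shutil.which(tool) is None:
        # A prerequisite is missing, so provide no metrics at all.
        return []
    return build_descriptors()


def metric_cleanup():
    """Nothing to release in this sketch."""
    pass
```

The same early return could cover a wrong-version check or the case where none 
of the module's metrics are referenced in the configuration file.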

Let me know if you have any more questions,

Brad


