>>> On 5/24/2009 at 12:43 AM, in message
<dcccdf790905232343y76481e5dw6c1df62bc732c...@mail.gmail.com>, David Birdsong
<david.birds...@gmail.com> wrote:
> I have a python module that spawns a separate thread that collects
> data off of a pipe.
> 
> Everything runs fine, but I'm finding that metric_cleanup is never
> called.  When I strace the PID of the worker thread (on Linux, so it
> gets its own PID), I see it gets a SIGTERM when I stop gmond instead
> of exiting under its own power.  All of the gmond processes exit, but
> a subprocess of my worker thread just ends up being reparented to
> init instead of being cleaned up by my metric_cleanup logic.
> 
> The worker thread reads from an endless pipe using a select.poll with
> a timeout, so the pipe shouldn't block.  I need to know to kill the
> process on the other end of the pipe which is what metric_cleanup
> should be providing.
> 
> I even removed all cleanup code from metric_cleanup() and just put in
> an open('/tmp/ganglia_kill', 'w'), but no file is created.  What can I
> investigate to understand why it's being ignored?
> 

Gmond depends on the APR memory pools for invoking the cleanups.  Basically the 
way it works is that when gmond starts up, it creates an APR memory pool.  This 
memory pool is used to allocate and manage memory for everything in gmond that 
deals with APR.  One of the features of APR memory pools is that you can tie 
functions to a pool, and they get invoked when the memory pool is cleaned up.  
In this case, the function setup_metric_callbacks() in gmond.c ties all of the 
module cleanup functions to the main global memory pool.  When gmond exits, the 
last thing that happens is that the memory pools that were created are 
destroyed.  This should trigger all of the cleanup routines.

To debug this, you will need a debug version of APR.  Set a breakpoint in 
apr_terminate() and another in apr_pool_destroy().  These functions should be 
getting called automatically when the gmond process shuts down.

A quick workaround would be to explicitly call 
apr_pool_destroy(global_context) as the last statement in main.c.  This will 
force the destruction of the global memory pool, which should also cause the 
cleanup routines to be called.
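
To make the mechanism concrete, here is a toy model in Python of how 
pool-bound cleanups behave.  This is only an illustration and not the APR API: 
the Pool class and its cleanup_register() and destroy() methods are 
hypothetical stand-ins, and the reverse-order execution is an assumption of 
this sketch.

```python
class Pool:
    """Toy stand-in for a memory pool with attached cleanups (not real APR)."""

    def __init__(self):
        self._cleanups = []
        self.destroyed = False

    def cleanup_register(self, func, data=None):
        # Tie a cleanup function to the lifetime of this pool.
        self._cleanups.append((func, data))

    def destroy(self):
        # Cleanups fire only here; this sketch runs the newest first.
        while self._cleanups:
            func, data = self._cleanups.pop()
            func(data)
        self.destroyed = True


calls = []
global_pool = Pool()
global_pool.cleanup_register(calls.append, "module_a cleanup")
global_pool.cleanup_register(calls.append, "module_b cleanup")

# Nothing has run yet: registering a cleanup does not invoke it.
ran_before_destroy = list(calls)

global_pool.destroy()
```

The point of the model is that if destroy() is never reached, none of the 
registered cleanups run, which matches the symptom of metric_cleanup never 
being called.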

Brad 


_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
