Trying to iterate over worker threads that may be starting and stopping seemed 
just too darn hard (for me anyway), and reading the "scoreboard" code put me 
off.  I ended up using atomic operations instead.  In the mod_sflow module this 
boils down to just 3 atomic operations in the critical path: two calls to 
apr_atomic_inc32() to bump global counters, and one apr_atomic_dec32() to decide 
whether the transaction should be sampled (see mod-sflow.googlecode.com).  This 
way the module doesn't know or care how many threads there are.
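
For illustration, here is a minimal sketch of that pattern.  It uses C11 
<stdatomic.h> rather than the APR calls so that it compiles on its own.  The 
counter names, SAMPLING_SKIP and the "reset the countdown when it hits zero" 
behaviour are assumptions made for this sketch and are not taken from the 
mod_sflow source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 1-in-N sampling rate; not taken from mod_sflow. */
#define SAMPLING_SKIP 400u
static_assert(SAMPLING_SKIP > 0, "sampling skip must be positive");

/* Two global counters plus a sampling countdown, shared by all threads. */
static _Atomic uint32_t requests_total;
static _Atomic uint32_t sample_pool;
static _Atomic uint32_t countdown = SAMPLING_SKIP;

/* Called once per transaction from any worker thread.
 * Returns true if this transaction should be sampled. */
static bool on_transaction(void)
{
    atomic_fetch_add(&requests_total, 1);
    atomic_fetch_add(&sample_pool, 1);

    /* atomic_fetch_sub returns the previous value, so a previous value
     * of 1 means this thread is the one that took the countdown to zero. */
    if (atomic_fetch_sub(&countdown, 1) == 1) {
        /* Other threads may decrement before this store lands, so the
         * gap between samples is approximate rather than exact. */
        atomic_store(&countdown, SAMPLING_SKIP);
        return true;
    }
    return false;
}
```

No thread ever needs to know how many other threads exist: each one just 
calls on_transaction() and acts on the return value.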

Although "apr_atomic_inc32(&counter)" is much more expensive than "counter++", 
my understanding is that it is still much cheaper than having threads stall 
while they wait for a contended mutex.  Someone please tell me if I got that 
wrong(!)

For a truly global counter you still have to aggregate across multiple unix 
processes.  A shared memory segment might be the best way to do that, but I 
wasn't sure if that was really going to work with atomic operations.  In any 
event I found it convenient to use a pipe.  If you open a pipe before the 
child-forking starts then every child process automatically inherits the same 
pipe, and they can all send messages up to the "master", who is still reading 
from the other end of it.  As long as each message is no longer than PIPE_BUF 
bytes (POSIX guarantees at least 512; on Linux it is 4096) the writes are 
atomic and don't get interleaved.  This way the "master" doesn't know or care 
how many child processes there are.
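
A rough sketch of the pipe idea, assuming plain POSIX fork()/pipe() rather 
than the Apache hooks.  The struct counter_msg layout and the 
aggregate_via_pipe() helper are invented for this example.  In a real module 
the pipe would be opened before the MPM forks its children, and the children 
would be long-lived server processes rather than short loops.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>     /* PIPE_BUF */
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed-size message; small enough to be written atomically. */
struct counter_msg {
    uint32_t child_id;
    uint32_t delta;
};
static_assert(sizeof(struct counter_msg) <= PIPE_BUF,
              "message must fit in one atomic pipe write");

/* Fork nchildren processes that each send msgs_per_child increments up
 * the shared pipe; return the total seen by the parent, or -1 on error. */
static int64_t aggregate_via_pipe(int nchildren, int msgs_per_child)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    for (int c = 0; c < nchildren; c++) {
        pid_t pid = fork();
        if (pid < 0) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
        if (pid == 0) {
            /* Child: inherited the write end, so just send messages. */
            close(fds[0]);
            struct counter_msg m = { (uint32_t)c, 1 };
            for (int i = 0; i < msgs_per_child; i++) {
                if (write(fds[1], &m, sizeof m) != (ssize_t)sizeof m)
                    _exit(1);
            }
            _exit(0);
        }
    }

    /* Parent: drop its own write end so read() returns 0 once every
     * child has exited and closed its copy. */
    close(fds[1]);

    int64_t total = 0;
    struct counter_msg m;
    while (read(fds[0], &m, sizeof m) == (ssize_t)sizeof m)
        total += m.delta;

    close(fds[0]);
    while (wait(NULL) > 0) { /* reap children */ }
    return total;
}
```

For example, aggregate_via_pipe(4, 100) forks four children that each send 
100 increments, and the reader never needs to be told there are four of them.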

Neil


On Jun 30, 2011, at 2:31 AM, Massimo Manghi wrote:

> I found this problem conceptually interesting and worthy of generalization 
> in many contexts. I'd like to know if Neil got around the problem with 
> something that could be a satisfying solution.
> 
> thanks
> 
>  -- Massimo
> 
> On 06/28/2011 04:57 AM, Neil McKee wrote:
>> Hi,
>> 
>> Here's an easy question for someone who knows their way around...
>> 
>> I want to maintain a new global counter,  but for performance reasons I am 
>> reluctant to use a mutex or atomic_increment to update it.  I would rather 
>> maintain a separate counter for every worker-thread,  and only accumulate 
>> the global counter when required.  (If the per-worker-thread counter is 
>> 32-bit then I shouldn't even need a mutex when accumulating the total across 
>> all the current threads).
>> 
>> Obviously I shouldn't just declare something as "__thread apr_int32_t 
>> mycounter;" and mince it together as a linux-only hack.  I'd like to find 
>> the portable apr-library way to do it.   So I think I need to find the 
>> following:
>> 
>> * - a hook that is called whenever a worker thread is started
>> * - a hook that is called whenever a worker thread is about to die
>> * - a hook to find_or_create a 32-bit integer that is private to the current 
>> worker-thread
>> * - a fn to iterate (safely) over all the current worker threads
>> 
>> It's the last one that seems particularly elusive.  I couldn't find an ap_ 
>> or apr_ library call that seemed to do anything like that.
>> 
>> If this has all been done before,   please can you point me to the relevant 
>> module sources?  I think it would save me a lot of time.   Alternatively,  
>> if you think I should just relax and use an atomic increment instead,  then 
>> let me know.
>> 
>> Thanks!
>> 
>> Neil
> 
