This seems like a common pattern in multi-threaded applications:

1) A single thread reads off the network, allocating buffers as necessary.
2) It sends these buffers, along with other info, to worker threads that 
do the real work. A worker thread processes the event entry and then 
frees the buffers from step (1). Typically a ring buffer or some other kind 
of queue is used, with an entry that resembles something like this:

struct event_entry {
    char *buffer;
    int length;
    // Other stuff.
};

The problem with this pattern is that memory allocated on one thread is 
freed on another. On top of the constant allocation and freeing, the fact 
that the allocation and the free happen on different threads causes 
contention inside the allocator.

I need to use a pointer to a buffer instead of an inlined array since the 
size of the data for each entry is variable and unknown up front. This is 
what I am currently planning to do to handle this:

1) The producer maintains a pool of buffers of different sizes. I could 
also just use a memory allocator like jemalloc, which does pooling behind 
the scenes.
2) The producer picks an appropriately sized buffer from the pool for an 
incoming request. It's easy to pick a size, especially if the protocol is 
length prefixed.
3) Once there are enough bytes to form a complete entry (based on our 
protocol) the producer puts a pointer to this buffer on a ring buffer entry 
and publishes it.
4) The consumer picks an entry off the ring buffer and synchronously 
processes it. If it needs the buffer beyond the point of initial 
processing it copies it (this is rare for me). It then marks the ring 
buffer entry as processed.
5) This ties back to step (2). Whenever the producer doesn't have a 
right-sized buffer in its pool, it checks all the ring buffer entries the 
consumer has already marked processed in this cycle of the ring buffer. It 
does so by checking the consumer/worker's sequence number. It claims all of 
these buffers and puts them back in its pool. This logic needs to run at 
least once per cycle of the ring buffer (it could be triggered early by a 
shortage of buffers); otherwise we would end up reusing buffers that are 
still being processed. If after a reclamation pass the producer still 
cannot find a right-sized buffer, it just allocates one and adds it to 
the pool (this should be rare in steady state).

Any comments or obvious faults with my logic? What do you guys use to 
exchange variable-sized buffers between two threads? I am trying to avoid 
cross-thread allocate and free. In fact, in steady state I am trying to 
avoid allocation and freeing altogether.

Thanks!

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"Scalable Synchronization Algorithms" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/lock-free/faef11c0-f50a-4195-a469-d079ec23175a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
