The things you find when cleaning out your inbox... it's been over a month, so 
this response may be moot, but I'm replying anyway for posterity's sake. For 
those following along at home, the questions below are referring to 
https://github.com/nemothekid/heka-greedy-filter-deadlock/.

On 08/04/2014 09:07 PM, Tom Sawyer wrote:
Hi, Rob and Nimi.
I have been following this thread for the past few days, and I have several
questions about the GreedyFilter deadlock.

1. Which channel fills up and freezes when the deadlock occurs?
I thought it was router.InChan and injectRecycleChan, because Inject
called by an input plugin delivers the pack into forunner.InChan while,
at the same time, the filter injects packs into forunner.InChan too.
  forunner.InChan may be fully occupied because GreedyFilter has no
chance to take packs out of forunner.InChan while it is flushing. But
Inject() of a filter runs a goroutine to do the actual inject
operation, so GreedyFilter.flush() will return and eventually take packs
from forunner.InChan. So I don't see where the deadlock happens.

Pretty sure it's the router that's getting backed up here. We have a situation 
where messages that are intended for GreedyFilter are coming in very quickly. 
This is fine, as long as GreedyFilter is processing them quickly. But 
GreedyFilter's Flush method initializes and injects 10,000 messages, which 
takes a very long time relative to most operations. While this is happening, 
the incoming messages slowly pile up until the router is backed up and can't 
accept any more messages. But GreedyFilter itself is *also* trying to inject 
new messages back into the router. Since the router is blocked, GreedyFilter 
gets stuck in the Flush method and never returns to the Run method to start 
clearing away the backlog, so we have a deadlock.

2. When the deadlock happens, how can the pool still be exhausted even
  though "everybody calls Recycle or Inject"?

Because the issue isn't that the message pool is exhausted, but that the router 
is blocked because GreedyFilter isn't processing messages, and GreedyFilter 
isn't processing messages because the router is blocked.
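
To make the circular wait concrete, here is a minimal Go sketch of it. This is 
not Heka code: simulate, stallTimeout, and the fromInput/fromFilter tags are 
hypothetical names of my own, the router is reduced to one bounded channel 
drained by one goroutine, and a timeout stands in for what would really be an 
indefinite block.

```go
package main

import (
	"fmt"
	"time"
)

// stallTimeout is how long the filter waits on a full router before the
// simulation reports a stall. A real deadlock would simply wait forever.
const stallTimeout = 200 * time.Millisecond

const (
	fromInput  = 0 // message that matches the filter
	fromFilter = 1 // message injected by the filter, bound for elsewhere
)

// simulate models (hypothetically) an input, a router, and a filter that
// stops reading its input while it flushes. burst must be at least 1.
// It reports whether the flush stalled on a blocked router.
func simulate(burst, routerCap, flushSize int) bool {
	router := make(chan int, routerCap) // the router's bounded input queue
	filterIn := make(chan int)          // the filter's input channel
	done := make(chan struct{})
	defer close(done) // stop the helper goroutines when we return

	// Router: a single goroutine, so one blocked delivery blocks everything.
	go func() {
		for {
			select {
			case m := <-router:
				if m == fromFilter {
					continue // delivered downstream; dropped in this model
				}
				select {
				case filterIn <- m: // blocks while the filter is flushing
				case <-done:
					return
				}
			case <-done:
				return
			}
		}
	}()

	// Input: sends a burst of messages that all match the filter.
	go func() {
		for i := 0; i < burst; i++ {
			select {
			case router <- fromInput:
			case <-done:
				return
			}
		}
	}()

	// Filter: reads one message, then flushes without reading any more.
	<-filterIn
	for i := 0; i < flushSize; i++ {
		select {
		case router <- fromFilter:
		case <-time.After(stallTimeout):
			return true // router is full and nobody is draining filterIn
		}
	}
	return false
}

func main() {
	fmt.Println("one input message, small flush, stalled:", simulate(1, 4, 100))
	fmt.Println("burst of input, large flush, stalled:", simulate(1000, 4, 10000))
}
```

With a single input message the router has nothing queued for the filter, so 
the flush goes through. With a steady stream of input, the router blocks trying 
to hand the next message to the filter, its queue fills, and the flush can 
never finish. Note that the message pool doesn't appear in the model at all.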

Any filter or output that is going to be receiving a high volume of messages 
should be doing what it can to process those messages as quickly as possible, 
with a minimum of pauses. With outputs, there's a bit more tolerance for 
pauses; the router may back up momentarily, but the backpressure will alleviate 
when the output starts processing the incoming messages again. Filters, 
however, are usually reinjecting messages back to the router, so if they cause 
a router backup then it might be unrecoverable.

The solution in this case was to move the processing out of a filter and into a 
decoder, which is where the intended functionality should have been in the 
first place. Any time you have a single input message generating multiple 
output messages it's probably something that a decoder should be doing instead 
of a filter. Slow operations in a decoder only slow down the input that is 
feeding the decoder; the rest of Heka will keep churning along undisturbed. If 
the work really needs to be done in a filter, the solution would be to have the 
filter spin up two goroutines so it can continue to process incoming messages 
while concurrently performing the flush operation (or whatever it is that's 
taking a long time).
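
As a rough illustration of the two-goroutine approach, here is a sketch in 
plain Go. Again, none of this is Heka's plugin API: runFilter, demo, and the 
channel names are hypothetical, and the "flush" just injects one integer per 
consumed message. One goroutine keeps draining the input and only ever does a 
non-blocking handoff; the other does the slow injection and is the only one 
that waits on backpressure.

```go
package main

import (
	"fmt"
	"sync"
)

// runFilter (hypothetical) consumes `in` until it is closed, while a second
// goroutine performs the slow flush into `inject`. It returns how many
// messages were consumed and how many were flushed.
func runFilter(in <-chan int, inject chan<- int, flushEvery int) (consumed, flushed int) {
	pending := make(chan int, 1) // at most one batch waiting for the flusher
	var wg sync.WaitGroup

	wg.Add(1)
	go func() { // flusher: may block on inject, but only this goroutine waits
		defer wg.Done()
		for n := range pending {
			for i := 0; i < n; i++ {
				inject <- i
				flushed++
			}
		}
	}()

	acc := 0
	for range in { // intake: never blocks on inject
		consumed++
		acc++
		if acc >= flushEvery {
			select {
			case pending <- acc: // hand the batch to the flusher
				acc = 0
			default: // flusher is busy; keep accumulating
			}
		}
	}
	if acc > 0 {
		pending <- acc // input is closed, so blocking here is harmless
	}
	close(pending)
	wg.Wait() // also makes the flusher's writes to flushed visible
	return consumed, flushed
}

// demo wires up a downstream that drains injected messages only after the
// input has been fully accepted, the situation that stalls a serial filter.
func demo() (int, int) {
	in := make(chan int)
	inject := make(chan int, 2)
	inputDone := make(chan struct{})

	go func() { // input
		for i := 0; i < 1000; i++ {
			in <- i
		}
		close(in)
		close(inputDone)
	}()
	go func() { // downstream
		<-inputDone
		for range inject {
		}
	}()

	consumed, flushed := runFilter(in, inject, 10)
	close(inject)
	return consumed, flushed
}

func main() {
	consumed, flushed := demo()
	fmt.Println("consumed:", consumed, "flushed:", flushed)
}
```

A filter that injected from inside its intake loop would hang in demo, since 
the downstream won't drain until the input has been accepted and the input 
can't be accepted while the filter is stuck injecting. The tradeoff in this 
sketch is that the backlog now accumulates inside the filter (here just a 
counter) instead of in the router, so a real filter would need to bound it.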

Hope this helps clarify!

-r

_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
