Re: Source/Channel/Sink Behavior Standardization

Brock Noland Sun, 26 Feb 2012 18:09:49 -0800

There is a good discussion of some of the reasons we should standardize this at:

https://issues.apache.org/jira/browse/FLUME-998
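
To make the CPU concern from the quoted message below concrete, here is a
minimal sketch of how a sink loop could back off when a take returns null
instead of polling again immediately. It is only an illustration under stated
assumptions: SimpleChannel, Backoff, and drain are hypothetical stand-ins
invented for this example, not the Flume API, and the delay values are
arbitrary.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical illustration only: none of these types are part of Flume.
class SinkBackoffSketch {

    /** Stand-in for a channel whose take() returns null when it is empty. */
    interface SimpleChannel {
        String take();
    }

    /** Doubles the wait after each empty take, up to a cap; resets on success. */
    static final class Backoff {
        private final long initialMillis;
        private final long maxMillis;
        private long currentMillis;

        Backoff(long initialMillis, long maxMillis) {
            this.initialMillis = initialMillis;
            this.maxMillis = maxMillis;
            this.currentMillis = initialMillis;
        }

        /** Returns the delay to use now and doubles the next one, capped at maxMillis. */
        long nextDelayMillis() {
            long delay = currentMillis;
            currentMillis = Math.min(currentMillis * 2, maxMillis);
            return delay;
        }

        /** Goes back to the initial delay once an event has been taken. */
        void reset() {
            currentMillis = initialMillis;
        }
    }

    /**
     * Polls the channel maxPolls times and returns how many events were taken.
     * An empty take sleeps for the current backoff delay rather than spinning.
     */
    static int drain(SimpleChannel channel, Backoff backoff, int maxPolls)
            throws InterruptedException {
        int delivered = 0;
        for (int i = 0; i < maxPolls; i++) {
            String event = channel.take();
            if (event == null) {
                Thread.sleep(backoff.nextDelayMillis());
                continue;
            }
            backoff.reset();
            delivered++; // a real sink would write the event downstream here
        }
        return delivered;
    }

    public static void main(String[] args) throws InterruptedException {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("event-1");
        queue.add("event-2");
        SimpleChannel channel = queue::poll; // poll() returns null when empty
        int delivered = drain(channel, new Backoff(10, 80), 5);
        System.out.println("delivered=" + delivered); // prints "delivered=2"
    }
}
```

With a policy like this, a non-blocking take and a blocking take would look
much the same from the sink's side, which is one possible way to get
consistent behavior across channels.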


Brock

On Sun, Feb 26, 2012 at 11:14 PM, Brock Noland <[email protected]> wrote:
> Hello,
>
> This might be something for the developer guide or it might be
> somewhere and I just missed it.  I feel like we should set down some
> expectations with regard to:
>
> 1) Source behavior when:
>  a) Channel put fails
>  b) Source started but is unable to obtain new events for some reason
> 2) Channel behavior when:
>  a) Channel capacity exceeded
>  b) take when channel is empty
> 3) Sink behavior when:
>  a) Channel take returns null
>  b) Sink cannot write to the downstream location
>
> This came up when I noticed some inconsistencies.  For example, a
> take in MemoryChannel blocks for a few seconds by default, while a take
> in JDBCChannel does not (FLUME-998). Combined with HDFSEventSink, the
> non-blocking take causes tremendous amounts of CPU consumption. Also,
> currently, if HDFS is unavailable for a period, Flume needs to be
> restarted (FLUME-985).
>
> My general thoughts are based on experience working with JMS-based
> services.
>
> 1) Source/Channel/Sink should not require a restart when upstream or
> downstream services are restarted or become temporarily unavailable.
> 2) Channel capacity being exceeded should not lead to sources dying
> and thus requiring a Flume restart. This will happen when downstream
> destinations slow down for various reasons.
>
> Brock
>
> --
> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
