On Tue, Aug 16, 2011 at 10:59 AM, Felix Giguere Villegas <felix.gigu...@mate1inc.com> wrote:

> Maybe I'm missing something, but why don't you put your filtering Decorator
> on the agent/source instead?
>
> What's the point of sending those events all the way to the CollectorSink
> if they're going to be filtered out in the end? The only reason I can see is
> if the only place where you can easily determine which events to filter out
> IS at the end of the flow, but I can't think of a reason why that would be
> the case...
>

Good question - this might be worth investigating. The other reason I can
think of is if you're forwarding data from your collector to multiple sinks
-- e.g. both HDFS and HBase, but perhaps filtering out some of the data for
one or the other (we're not doing this, and I guess it doesn't work in E2E
mode according to FLUME-165 anyway).

I simplified our situation in my example -- we're really doing a
light-weight ETL with an in-memory aggregation. So for each event that comes
in, 0 or more events might come out of that event -- it's not just
filtering, and it's not 1 event in, 1 event out.

> I don't know the answer to your specific question (and I'd be curious to
> find out as well), so I'm sorry if my comment doesn't help :) ...
>
> --
> Felix
>
>
> On Tue, Aug 16, 2011 at 10:52 AM, Joe Crobak <joec...@gmail.com> wrote:
>
>> According to the Flume FAQ [1], Flume acks events from the CollectorSink
>> in E2E mode. If I have a Decorator running on the Collector that filters
>> out events (or transforms them or something), does that mean those events
>> won't get ACK'd and thus delivery will be retried for them
>> indefinitely? IOW, is E2E mode unsupported in this situation -- or maybe is
>> there a way for me to ACK events that I want to filter from the Decorator
>> itself?
>>
>> Thanks,
>> Joe
>>
>>
>> [1] https://github.com/cloudera/flume/wiki/FAQ
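
To make the "0 or more events out per event in" case concrete, here is a minimal sketch of an aggregating decorator. It does not use Flume's API: the class `AggregatingDecorator`, its `append` method, the `flush_every` parameter, and the dict-shaped events are all hypothetical names chosen for illustration. It only shows why inputs and outputs stop lining up one-to-one; how E2E acks behave in that case is the open question in this thread.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class AggregatingDecorator:
    """Hypothetical stand-in for a collector-side decorator (not Flume's API)."""

    def __init__(self, downstream: Callable[[dict], None], flush_every: int) -> None:
        self.downstream = downstream    # next sink in the chain
        self.flush_every = flush_every  # number of inputs to absorb before emitting
        self.counts: Dict[str, int] = defaultdict(int)
        self.seen = 0

    def append(self, event: dict) -> int:
        """Absorb one event and return how many events were emitted downstream."""
        self.counts[event["key"]] += 1
        self.seen += 1
        if self.seen < self.flush_every:
            return 0                    # input consumed, nothing emitted yet
        emitted = 0
        for key, count in sorted(self.counts.items()):
            self.downstream({"key": key, "count": count})
            emitted += 1
        self.counts.clear()
        self.seen = 0
        return emitted


out: List[dict] = []
dec = AggregatingDecorator(out.append, flush_every=3)
emitted = [dec.append({"key": k}) for k in ["a", "b", "a"]]
```

In this run the first two inputs produce no output and the third produces two aggregate events, so no output event corresponds to exactly one input event.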