Thanks, Jon. This makes perfect sense and helps a lot. It looks like we should change the decorator we've been working on so that it adds attributes to the events rather than replacing them outright.
Joe

On Fri, Aug 19, 2011 at 3:45 AM, Jonathan Hsieh <j...@cloudera.com> wrote:

> The acks are generated from checksums of the body of events. So if you
> augment your events with new attributes (regex, value) the acks will still
> work. However, if you filter out events between the agentSink and the
> collectorSink, the checksums won't sum up.
>
> You can, however, put filtering "after" the collector, or do filtering
> "next to" the collector.
>
> Ok because value adds attributes and does not modify the body.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | value("newattr","newvalue") collectorSink("hdfs://xxxx", ...);
>
> Ok because filter is before checksums are calculated.
> node : <source> | filterOutEvents agentE2ESink("ip of collector");
> collector: collectorSource | collectorSink("hdfs://xxxx", ...);
>
> Ok because filter is after checksums are validated.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | collector(xxx) { filterOutEvents escapedFormatDfs("hdfs://xxxx", ...) } ;
>
> Not ok -- checksums won't work out because events with checksum info never
> get checksum calculation.
> node : <source> | agentE2ESink("ip of collector");
> collector: collectorSource | filterOutEvents collectorSink("hdfs://xxxx", ...);
>
> Does that make sense?
>
> Jon.
>
> On Wed, Aug 17, 2011 at 2:39 AM, Bao Thai Ngo <baothai...@gmail.com> wrote:
>
>> Hi,
>>
>> As far as I understand, the ACK mechanism should work regardless of any
>> decorator deployed at the Collector, as Mingje said. I developed and
>> deployed several plug-ins (decorators) that filter out events at the
>> Collector side and they work well with ACK. Another thing I can suggest
>> is: do not try to develop an ACK events part in your decorator.
>>
>> @Felix: Some advantages of deploying a decorator at the collector side are:
>> - do not depend on the agent side
>> - collect the data we need and save other data for future needs (what we
>>   need is just a small part of a very huge data set)
>>
>> Just my 2 cents.
>>
>> ~Thai
>>
>> On Tue, Aug 16, 2011 at 9:52 PM, Joe Crobak <joec...@gmail.com> wrote:
>>
>>> According to the Flume FAQ [1], Flume ack's events from the CollectorSink
>>> in E2E mode. If I have a Decorator running on the Collector that filters
>>> out events (or transforms them or something), does that mean those events
>>> won't get ACK'd and thus delivery will be retried for them indefinitely?
>>> IOW, is E2E mode unsupported in this situation -- or maybe is there a way
>>> for me to ACK events that I want to filter from the Decorator itself?
>>>
>>> Thanks,
>>> Joe
>>>
>>> [1] https://github.com/cloudera/flume/wiki/FAQ
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // j...@cloudera.com
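
For anyone reading this thread later, Jon's four cases can be sketched with a toy model. This is not Flume code: the Event class, batch_checksum, add_attr and filter_out are hypothetical names, and combining CRC32 values by XOR is only an assumption about how per-event checksums might be summed. The thread itself only says that acks come from checksums of event bodies.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Event:
    body: bytes
    attrs: dict = field(default_factory=dict)


def batch_checksum(events):
    """Combine per-event body checksums; attributes are deliberately ignored."""
    total = 0
    for event in events:
        total ^= zlib.crc32(event.body)  # assumption: CRC32 combined by XOR
    return total


def add_attr(events, key, value):
    """Stand-in for value("newattr","newvalue"): bodies stay untouched."""
    return [Event(e.body, {**e.attrs, key: value}) for e in events]


def filter_out(events, needle):
    """Stand-in for the thread's filterOutEvents: drop events containing needle."""
    return [e for e in events if needle not in e.body]


sent = [Event(b"GET /index.html"), Event(b"GET /healthcheck"), Event(b"POST /login")]
agent_sum = batch_checksum(sent)  # what the agent side would compute

# Case 1: adding attributes between agent and collector leaves the sum intact.
attrs_ok = batch_checksum(add_attr(sent, "newattr", "newvalue")) == agent_sum

# Case 2: filtering before the agent checksums, so both sides see the same events.
kept = filter_out(sent, b"healthcheck")
prefilter_ok = batch_checksum(kept) == batch_checksum(list(kept))

# Case 3: the collector validates first, then filters what it writes out.
validated = batch_checksum(sent) == agent_sum
written = filter_out(sent, b"healthcheck") if validated else []

# Case 4: filtering between agent and collector checksums breaks the match.
filtered_ok = batch_checksum(filter_out(sent, b"healthcheck")) == agent_sum

print(attrs_ok, prefilter_ok, len(written), filtered_ok)
```

In this model only the fourth case fails, because an event the agent counted never reaches the collector-side sum. That matches the reasoning for making the decorator add attributes instead of dropping or replacing events.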