Hi Mike,

Thanks for the assistance. Yes, your advice is along the lines of what I've been thinking. However, I'll probably do everything within a custom source and not use interceptors.
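For concreteness, here is a rough sketch of the splitting step described in this thread: take one payload holding a JSON array, emit one small event per array element, and put a header on each event for a channel selector to route on. It is an illustration only, not the actual source. The function name `split_batch`, the `type` field, and the `route` header are all hypothetical, and a real Flume NG source would be written in Java rather than Python.

```python
import json


def split_batch(payload, route_key="type", default_route="default"):
    """Split one JSON-array payload into one event per array element.

    Each event is a plain dict with "headers" and "body", mirroring the
    shape of a Flume event. The field and header names are assumptions.
    """
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array at the top level")

    events = []
    for record in records:
        # The header value is what a multiplexing selector would switch on.
        route = default_route
        if isinstance(record, dict):
            route = str(record.get(route_key, default_route))
        events.append({
            "headers": {"route": route},
            # Re-serialize the single object as the body of the small event.
            "body": json.dumps(record).encode("utf-8"),
        })
    return events
```

In a custom source, the resulting list would be handed to the channel processor as a batch, with the multiplexing channel selector configured to map values of the `route` header to channels.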
Regards,
Harish

On Wed, Oct 3, 2012 at 4:15 PM, Mike Percy <[email protected]> wrote:

> Harish,
> It sounds like a deserialization problem in a custom Source. I would
> recommend doing that deserialization in the Source.
>
> If you need to do inspection and tagging for routing purposes, that sounds
> like a good fit for either an Interceptor and/or the multiplexing channel
> selector.
>
> Does that sound like something that would work for your case?
>
> Regards,
> Mike
>
> On Wed, Oct 3, 2012 at 12:53 PM, Harish Mandala <[email protected]> wrote:
>
> > Hi Mike,
> >
> > Sure. Here's my use case:
> >
> > I receive over an HTTP port large log files containing an array of a
> > certain object, serialized as JSON. I need to deserialize each log file
> > into its constituent array objects. Each object may be routed to a
> > different location in HDFS. Also, I need to place various parts of each of
> > those objects in different locations in HDFS. The solution I thought of
> > was to break each event (whose data would be a large JSON log file) into
> > many smaller events (which would contain an object or object component),
> > put certain headers on them, and route them to the right destination in
> > HDFS using a channel selector.
> >
> > Thanks,
> > Harish
> >
> > On Wed, Oct 3, 2012 at 2:10 PM, Mike Percy <[email protected]> wrote:
> >
> > > Hi Harish,
> > > Why do you want to do that? Can you describe your use case?
> > >
> > > Regards,
> > > Mike
> > >
> > > On Tue, Oct 2, 2012 at 1:28 PM, Harish Mandala <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > Alright, so maybe interceptors were not exactly what I wanted.
> > > >
> > > > It seems the number of events going into an interceptor must equal the
> > > > number coming out. However, what if I need to take out the data from a
> > > > certain event, and create multiple events from subsets of the data which
> > > > would then be multiplexed using the selector to different locations?
> > > > Would the job of splitting one event into many best be done in a Source or
> > > > Sink?
> > > >
> > > > I was contemplating modifying the AvroSource or AvroSink for my purposes.
> > > > However, it seems the sink also tallies output event counts and input event
> > > > counts, and makes sure they're the same. That leaves me the option of
> > > > writing a custom source based off the AvroSource. Is my thinking correct?
> > > >
> > > > Thanks,
> > > > Harish
> > > >
> > > > On Mon, Oct 1, 2012 at 6:45 PM, Harish Mandala <[email protected]> wrote:
> > > >
> > > > > Hi Percy,
> > > > >
> > > > > Thanks! Interceptors seem good enough.
> > > > >
> > > > > Regards,
> > > > > Harish
> > > > >
> > > > > On Mon, Oct 1, 2012 at 6:32 PM, Mike Percy <[email protected]> wrote:
> > > > >
> > > > >> Hi Harish,
> > > > >> At this time Flume NG doesn't support unbatching or sink-side plugins.
> > > > >> Interceptors provide source-side tagging, filtering, and transformation
> > > > >> capability, however.
> > > > >>
> > > > >> Regards,
> > > > >> Mike
> > > > >>
> > > > >> On Mon, Oct 1, 2012 at 3:23 PM, Harish Mandala <[email protected]> wrote:
> > > > >>
> > > > >> > Hello,
> > > > >> >
> > > > >> > Am I right in thinking Flume NG no longer has the concept of Sink
> > > > >> > Decorators? I wanted to do some custom deserialization on incoming event
> > > > >> > data, and split one event into several (De-batching and re-routing).
> > > > >> > What's the best way to implement this in Flume NG?
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Harish
