Harish,
It sounds like a deserialization problem in a custom Source. I would recommend doing that deserialization in the Source.
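As a rough illustration of the splitting step, here is a minimal sketch that uses only the JDK. It assumes the incoming payload is a single JSON array, and the class and method names (JsonArraySplitter, splitTopLevel) are hypothetical, not part of Flume. In a custom Source, each returned string would presumably become the body of its own event (for example via EventBuilder.withBody), with routing headers attached, before the batch is handed to the channel processor. A real implementation would more likely use a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flume class: splits the text of one JSON array
// into the raw text of its top-level elements.
final class JsonArraySplitter {

    static List<String> splitTopLevel(String json) {
        List<String> elements = new ArrayList<>();
        int depth = 0;            // nesting depth of [] and {}
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string
        int start = -1;           // index where the current element begins

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Commas and brackets inside strings must not split anything.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '[' || c == '{') {
                depth++;
                if (depth == 1) {
                    start = i + 1; // just past the outer array's opening bracket
                }
            } else if (c == ']' || c == '}') {
                depth--;
                if (depth == 0) {
                    addIfNotBlank(elements, json.substring(start, i)); // last element
                }
            } else if (c == ',' && depth == 1) {
                addIfNotBlank(elements, json.substring(start, i));
                start = i + 1;
            }
        }
        if (depth != 0 || inString) {
            throw new IllegalArgumentException("Unbalanced JSON input");
        }
        return elements;
    }

    private static void addIfNotBlank(List<String> out, String raw) {
        String trimmed = raw.trim();
        if (!trimmed.isEmpty()) {
            out.add(trimmed);
        }
    }
}
```

Doing the split in the Source means everything downstream (interceptors, channel selector, sinks) sees one event per object, so the event counts stay consistent from that point on.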
If you need to do inspection and tagging for routing purposes, that sounds like a good fit for an Interceptor and/or the multiplexing channel selector. Does that sound like something that would work for your case?

Regards,
Mike

On Wed, Oct 3, 2012 at 12:53 PM, Harish Mandala <[email protected]> wrote:

> Hi Mike,
>
> Sure. Here's my use case:
>
> I receive over an HTTP port large log files containing an array of a
> certain object, serialized as JSON. I need to deserialize each log file
> into its constituent array objects. Each object may be routed to a
> different location in HDFS. Also, I need to place various parts of each of
> those objects in different locations in HDFS. The solution I thought of
> was to break each event (whose data would be a large JSON log file) into
> many smaller events (which would contain an object or object component),
> put certain headers on them, and route them to the right destination in
> HDFS using a channel selector.
>
> Thanks,
> Harish
>
> On Wed, Oct 3, 2012 at 2:10 PM, Mike Percy <[email protected]> wrote:
>
> > Hi Harish,
> > Why do you want to do that? Can you describe your use case?
> >
> > Regards,
> > Mike
> >
> > On Tue, Oct 2, 2012 at 1:28 PM, Harish Mandala <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > Alright, so maybe interceptors were not exactly what I wanted.
> > >
> > > It seems the number of events going into an interceptor must equal the
> > > number coming out. However, what if I need to take out the data from a
> > > certain event, and create multiple events from subsets of the data which
> > > would then be multiplexed using the selector to different locations.
> > > Would the job of splitting one event into many best be done in a Source
> > > or Sink?
> > >
> > > I was contemplating modifying the AvroSource or AvroSink for my purposes.
> > > However, it seems the sink also tallies output event counts and input
> > > event counts, and makes sure they're the same. That leaves me the option
> > > of writing a custom source based off the AvroSource. Is my thinking
> > > correct?
> > >
> > > Thanks,
> > > Harish
> > >
> > > On Mon, Oct 1, 2012 at 6:45 PM, Harish Mandala <[email protected]> wrote:
> > >
> > > > Hi Percy,
> > > >
> > > > Thanks! Interceptors seem good enough.
> > > >
> > > > Regards,
> > > > Harish
> > > >
> > > > On Mon, Oct 1, 2012 at 6:32 PM, Mike Percy <[email protected]> wrote:
> > > >
> > > >> Hi Harish,
> > > >> At this time Flume NG doesn't support unbatching or sink-side plugins.
> > > >> Interceptors provide source-side tagging, filtering, and transformation
> > > >> capability, however.
> > > >>
> > > >> Regards,
> > > >> Mike
> > > >>
> > > >> On Mon, Oct 1, 2012 at 3:23 PM, Harish Mandala <[email protected]> wrote:
> > > >>
> > > >> > Hello,
> > > >> >
> > > >> > Am I right in thinking Flume NG no longer has the concept of Sink
> > > >> > Decorators? I wanted to do some custom deserialization on incoming
> > > >> > event data, and split one event into several (De-batching and
> > > >> > re-routing). What's the best way to implement this in Flume NG?
> > > >> >
> > > >> > Thanks,
> > > >> > Harish
