Hi Mike,

Thanks for the assistance. Yes, your advice is along the lines of what I've
been thinking. However, I'll probably do everything within a custom source
and not use interceptors.
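
For reference, here is a minimal sketch of the splitting step discussed in
this thread. It is plain Python rather than Flume code and only illustrates
the logic: one payload holding a JSON array becomes one small event per array
element, each tagged with a header that a selector could route on. The
function name `split_batch`, the record field `type`, and the header key
`route` are hypothetical names chosen for this example, not anything defined
by Flume.

```python
import json


def split_batch(body, route_field="type", header_key="route"):
    """Split one JSON-array payload into one small event per array element.

    route_field and header_key are illustrative assumptions, not Flume names.
    """
    records = json.loads(body)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array at the top level")

    events = []
    for record in records:
        # Tag each event so that a header-based selector could route it.
        route = "unknown"
        if isinstance(record, dict):
            route = str(record.get(route_field, "unknown"))
        events.append({
            "headers": {header_key: route},
            "body": json.dumps(record).encode("utf-8"),
        })
    return events


# Example: a batch of two objects becomes two separately tagged events.
batch = b'[{"type": "click", "id": 1}, {"type": "view", "id": 2}]'
events = split_batch(batch)
```

In a custom source the same loop would presumably run where the payload is
received, before the resulting events are handed to the channel.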

Regards,
Harish

On Wed, Oct 3, 2012 at 4:15 PM, Mike Percy <[email protected]> wrote:

> Harish,
> It sounds like a deserialization problem in a custom Source. I would
> recommend doing that deserialization in the Source.
>
> If you need to do inspection and tagging for routing purposes, that sounds
> like a good fit for an Interceptor and/or the multiplexing channel
> selector.
>
> Does that sound like something that would work for your case?
>
> Regards,
> Mike
>
> On Wed, Oct 3, 2012 at 12:53 PM, Harish Mandala <[email protected]> wrote:
>
> > Hi Mike,
> >
> > Sure. Here's my use case:
> >
> > I receive over an HTTP port large log files containing an array of a
> > certain object, serialized as JSON. I need to deserialize each log file
> > into its constituent array objects. Each object may be routed to a
> > different location in HDFS. Also, I need to place various parts of each of
> > those objects in different locations in HDFS. The solution I thought of
> > was to break each event (whose data would be a large JSON log file) into
> > many smaller events (which would contain an object or object component),
> > put certain headers on them, and route them to the right destination in
> > HDFS using a channel selector.
> >
> > Thanks,
> > Harish
> >
> > On Wed, Oct 3, 2012 at 2:10 PM, Mike Percy <[email protected]> wrote:
> >
> > > Hi Harish,
> > > Why do you want to do that? Can you describe your use case?
> > >
> > > Regards,
> > > Mike
> > >
> > > On Tue, Oct 2, 2012 at 1:28 PM, Harish Mandala <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > Alright, so maybe interceptors were not exactly what I wanted.
> > > >
> > > > It seems the number of events going into an interceptor must equal the
> > > > number coming out. However, what if I need to take the data out of a
> > > > certain event and create multiple events from subsets of that data,
> > > > which would then be multiplexed by the selector to different locations?
> > > > Would the job of splitting one event into many best be done in a Source
> > > > or a Sink?
> > > >
> > > > I was contemplating modifying the AvroSource or AvroSink for my purposes.
> > > > However, it seems the sink also tallies output event counts and input
> > > > event counts, and makes sure they're the same. That leaves me the option
> > > > of writing a custom source based off the AvroSource. Is my thinking
> > > > correct?
> > > >
> > > > Thanks,
> > > > Harish
> > > >
> > > > On Mon, Oct 1, 2012 at 6:45 PM, Harish Mandala <[email protected]> wrote:
> > > >
> > > > > Hi Percy,
> > > > >
> > > > > Thanks! Interceptors seem good enough.
> > > > >
> > > > > Regards,
> > > > > Harish
> > > > >
> > > > >
> > > > > On Mon, Oct 1, 2012 at 6:32 PM, Mike Percy <[email protected]> wrote:
> > > > >
> > > > >> Hi Harish,
> > > > >> At this time Flume NG doesn't support unbatching or sink-side plugins.
> > > > >> Interceptors provide source-side tagging, filtering, and transformation
> > > > >> capability, however.
> > > > >>
> > > > >> Regards,
> > > > >> Mike
> > > > >>
> > > > >>
> > > > >> On Mon, Oct 1, 2012 at 3:23 PM, Harish Mandala <[email protected]> wrote:
> > > > >>
> > > > >> > Hello,
> > > > >> >
> > > > >> > Am I right in thinking Flume NG no longer has the concept of Sink
> > > > >> > Decorators? I wanted to do some custom deserialization on incoming
> > > > >> > event data, and split one event into several (de-batching and
> > > > >> > re-routing). What's the best way to implement this in Flume NG?
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Harish
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
