Hey Jeremy,

That comment has been in the code for some time now, but I don't think it is actually enforced anywhere programmatically. I think the idea was just that if you are writing something capable of generating new event data, it should be in a source, though I'm also curious to hear why this was put in there.
IMHO, doing some type of event splitting seems within the scope of how interceptors are used.

- Patrick

On Fri, Aug 10, 2012 at 11:07 AM, Jeremy Custenborder <[email protected]> wrote:
> Hello All,
>
> I'm wondering if you could provide some guidance for me. One of the
> inputs I'm working with batches several entries to a single event.
> This is a lot simpler than my data but it provides an easy example.
> For example:
>
> timestamp - 5,4,3,2,1
> timestamp - 9,7,5,5,6
>
> If I tail the file this results in 2 events being generated. This
> example has the data for 10 events.
>
> Here is high level what I want to accomplish.
> (web server - agent 1)
> exec source tail -f /<some file path>
> collector-client to (agent 2)
>
> (collector - agent 2)
> collector-server
> Custom Interceptor (input 1 event, output n events)
> Multiplex to
> hdfs
> hbase
>
> An interceptor looked like the most logical spot for me to add this.
> Is there a better place to add this functionality? Has anyone run into
> a similar case?
>
> Looking at the docs for Interceptor.intercept(List<Event> events) it
> says "Output list of events. The size of output list MUST NOT BE
> GREATER than the size of the input list (i.e. transformation and
> removal ONLY)." which tells me not to emit more events than given.
> intercept(Event event) only returns a single event so I can't use it
> there either. Why is there a requirement to only return 1 for 1?
>
> For now I'm implementing a custom source that will handle generating
> multiple events from the events coming in on the web server. My
> preference was to do this transformation on the collector agent before
> I hand off to hdfs and hbase. I know another alternative would be to
> implement custom RPC but I would prefer not to do that. I would prefer
> to rely on what is currently available.
>
> Thanks!
> j
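For reference, here is a minimal sketch of the one-to-many split being discussed, in plain Java with no Flume dependency. The class name EventSplitter and the method split are hypothetical, and the "timestamp - v1,v2,..." layout is only the simplified example from the quoted message, not the real data format. It shows the splitting logic alone; whether it ends up in an interceptor or a custom source is the open question in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Flume. It only illustrates turning one
// batched record body into several single-value record bodies.
public class EventSplitter {

    // Assumed layout (from the simplified example): "<timestamp> - v1,v2,...".
    private static final String SEPARATOR = " - ";

    /** Returns one "<timestamp> - v" string per comma-separated value. */
    public static List<String> split(String body) {
        List<String> out = new ArrayList<>();
        int sep = body.indexOf(SEPARATOR);
        if (sep < 0) {
            // No separator found: pass the body through unchanged.
            out.add(body);
            return out;
        }
        String prefix = body.substring(0, sep).trim();
        String payload = body.substring(sep + SEPARATOR.length());
        for (String value : payload.split(",")) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                // Empty values are skipped, so an empty payload yields no records.
                out.add(prefix + SEPARATOR + trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The two batched lines from the example become ten records.
        for (String line : new String[] {"timestamp - 5,4,3,2,1", "timestamp - 9,7,5,5,6"}) {
            for (String record : split(line)) {
                System.out.println(record);
            }
        }
    }
}
```

In a real agent each returned string would presumably become the body of its own event, with the headers copied from the original event; that wiring is left out here because it depends on where the split is allowed to live.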
