Paul,

Thanks for the feedback. I looked briefly at Morphline, but wasn't sure if it was what I needed. I will take a deeper dive this week and see if it will do what we want.

Ultimately, the reason we're not changing the apps is that we don't always have much control over them. Many of them are third-party apps where we can barely adjust the log line format.

Matt Wise
Sr. Systems Architect
Nextdoor.com

On Mon, Nov 11, 2013 at 3:09 PM, Paul Chavez <pcha...@verticalsearchworks.com> wrote:

> I think there may be two ‘out of box’ ways to do this kind of thing. First
> would be using the regex extract interceptor with multiple serializers
> keying on various fields. However that’s not really dynamic and just kind
> of a half-step better from one interceptor for each field as you mentioned.
> Second would be to use the morphline interceptor to parse your event body
> and insert headers as needed. I have to admit I have no experience with
> this interceptor but in reading the documentation it seems designed for
> this kind of use case.
>
> Ultimately though, when faced with this we opted to push this into the app
> layer. Is there a reason the applications can’t write these key/value pairs
> as headers in the first place? We use an HTTP source and when we wrote the
> logging class for it on our app side we put similar functionality in as
> category/subcategory headers. Then flume doesn’t have to have any special
> interceptors beyond a default static one in case the headers are completely
> missing, and we write to HDFS with tokenized paths so each permutation of
> those headers gets a separate directory.
>
> If you continue to explore this issue, please keep us updated. I
> especially would like to hear some real world morphline examples.
>
> Hope that helps,
> Paul Chavez
>
> *From:* Matt Wise [mailto:m...@nextdoor.com]
> *Sent:* Monday, November 11, 2013 10:04 AM
> *To:* user@flume.apache.org
> *Subject:* Re: Dynamic Key=Value Parsing with an Interceptor?
>
> Anyone have any ideas on the best way to do this?
>
> Matt Wise
> Sr. Systems Architect
> Nextdoor.com
>
> On Sat, Nov 9, 2013 at 5:28 PM, Matt Wise <m...@nextdoor.com> wrote:
>
> Hey we'd like to set up a default format for all of our logging systems...
> perhaps looking like this:
>
> "key1=value1;key2=value2;key3=value3...."
>
> With this pattern, we'd allow developers to define any key/value pairs
> they want to log, and separate them with a common separator.
>
> If we did this, what do we need to do in Flume to get Flume to parse out
> the key=value pairs into dynamic headers? We pass our data from Flume into
> both HDFS and ElasticSearch sinks. We would really like to have these
> fields dynamically sent to the sinks for much easier parsing and analysis
> later.
>
> Any thoughts on this? I know that we can define a unique interceptor for
> each service that looks for explicit field names ... but that's a nightmare
> to manage. I really want something truly dynamic.
>
> Matt Wise
> Sr. Systems Architect
> Nextdoor.com
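
For reference, the parsing step behind the "key1=value1;key2=value2" format discussed in this thread is small. The following is only a minimal sketch in Python of that one step, not a Flume interceptor: the function name `parse_kv_headers` and its parameters are hypothetical, and it assumes ";" separates pairs, "=" separates key from value, and neither keys nor values contain ";". A real Flume interceptor would be written in Java against Flume's `Interceptor` interface and would copy the resulting map into the event headers.

```python
def parse_kv_headers(body, pair_sep=";", kv_sep="="):
    """Split a 'key1=value1;key2=value2' log body into a dict of header fields.

    Hypothetical helper for illustration only; the separators are assumptions
    taken from the example format in the thread.
    """
    headers = {}
    for pair in body.split(pair_sep):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing or doubled separators
        # Split on the first kv_sep only, so values may themselves contain "=".
        key, sep, value = pair.partition(kv_sep)
        key = key.strip()
        if not sep or not key:
            continue  # skip fragments that are not key=value
        headers[key] = value.strip()  # later duplicates overwrite earlier ones
    return headers


if __name__ == "__main__":
    print(parse_kv_headers("key1=value1;key2=value2;key3=value3"))
    # prints {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
```

Malformed fragments are skipped rather than raising an error, on the assumption that a logging pipeline should not drop a whole event because one field is badly formed. Whether that is the right trade-off depends on how strictly the log format is enforced upstream.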