I wrote a sink decorator that applies a regular expression to an attribute (which I call the srcAttribute) and stores the contents of the capture group in another attribute (which I call the dstAttribute).
The usage will look something like:

regex_attribute(srcAttribute, dstAttribute, pattern)

On the Flume user mailing list, I was encouraged to contribute this decorator. Can someone create the JIRA ticket for me and add me as a contributor?

Below is my original message to the Flume user mailing list, in which I describe the use case for the sink decorator:

I want to do output bucketing based on the tailSrcFile metadata value set by the tailDir source. However, I only want part of the value for the destination path in HDFS. For example, I have an event with the tailSrcFile value "unwanted_prefix_category_name-2011-07-25.log" but only want to use "category_name" for output bucketing. What is the easiest way to do this?
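To make the intended behavior concrete, here is a minimal sketch of the extraction step in plain Java. It is not the decorator itself and does not use any Flume classes: the class name RegexAttributeSketch, the method apply, the destination attribute name "category", and the use of a Map<String, String> for event attributes are all assumptions made for illustration, and Flume's actual attribute types may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume: models event attributes as a
// simple String-to-String map to illustrate the regex extraction only.
final class RegexAttributeSketch {
    private RegexAttributeSketch() {}

    // Returns a copy of attrs. If srcAttribute is present and the pattern
    // matches its value, the first capture group is stored under dstAttribute.
    // Otherwise the copy is returned unchanged.
    static Map<String, String> apply(Map<String, String> attrs,
                                     String srcAttribute,
                                     String dstAttribute,
                                     Pattern pattern) {
        Map<String, String> out = new HashMap<>(attrs);
        String value = attrs.get(srcAttribute);
        if (value == null) {
            return out; // nothing to extract from
        }
        Matcher m = pattern.matcher(value);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            out.put(dstAttribute, m.group(1));
        }
        return out;
    }
}
```

For the example above, one pattern that could work is "^unwanted_prefix_(.+)-\\d{4}-\\d{2}-\\d{2}\\.log$" (written as a Java string literal). Applied to the tailSrcFile value "unwanted_prefix_category_name-2011-07-25.log", the capture group holds "category_name", which would then be available under the destination attribute for output bucketing.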
