I wrote a sink decorator that applies a regex to an attribute (which I call
the srcAttribute) and places the contents of the capture group in another
attribute (which I call the dstAttribute).

The usage will look something like:
regex_attribute(srcAttribute, dstAttribute, pattern)
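
To make the behaviour concrete, here is a minimal standalone sketch of the
idea. It is not the decorator's actual code and does not use the Flume API:
a plain Map stands in for the event's attributes, and the class name, the
method name, the "category" attribute name and the pattern are all made up
for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexAttributeSketch {

    // Applies `pattern` to the value of `src` and, on a match, stores the
    // contents of capture group 1 under `dst`. Leaves `attrs` unchanged if
    // the source attribute is missing or the pattern does not match.
    static void regexAttribute(Map<String, String> attrs, String src,
                               String dst, Pattern pattern) {
        String value = attrs.get(src);
        if (value == null) {
            return;
        }
        Matcher m = pattern.matcher(value);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            attrs.put(dst, m.group(1));
        }
    }

    public static void main(String[] args) {
        // Stand-in for an event's attributes (assumption: string values).
        Map<String, String> attrs = new HashMap<>();
        attrs.put("tailSrcFile", "unwanted_prefix_category_name-2011-07-25.log");

        // Hypothetical pattern: drop the prefix and the trailing date.
        Pattern p = Pattern.compile("^unwanted_prefix_(.+)-\\d{4}-\\d{2}-\\d{2}\\.log$");

        regexAttribute(attrs, "tailSrcFile", "category", p);
        System.out.println(attrs.get("category")); // prints "category_name"
    }
}
```

The dstAttribute could then be referenced in the output path for bucketing.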

On the Flume user mailing list, I was encouraged to contribute this
decorator. Can someone create the JIRA ticket for me and add me as a
contributor?

The text below is my original message to the Flume user mailing list, in
which I describe the use case for the sink decorator:

I want to do output bucketing based on the tailSrcFile metadata value
set by the tailDir source. However, I only want part of the value for
the destination path in HDFS.

For example, I have an event with the tailSrcFile value
"unwanted_prefix_category_name-2011-07-25.log" but only want to
use "category_name" for output bucketing.

What is the easiest way to do this?
