Brian,

Let me suggest a simple amendment to the usage: I think most sinks/decos currently use a camelCaps convention. Also, maybe change the order so it is more similar to regexAll? (This would cleanly allow an eventual regexAttributeAll...)
regexAttribute(srcAttribute, pattern, dstAttribute)

Sound good?

Jon.

On Wed, Aug 10, 2011 at 10:50 AM, Brian Tran <[email protected]> wrote:
> I wrote a sink decorator that does regex on an attribute (which I call the
> srcAttribute) and places the contents of the capture group within another
> attribute (which I call the dstAttribute).
>
> The usage will look something like:
> regex_attribute(srcAttribute, dstAttribute, pattern)
>
> On the Flume user mailing list, I was encouraged to contribute this
> decorator. Can someone create the JIRA ticket for me and add me as a
> contributor?
>
> Text below is my original message to the Flume user mailing list in which I
> am describing the use case for the sink decorator:
>
> I want to do output bucketing based on the tailSrcFile metadata value
> set by the tailDir source. However, I only want part of the value for
> the destination path in HDFS.
>
> For example, I have an event with the tailSrcFile value
> "unwanted_prefix_category_name-2011-07-25.log" but only want to
> use "category_name" for output bucketing.
>
> What is the easiest way to do this?
>

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [email protected]
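
For illustration only, the capture-group extraction described in the quoted message could be sketched in plain Java roughly as below. This is not the decorator's actual code and it does not use the Flume API: the class name RegexAttributeSketch, the use of a String map for event attributes, the destination attribute name "category", and the pattern itself are all assumptions made up for the example. The argument order follows the proposed regexAttribute(srcAttribute, pattern, dstAttribute).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the decorator logic; not part of Flume.
class RegexAttributeSketch {

    // Applies the pattern to the value of srcAttribute and, if it matches,
    // stores the first capture group under dstAttribute.
    // Assumption: attributes are modeled as a plain String-to-String map.
    static void regexAttribute(Map<String, String> attrs,
                               String srcAttribute,
                               String pattern,
                               String dstAttribute) {
        String value = attrs.get(srcAttribute);
        if (value == null) {
            return; // source attribute missing: leave the attributes untouched
        }
        Matcher m = Pattern.compile(pattern).matcher(value);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            attrs.put(dstAttribute, m.group(1));
        }
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("tailSrcFile", "unwanted_prefix_category_name-2011-07-25.log");

        // Assumed pattern: strip the fixed prefix and the trailing date + ".log".
        regexAttribute(attrs, "tailSrcFile",
                "^unwanted_prefix_(.+)-\\d{4}-\\d{2}-\\d{2}\\.log$",
                "category");

        System.out.println(attrs.get("category")); // prints "category_name"
    }
}
```

With the example value from the quoted message, the destination attribute ends up holding "category_name", which is the part wanted for output bucketing. If the pattern does not match, the sketch simply leaves the destination attribute unset; whether the real decorator should behave that way is an open design choice.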
