@Jon: The suggestions sound good. I'll incorporate them. For feedback on the "How to Contribute" wiki page, I had no idea that I needed to create a JIRA account as a first step. The instructions led me to believe that someone else would create the JIRA ticket for me after I sent a message to the mailing list. Tha
@NerdyNick: Thanks for letting me know that I need to set up my own JIRA account.

On Thu, Aug 11, 2011 at 3:23 AM, Jonathan Hsieh <[email protected]> wrote:
> Brian,
>
> Let me suggest a simple amendment to the usage -- I think most sinks/decos
> currently use a camelCaps convention. Also, maybe change the order so it
> is more similar to regexAll? (This would cleanly allow an eventual
> regexAttributeAll...)
>
> regexAttribute(srcAttribute, pattern, dstAttribute)
>
> Sound good?
> Jon.
>
> On Wed, Aug 10, 2011 at 10:50 AM, Brian Tran <[email protected]> wrote:
>
> > I wrote a sink decorator that does regex on an attribute (which I call
> > the srcAttribute) and places the contents of the capture group within
> > another attribute (which I call the dstAttribute).
> >
> > The usage will look something like:
> > regex_attribute(srcAttribute, dstAttribute, pattern)
> >
> > On the Flume user mailing list, I was encouraged to contribute this
> > decorator. Can someone create the JIRA ticket for me and add me as a
> > contributor?
> >
> > Text below is my original message to the Flume user mailing list in
> > which I am describing the use case for the sink decorator:
> >
> > I want to do output bucketing based on the tailSrcFile metadata value
> > set by the tailDir source. However, I only want part of the value for
> > the destination path in HDFS.
> >
> > For example, I have an event with the tailSrcFile value
> > "unwanted_prefix_category_name-2011-07-25.log" but only want to
> > use "category_name" for output bucketing.
> >
> > What is the easiest way to do this?
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [email protected]
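
For illustration, the behavior described in the thread (match a regex against srcAttribute and store the capture group in dstAttribute) can be sketched in Python. This is not the Flume decorator itself: the function name regex_attribute, the dict used to stand in for event attributes, the "category" destination name, and the example pattern are all assumptions made for this sketch. The argument order follows the one Jon suggested.

```python
import re


def regex_attribute(attributes, src_attribute, pattern, dst_attribute):
    """Return a copy of `attributes` with the first capture group of `pattern`,
    matched against attributes[src_attribute], stored under `dst_attribute`.

    Hypothetical helper: a plain dict stands in for a Flume event's attributes.
    If the source attribute is missing or the pattern does not match, the copy
    is returned without a destination attribute.
    """
    result = dict(attributes)
    value = result.get(src_attribute)
    if value is None:
        return result
    match = re.search(pattern, value)
    if match and match.groups():
        result[dst_attribute] = match.group(1)
    return result


# Example value taken from the thread; the pattern is an assumed one that
# strips the unwanted prefix and the trailing date.
event = {"tailSrcFile": "unwanted_prefix_category_name-2011-07-25.log"}
pattern = r"unwanted_prefix_(.+)-\d{4}-\d{2}-\d{2}\.log"
bucketed = regex_attribute(event, "tailSrcFile", pattern, "category")
print(bucketed["category"])
```

The extracted value would then be what the output path refers to for bucketing; how that is wired into a Flume sink is left out here because the thread does not show it.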
