Flumeeeees:

Flume has evolved over the last few years and has come a long way. I think, to hit the next bar of reliability, maintainability, and adoption, some of the core bits need some refactoring / design retrofit. To this end, I've started a "revolutionary branch[1]." I've listed some of my rationale as to why I think this is a good thing in the JIRA, but I'm happy to go into detail here.

My main motivation for this comes from working on Flume and supporting it in my day job at Cloudera. That said, I do this as an individual, and with my ASF hat firmly in place.

My (short) rationale:

* I think the code base is too complex and that this is a barrier to greater developer adoption. The internals shouldn't be scary.
* Some of the invariants of Flume have changed over time, and the remnants gum up the works. For instance, there was a time when it was assumed there wouldn't be multiple logical nodes per physical node; the complexities of the threading came later.
* A few advertised features do not work as we'd expect / like. I want to make it simpler to add these features.
* A number of recent bugs have exposed some evolutionary implementation that could use refactoring.
* Flume does too much. It should do a smaller number of things (that people really need / use) and do them exceedingly well. It's become clear that some features are more important to people than others.

The details:

* The branch is at http://svn.apache.org/viewvc/incubator/flume/branches/flume-728/
* There is already a (significantly smaller) core of Flume and a skeletal Flume node.
* The wiki page tracking my notes and the "project" is at https://cwiki.apache.org/confluence/display/FLUME/Flume+NG
* The parent JIRA tracking the project is at https://issues.apache.org/jira/browse/FLUME-728

The process / intent:

* I intend to move extremely fast on the flume-728 branch and then request a series of strict reviews and call for a vote to merge to trunk. I'm happy to take reviews in the interim.
* I'd love folks to get involved and have this become a group effort. The reason I started was to have a baseline to speak from and to show (1) that I'm serious (via code) and (2) what I think an implementation could look like.
* I fully understand the community / PPMC may -1 the merge (but that would make me sad, so why would you do that?).

I also immediately regretted using the "NG" designation; it's presumptuous and I apologize. Going forward, I'll refer to it as flume-728.

Excited to hear feedback or questions. Thanks.

[1] jmhsieh pointed out an email from Long Ago(tm) that described this situation well. I'm following that approach, in spirit.
http://incubator.apache.org/learn/rules-for-revolutionaries.html

--
Eric Sammer
twitter: esammer
data: www.cloudera.com
