I'm only +1 for pulling in storm-kafka and updating it. Other projects put these contrib modules in a "contrib" folder and keep them managed as completely separate codebases. As it's not actually a "module" necessary for Storm, there's an argument there for doing it that way rather than via the multi-module route.

On Tue, Feb 25, 2014 at 4:39 PM, Milinda Pathirage <[email protected]> wrote:

> Hi Taylor,
>
> I'm +1 for pulling these external libraries into the Apache codebase. This
> will certainly benefit the Storm community. I would also like to contribute
> to this process.
>
> Thanks
> Milinda
>
> On Tue, Feb 25, 2014 at 5:28 PM, P. Taylor Goetz <[email protected]> wrote:
> > A while back I opened STORM-206 [1] to capture ideas for pulling in
> > "contrib" modules to the Apache codebase.
> >
> > In the past, we had the storm-contrib github project [2] which
> > subsequently got broken up into individual projects hosted on the
> > stormprocessor github group [3] and elsewhere.
> >
> > The problem with this approach is that in certain cases it led to code
> > rot (modules not being updated in step with Storm's API), fragmentation
> > (multiple similar modules with the same name), and confusion.
> >
> > A good example of this is the storm-kafka module [4], since it is a
> > widely used component. Because storm-contrib wasn't being tagged in
> > github, a lot of users had trouble reconciling with which versions of
> > storm it was compatible. Some users built off specific commit hashes,
> > some forked, and a few even pushed custom builds to repositories such as
> > clojars. With kafka 0.8 now available, there are two main storm-kafka
> > projects, the original (compatible with kafka 0.7) and an updated fork
> > [5] (compatible with kafka 0.8).
> >
> > My intention is not to find fault in any way, but rather to point out the
> > resulting pain, and work toward a better solution.
> >
> > I think it would be beneficial to the Storm user community to have
> > certain commonly used modules like storm-kafka brought into the Apache
> > Storm project. Another benefit worth considering is the licensing/legal
> > oversight that the ASF provides, which is important to many users.
> >
> > If this is something we want to do, then the big question becomes what
> > sort of governance process needs to be established to ensure that such
> > things are properly maintained.
> >
> > Some random thoughts, questions, etc. that jump to mind include:
> >
> > What to call these things: "contrib modules", "connectors", "integration
> > modules", etc.?
> >
> > Build integration: I imagine they would be a multi-module submodule of
> > the main maven build. Probably turned off by default and enabled by a
> > maven profile.
> >
> > Governance: Have one or more committer volunteers responsible for
> > maintenance, merging patches, etc.? Proposal process for pulling new
> > modules?
> >
> > I look forward to hearing others' opinions.
> >
> > - Taylor
> >
> > [1] https://issues.apache.org/jira/browse/STORM-206
> > [2] https://github.com/nathanmarz/storm-contrib
> > [3] https://github.com/stormprocessor
> > [4] https://github.com/nathanmarz/storm-contrib/tree/master/storm-kafka
> > [5] https://github.com/wurstmeister/storm-kafka-0.8-plus
>
> --
> Milinda Pathirage
>
> PhD Student | Research Assistant
> School of Informatics and Computing | Data to Insight Center
> Indiana University
>
> twitter: milindalakmal
> skype: milinda.pathirage
> blog: http://milinda.pathirage.org

--
Twitter: @nathanmarz
http://nathanmarz.com
