A while back I opened STORM-206 [1] to capture ideas for pulling “contrib” 
modules into the Apache codebase.

In the past, we had the storm-contrib GitHub project [2], which was 
subsequently broken up into individual projects hosted on the stormprocessor 
GitHub group [3] and elsewhere.

The problem with this approach is that in certain cases it led to code rot 
(modules not being updated in step with Storm’s API), fragmentation (multiple 
similar modules with the same name), and confusion.

A good example of this is the storm-kafka module [4], since it is a widely used 
component. Because storm-contrib wasn’t being tagged in GitHub, a lot of users 
had trouble working out which versions of Storm it was compatible with. Some 
users built off specific commit hashes, some forked, and a few even pushed 
custom builds to repositories such as Clojars. With Kafka 0.8 now available, 
there are two main storm-kafka projects: the original (compatible with Kafka 
0.7) and an updated fork [5] (compatible with Kafka 0.8).

My intention is not to find fault in any way, but rather to point out the 
resulting pain, and work toward a better solution.

I think it would be beneficial to the Storm user community to have certain 
commonly used modules like storm-kafka brought into the Apache Storm project. 
Another benefit worth considering is the licensing/legal oversight that the ASF 
provides, which is important to many users.

If this is something we want to do, then the big question becomes what sort of 
governance process needs to be established to ensure that such things are 
properly maintained.

Some random thoughts, questions, etc. that jump to mind include:

- What to call these things: “contrib modules”, “connectors”, “integration 
  modules”, etc.?
- Build integration: I imagine they would be a multi-module submodule of the 
  main Maven build, probably turned off by default and enabled by a Maven 
  profile.
- Governance: Have one or more committer volunteers responsible for 
  maintenance, merging patches, etc.? A proposal process for pulling in new 
  modules?

I look forward to hearing others’ opinions.

- Taylor


[1] https://issues.apache.org/jira/browse/STORM-206
[2] https://github.com/nathanmarz/storm-contrib
[3] https://github.com/stormprocessor
[4] https://github.com/nathanmarz/storm-contrib/tree/master/storm-kafka
[5] https://github.com/wurstmeister/storm-kafka-0.8-plus
