I'd be okay with just creating the issue and opening PRs. Looking at the mailing list history, I think we've often made do with DISCUSS threads before making changes. The only non-release VOTE I can find from the last year was for introducing checkstyle.
2018-02-13 8:55 GMT+01:00 Jungtaek Lim <[email protected]>:
> I had forgotten about this one, but this is a great time to revisit it,
> since we have resolved many storm-kafka-client issues and it now feels
> fairly stable.
>
> I'm +1 on deprecating storm-kafka in the 1.x version lines and removing
> it in 2.0.0. Do we want an explicit VOTE to gather more opinions here?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, Jul 20, 2017 at 5:10 AM, P. Taylor Goetz <[email protected]> wrote:
>
> > +1 I’m fine with taking this approach.
> >
> > -Taylor
> >
> > > On Jul 19, 2017, at 2:04 PM, Stig Rohde Døssing <[email protected]>
> > > wrote:
> > >
> > > +1 for removing storm-kafka from master, since we shouldn't encourage
> > > people to use a component that won't work on new Kafka versions. As you
> > > both mentioned, the 1.x version of storm-kafka should still be usable on
> > > a 2.0 cluster, so it will still be available in case people need it. A
> > > wiki page for tracking current missing pieces for storm-kafka-client
> > > sounds good.
> > >
> > > 2017-07-19 19:09 GMT+02:00 Harsha <[email protected]>:
> > >
> > >> +1 on moving away from storm-kafka for Storm 2.0. For existing users we
> > >> can provide any critical bug fixes as part of 1.x releases. They can
> > >> still use the existing 1.x storm-kafka against 2.0. Since Kafka itself
> > >> is moving away from the older APIs, continuing two versions of the Kafka
> > >> connector doesn't make sense and honestly splits the usage, which
> > >> doesn't give us any feedback on the new storm-kafka-client.
> > >> Thanks,
> > >> Harsha
> > >>
> > >> On Wed, Jul 19, 2017, at 09:20 AM, Hugo Da Cruz Louro wrote:
> > >>> Hi,
> > >>>
> > >>> The goal of this email is to summarize and unify the discussion started
> > >>> across several email threads (Storm 2.0
> > >>> Roadmap<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>,
> > >>> 1.1.1 Release
> > >>> Planning<http://search-hadoop.com/m/Storm/8gnYyGagLDWv1qG?subj=Release+Planning+for+1+1+1+and+others+>,
> > >>> Lag
> > >>> Issues<http://search-hadoop.com/m/Storm/8gnYyLmjIjYr692?subj=Lag+issues+using+Storm+1+1+1+latest+build+with+StormKafkaClient+1+1+1+vs+old+StormKafka+spouts>)
> > >>> concerning the maintenance, branch support, and eventual deprecation of
> > >>> storm-kafka and storm-kafka-client.
> > >>>
> > >>> An earlier
> > >>> discussion<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
> > >>> proposed deprecating storm-kafka in favor of storm-kafka-client. To
> > >>> clarify, the idea is not to completely eliminate storm-kafka, but rather
> > >>> to keep supporting it in the 1.x-branch, while removing it from master
> > >>> (i.e. Storm 2.0 onwards). That is, storm-kafka-client will then become
> > >>> the only Storm Kafka option available for Storm 2.0 onwards, given that
> > >>> we have enough confidence in its stability by the time of the Storm 2.0
> > >>> release.
> > >>>
> > >>> The main reason for this proposal is the fact that the Kafka community
> > >>> agreed<https://cwiki.apache.org/confluence/display/KAFKA/KIP-109:+Old+Consumer+Deprecation>
> > >>> to deprecate the old consumer APIs starting in version 0.10.2, and will
> > >>> remove them in the next major version (0.12). This implies that
> > >>> storm-kafka will not work for Kafka 0.12 onwards.
> > >>> Important features missing in the old Kafka consumer are: security,
> > >>> new message format, and fetching offsets based on time stamp (KIP-79).
> > >>>
> > >>> In earlier discussions the Storm community has shown concerns about the
> > >>> performance and stability of the storm-kafka-client. Those concerns are
> > >>> valid and were mirrored by the Kafka community in their early
> > >>> deprecation discussions. I agree with what was said in the Kafka
> > >>> discussion<http://search-hadoop.com/m/Kafka/uyzND1e4bUP1Rjq721>: the
> > >>> storm-kafka-client has bugs, but so does storm-kafka, and all the
> > >>> development is currently going into storm-kafka-client, which will be
> > >>> even more the case once Kafka discontinues the old consumer APIs. The
> > >>> only way to stabilize a complex component such as storm-kafka-client is
> > >>> to test it extensively in all its variants, which inevitably requires
> > >>> users running it. Furthermore, removing storm-kafka from Storm 2.0 does
> > >>> not prevent users from still referring to storm-kafka version 1.x in
> > >>> their topologies.
> > >>>
> > >>> I did a quick analysis of the JIRA issues for storm-kafka and
> > >>> storm-kafka-client [1]. As of July 11 there are 22 open or in-progress
> > >>> bugs for storm-kafka (1 blocker) and 15 for storm-kafka-client.
> > >>>
> > >>> The recent refactoring around manual partition assignment should solve a
> > >>> lot of edge case bugs that occurred during rebalance. There are also a
> > >>> few open pull requests for Trident and for fixing some internal state
> > >>> details such as maxUncommittedOffsets, topic compaction, etc.
> > >>> Nevertheless, there are several areas that need to be addressed to
> > >>> stabilize and improve storm-kafka-client.
> > >>> Similarly to what was done for Storm SQL, I suggest that we create a
> > >>> wiki page where we can centralize some points of action such as:
> > >>>
> > >>> Features / Stability
> > >>> * Memory Footprint
> > >>> * Retry Mechanism
> > >>> * Exactly once and at least once guarantees
> > >>> * Kafka Lag
> > >>> * Metrics
> > >>> * Spout Internals (e.g. maxUncommittedOffsets, ack, emitted, failed,
> > >>>   ...)
> > >>> * Autocommit mode
> > >>>
> > >>> Performance
> > >>> * Run performance benchmarks
> > >>>
> > >>> Integration Testing
> > >>> * Test for exactly once in non failure scenarios (e.g.
> > >>>   activate/deactivate)
> > >>> * Test for at least once in failure scenarios
> > >>> * Test Trident guarantees
> > >>>
> > >>> Unit Testing
> > >>> * Identify unit test coverage and find a modular way to continually add
> > >>>   new tests
> > >>>
> > >>> Trident
> > >>> * Pull request<https://github.com/apache/storm/pull/2174> for review
> > >>>
> > >>> API
> > >>> * Investigate gaps in the API between storm-kafka and
> > >>>   storm-kafka-client.
> > >>> * Can we discontinue the old API?
> > >>>
> > >>> Documentation
> > >>> * Check for accuracy and completeness of documentation
> > >>> * Make clean code snippets with examples available
> > >>>
> > >>> [1] - The data was extracted from JIRA on 07/11/2017. The
> > >>> storm-kafka-client JIRAs were checked for correctness of component
> > >>> label, and had their status updated. None of that was done for the
> > >>> storm-kafka JIRAs, therefore some of its issues marked as open may
> > >>> already have been fixed. The results and charts can be found here:
> > >>> * storm-kafka-jiras<https://docs.google.com/spreadsheets/d/1pdqAKDtqfhPrfgFxnQa4bSrKP1YBdMyuGzqr3gLzcMA/edit?usp=sharing>
> > >>> * storm-kafka-client-jiras<https://docs.google.com/spreadsheets/d/12g0HLz4pgODMVVOmzvti1nzLOa6iygmk8pyTOv8op1c/edit?usp=sharing>
