I'd be okay with just creating the issue and opening PRs. Looking at the
mailing list history, I think we've often made do with DISCUSS threads
before making changes. The only non-release VOTE I can find from
the last year was for introducing checkstyle.

2018-02-13 8:55 GMT+01:00 Jungtaek Lim <kabh...@gmail.com>:

> I had forgotten about this one, but this is a great time to revisit it, since
> we have resolved many storm-kafka-client issues and it now feels fairly stable.
>
> I'm +1 on deprecating storm-kafka in the 1.x version lines and removing it in
> 2.0.0. Do we want to have an explicit VOTE to receive more opinions here?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, Jul 20, 2017 at 5:10 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>
> > +1 I’m fine with taking this approach.
> >
> > -Taylor
> >
> > > On Jul 19, 2017, at 2:04 PM, Stig Rohde Døssing <stigdoess...@gmail.com> wrote:
> > >
> > > > +1 for removing storm-kafka from master, since we shouldn't encourage
> > > > people to use a component that won't work on new Kafka versions. As you
> > > > both mentioned, the 1.x version of storm-kafka should still be usable on a
> > > > 2.0 cluster, so it will still be available in case people need it. A wiki
> > > > page for tracking current missing pieces for storm-kafka-client sounds good.
> > >
> > > 2017-07-19 19:09 GMT+02:00 Harsha <st...@harsha.io>:
> > >
> > > >> +1 on moving away from storm-kafka for Storm 2.0. For existing users we
> > > >> can provide any critical bug fixes as part of 1.x releases. They can
> > > >> still use the existing 1.x storm-kafka against 2.0. Since Kafka itself
> > > >> is moving away from the older APIs, continuing two versions of the Kafka
> > > >> connector doesn't make sense and honestly splits the usage, which doesn't
> > > >> give us any feedback on the new storm-kafka-client.
> > >> Thanks,
> > >> Harsha
> > >>
> > >> On Wed, Jul 19, 2017, at 09:20 AM, Hugo Da Cruz Louro wrote:
> > >>> Hi,
> > >>>
> > > >>> The goal of this email is to summarize and unify the discussion started
> > > >>> across several email threads (Storm 2.0
> > > >>> Roadmap<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>,
> > > >>> 1.1.1 Release
> > > >>> Planning<http://search-hadoop.com/m/Storm/8gnYyGagLDWv1qG?subj=Release+Planning+for+1+1+1+and+others+>,
> > > >>> Lag
> > > >>> Issues<http://search-hadoop.com/m/Storm/8gnYyLmjIjYr692?subj=Lag+issues+using+Storm+1+1+1+latest+build+with+StormKafkaClient+1+1+1+vs+old+StormKafka+spouts>)
> > > >>> concerning the maintenance, branch support, and eventual deprecation of
> > > >>> storm-kafka and storm-kafka-client.
> > >>>
> > > >>> An earlier
> > > >>> discussion<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
> > > >>> proposed deprecating storm-kafka in favor of storm-kafka-client. To
> > > >>> clarify, the idea is not to completely eliminate storm-kafka, but rather
> > > >>> to keep supporting it in the 1.x-branch, while removing it from master
> > > >>> (i.e. Storm 2.0 onwards). That is, storm-kafka-client will then become
> > > >>> the only Storm Kafka option available for Storm 2.0 onwards, provided
> > > >>> that we have enough confidence in its stability by the time of the
> > > >>> Storm 2.0 release.
> > >>>
> > > >>> The main reason for this proposal is the fact that the Kafka community
> > > >>> agreed<https://cwiki.apache.org/confluence/display/KAFKA/KIP-109:+Old+Consumer+Deprecation>
> > > >>> to deprecate the old consumer APIs starting in version 0.10.2, and will
> > > >>> remove them in the next major version (0.12). This implies that
> > > >>> storm-kafka will not work for Kafka 0.12 onwards. Important features
> > > >>> missing in the old Kafka consumer are: security, new message format, and
> > > >>> fetching offsets based on time stamp (KIP-79).
> > >>>
> > > >>> In earlier discussions the Storm community has shown concerns about the
> > > >>> performance and stability of the storm-kafka-client. Those concerns are
> > > >>> valid and were mirrored by the Kafka community in their early deprecation
> > > >>> discussions. I align with what was said in the Kafka
> > > >>> discussion<http://search-hadoop.com/m/Kafka/uyzND1e4bUP1Rjq721>: the
> > > >>> storm-kafka-client has bugs, but so does storm-kafka, and all the
> > > >>> development is currently going into storm-kafka-client, which will be
> > > >>> even more prevalent in the face of Kafka discontinuing the old consumer
> > > >>> APIs. The only way to stabilize a complex component such as
> > > >>> storm-kafka-client is to test it extensively in all its variants, which
> > > >>> inevitably comes from users using it. Furthermore, removing storm-kafka
> > > >>> from Storm 2.0 does not prevent users from still referring to storm-kafka
> > > >>> version 1.x in their topologies.
> > >>>
> > > >>> I did a quick analysis of the JIRA issues for storm-kafka and
> > > >>> storm-kafka-client [1]. As of July 11 there are 22 open or in-progress
> > > >>> bugs for storm-kafka (1 blocker) and 15 for storm-kafka-client.
> > >>>
> > > >>> The recent refactoring around manual partition assignment should solve a
> > > >>> lot of edge case bugs that occurred during rebalance. There are also a
> > > >>> few open pull requests for Trident and for fixing some internal state
> > > >>> details such as maxUncommittedOffsets, topic compaction, etc.
> > > >>> Nevertheless, there are several areas that need to be addressed to
> > > >>> stabilize and improve storm-kafka-client. Similar to what was done for
> > > >>> Storm SQL, I suggest that we create a wiki page where we can centralize
> > > >>> some points of action such as:
> > >>>
> > >>> Features / Stability
> > >>> * Memory Footprint
> > >>> * Retrial Mechanism
> > >>> * Exactly once and at least once guarantees
> > >>> * Kafka Lag
> > >>> * Metrics
> > >>> * Spout Internals (e.g. maxUncommittedOffsets, ack, emitted, failed,
> > >>> ...)
> > >>> * Autocommit mode
> > >>>
> > > >>> Performance
> > >>> * Run performance benchmarks
> > >>>
> > >>> Integration Testing
> > >>> * Test for exactly once in non failure scenarios (e.g.
> > >>> activate/deactivate)
> > >>> * Test for at least once in failure scenarios
> > >>> * Test Trident guarantees
> > >>>
> > >>> Unit Testing
> > > >>> * Identify unit test coverage and find a modular way to continually add
> > > >>> new tests
> > >>>
> > >>> Trident
> > > >>>  * Pull request<https://github.com/apache/storm/pull/2174> for review
> > >>>
> > >>> API
> > > >>>  * Investigate gaps in the API between storm-kafka and
> > > >>>  storm-kafka-client.
> > > >>>  * Can we discontinue the old API?
> > >>>
> > >>> Documentation
> > >>>  * Check for accuracy and completeness of documentation
> > >>>  * Make clean code snippets with examples available
> > >>>
> > > >>> [1] - The data was extracted from JIRA on 07/11/2017. The
> > > >>> storm-kafka-client JIRAs were checked for correctness of component label,
> > > >>> and had their status updated. None of that was done for the storm-kafka
> > > >>> JIRAs, therefore some of its issues marked as open may already have been
> > > >>> fixed. The results and charts can be found here:
> > > >>>    * storm-kafka-jiras<https://docs.google.com/spreadsheets/d/1pdqAKDtqfhPrfgFxnQa4bSrKP1YBdMyuGzqr3gLzcMA/edit?usp=sharing>
> > > >>>    * storm-kafka-client-jiras<https://docs.google.com/spreadsheets/d/12g0HLz4pgODMVVOmzvti1nzLOa6iygmk8pyTOv8op1c/edit?usp=sharing>
> > >>
> >
> >
>
