[
https://issues.apache.org/jira/browse/STORM-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033296#comment-16033296
]
Stig Rohde Døssing commented on STORM-2538:
-------------------------------------------
[~pshah] Hi Priyank. I'm wondering if you'd be willing to try out a workaround
for this issue? Since the issue is caused by Kafka rebalances, I think you can
resolve this by switching to this Subscription implementation
https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/ManualPartitionNamedSubscription.java.
You can specify this class when creating your KafkaSpoutConfig builder (e.g.
https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java#L136).
Switching to the manual Subscription should prevent rebalances from occurring
unless you change the number of partitions for the topic(s) you're consuming
from.
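
As a rough illustration of why manual assignment avoids rebalances: each spout
task can compute its own share of the partitions from its task index and the
total number of tasks, so no consumer-group coordination is needed. The sketch
below assumes a simple round-robin split over the sorted partition list. It is
not the storm-kafka-client code, and the class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not part of storm-kafka-client.
class ManualAssignmentSketch {

    // Returns the partitions owned by the task at taskIndex, assuming a
    // round-robin split of the sorted partition list across totalTasks tasks.
    static List<Integer> partitionsForTask(List<Integer> allPartitions,
                                           int taskIndex, int totalTasks) {
        // Sort a copy so every task sees the partitions in the same order.
        List<Integer> sorted = new ArrayList<>(allPartitions);
        Collections.sort(sorted);

        List<Integer> mine = new ArrayList<>();
        for (int i = taskIndex; i < sorted.size(); i += totalTasks) {
            mine.add(sorted.get(i));
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Integer> partitions = Arrays.asList(3, 1, 0, 2, 4);
        int totalTasks = 2;
        for (int task = 0; task < totalTasks; task++) {
            System.out.println("task " + task + " -> "
                    + partitionsForTask(partitions, task, totalTasks));
        }
    }
}
```

With a split like this, the result for a given task only changes when the
partition list or the number of tasks changes, which matches the behaviour
described above.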
> New kafka spout emits duplicate tuples
> --------------------------------------
>
> Key: STORM-2538
> URL: https://issues.apache.org/jira/browse/STORM-2538
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka-client
> Affects Versions: 2.0.0, 1.x
> Reporter: Priyank Shah
> Assignee: Hugo Louro
> Priority: Critical
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> Currently, KafkaSpout in storm-kafka-client can emit duplicate tuples. The
> reason is that Kafka calls the ConsumerRebalanceListener implementation
> every time a new executor comes up. However, when onPartitionsRevoked is
> called we already have some tuples in flight, and the same tuples are then
> emitted again by the new executor on which onPartitionsAssigned was called.
> We need to make sure that we emit only one tuple per Kafka message.