Tobias Pfeiffer created SPARK-2383:
--------------------------------------
Summary: With auto.offset.reset, KafkaReceiver potentially deletes
Consumer nodes from Zookeeper
Key: SPARK-2383
URL: https://issues.apache.org/jira/browse/SPARK-2383
Project: Spark
Issue Type: Bug
Components: Streaming
Reporter: Tobias Pfeiffer
When auto.offset.reset is set in the Kafka configuration,
{{KafkaReceiver}}'s {{tryZookeeperConsumerGroupCleanup()}} deletes the
whole /consumers/<groupId> tree in Zookeeper before creating consumer nodes. If
consumer nodes are already present (which can happen when multiple
KafkaReceivers in the same consumer group are launched), they are deleted as
well, leading to subsequent NoNode exceptions, for example on rebalance.
There should be a check before the delete, such as {{if (zk.countChildren(dir +
"/ids") == 0) ...}} (ideally performed atomically), to avoid deleting
existing consumer nodes.
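
A minimal sketch of the proposed guard, written in Java against an in-memory stand-in so that it runs without a Zookeeper server. The names {{ZkOps}}, {{FakeZk}} and {{cleanupIfUnused}} are hypothetical and not part of Spark. The two client methods mirror the {{countChildren}} and {{deleteRecursive}} calls assumed to be available on the real client, and the {{/consumers/<groupId>}} path assumes Kafka's usual Zookeeper layout. The check and the delete are two separate calls here, so this version is not atomic.

```java
import java.util.TreeSet;

final class ConsumerGroupCleanup {

    /** Hypothetical minimal view of a Zookeeper client: only the two calls the guard needs. */
    interface ZkOps {
        int countChildren(String path);        // assumed to return 0 if the path does not exist
        boolean deleteRecursive(String path);
    }

    /**
     * Deletes /consumers/<groupId> only when no consumer is registered under its
     * /ids node. Returns true if something was deleted.
     * Not atomic: a consumer could still register between the check and the delete.
     */
    static boolean cleanupIfUnused(ZkOps zk, String groupId) {
        String dir = "/consumers/" + groupId;  // assumed Kafka layout
        if (zk.countChildren(dir + "/ids") == 0) {
            return zk.deleteRecursive(dir);
        }
        return false;
    }

    /** In-memory stand-in for Zookeeper, for illustration only. */
    static final class FakeZk implements ZkOps {
        final TreeSet<String> nodes = new TreeSet<>();

        void create(String path) {
            nodes.add(path);
        }

        @Override
        public int countChildren(String path) {
            String prefix = path + "/";
            int count = 0;
            for (String node : nodes) {
                // Direct children only: no further '/' after the prefix.
                if (node.startsWith(prefix) && node.indexOf('/', prefix.length()) < 0) {
                    count++;
                }
            }
            return count;
        }

        @Override
        public boolean deleteRecursive(String path) {
            return nodes.removeIf(n -> n.equals(path) || n.startsWith(path + "/"));
        }
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        zk.create("/consumers/g1");
        zk.create("/consumers/g1/ids");
        zk.create("/consumers/g1/ids/receiver-0");  // an already registered consumer

        // A consumer is registered, so the group tree is left alone.
        System.out.println("deleted with live consumer: " + cleanupIfUnused(zk, "g1"));

        // Once the registration is gone, the stale tree can be removed.
        zk.deleteRecursive("/consumers/g1/ids/receiver-0");
        System.out.println("deleted when unused: " + cleanupIfUnused(zk, "g1"));
    }
}
```

With this guard, a second receiver starting in the same consumer group would find the first receiver's node under {{/ids}} and skip the cleanup. The race between the check and the delete remains, which is why the atomic variant mentioned above would be preferable.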
(Also note that the behavior of auto.offset.reset as realized by Spark's Kafka
receiver differs from the behavior defined in Kafka's documentation.)