Stig Rohde Døssing created STORM-2896:
-----------------------------------------

             Summary: Support automatic migration of offsets from storm-kafka to storm-kafka-client KafkaSpout
                 Key: STORM-2896
                 URL: https://issues.apache.org/jira/browse/STORM-2896
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-kafka-client
    Affects Versions: 2.0.0, 1.2.0
            Reporter: Stig Rohde Døssing


I think we can ease migration for people looking to move from storm-kafka to storm-kafka-client. We should be able to support migrating offsets from the old spout by setting some extra configuration in KafkaSpoutConfig, and by adding a new FirstPollOffsetStrategy (e.g. something like FirstPollOffsetStrategy.UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA).

The old spout stores offsets in Storm's ZooKeeper at one of two paths. To locate them we need two parameters from the storm-kafka SpoutConfig, namely zkRoot and id. In addition, we need to know whether the storm-kafka subscription was a wildcard subscription.

If the subscription was a wildcard, the ZooKeeper path for commit info is
{code}
zkRoot + "/" + id + "/" + topicName + "/partition_" + partition
{code}
Otherwise it is
{code}
zkRoot + "/" + id + "/" + "partition_" + partition
{code}

We can get topicName and partition numbers from KafkaConsumer.assignment. When we run initialize and the strategy is UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA, we should be able to read the old offset structure from ZooKeeper and seek the consumer to those offsets. We can just crash if the offsets are not present.

I'd be okay with this feature not being permanent, but I think it would make it a lot easier for people to move off the old spout.
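
To make the proposed lookup concrete, here is a minimal sketch in plain Java of how the old commit paths could be built and turned into seek targets. Everything in it is hypothetical: the class StormKafkaOffsetMigration, its method names, and the readOffset function (which stands in for the ZooKeeper read and the parsing of the old offset structure) do not exist in storm-kafka or storm-kafka-client. It also assumes that the topic name and "partition_" are joined by a "/" in the wildcard layout. In a real spout the assignment would come from KafkaConsumer.assignment() and the resolved offsets would be applied with KafkaConsumer.seek().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of storm-kafka or storm-kafka-client.
final class StormKafkaOffsetMigration {

    // Builds the ZooKeeper path where the old spout committed a partition's offset.
    // Assumption: the wildcard layout separates topic and "partition_" with "/".
    static String committedPath(String zkRoot, String id, String topic,
                                int partition, boolean wildcard) {
        String partitionId = wildcard
                ? topic + "/partition_" + partition
                : "partition_" + partition;
        return zkRoot + "/" + id + "/" + partitionId;
    }

    // Resolves one seek target per assigned partition, keyed as "topic-partition".
    // readOffset stands in for the ZooKeeper read; it returns null when no commit exists.
    static Map<String, Long> resolveSeekTargets(String zkRoot, String id, boolean wildcard,
                                                Map<String, List<Integer>> assignment,
                                                Function<String, Long> readOffset) {
        Map<String, Long> targets = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : assignment.entrySet()) {
            String topic = entry.getKey();
            for (int partition : entry.getValue()) {
                String path = committedPath(zkRoot, id, topic, partition, wildcard);
                Long offset = readOffset.apply(path);
                if (offset == null) {
                    // The issue proposes crashing rather than guessing a start position.
                    throw new IllegalStateException("No storm-kafka offset found at " + path);
                }
                targets.put(topic + "-" + partition, offset);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // An in-memory map plays the role of ZooKeeper in this sketch.
        Map<String, Long> fakeZk = new HashMap<>();
        fakeZk.put("/kafka-spout/my-spout/partition_0", 42L);
        fakeZk.put("/kafka-spout/my-spout/partition_1", 7L);

        Map<String, List<Integer>> assignment =
                Collections.singletonMap("events", Arrays.asList(0, 1));

        System.out.println(resolveSeekTargets(
                "/kafka-spout", "my-spout", false, assignment, fakeZk::get));
    }
}
```

Passing the ZooKeeper read in as a function keeps the path logic testable without a running ensemble, and throwing on a missing node follows the suggestion to crash when the old offsets are not present.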