Stig Rohde Døssing created STORM-2896:
-----------------------------------------

             Summary: Support automatic migration of offsets from storm-kafka 
to storm-kafka-client KafkaSpout
                 Key: STORM-2896
                 URL: https://issues.apache.org/jira/browse/STORM-2896
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-kafka-client
    Affects Versions: 2.0.0, 1.2.0
            Reporter: Stig Rohde Døssing


I think we can ease migration for people looking to move from storm-kafka to 
storm-kafka-client. We should be able to support migrating offsets from the old 
spout by setting some extra configuration in KafkaSpoutConfig, and by adding a 
new FirstPollOffsetStrategy (e.g. something like 
FirstPollOffsetStrategy.UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA).

The old spout stores offsets in Storm's Zookeeper at one of two paths. The 
storm-kafka SpoutConfig has two parameters we'll also need, namely zkRoot and 
id. In addition we need to know if the storm-kafka subscription was a wildcard 
subscription or not.

The zookeeper path for commit info is 
{code}
zkRoot + "/" + id + "/" + topicName + "/partition_" + partition
{code}
if the subscription was a wildcard. Otherwise it is 
{code}
zkRoot + "/" + id + "/" + "partition_" + partition
{code}
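
As a rough sketch, the two path layouts above could be built by a small helper like the one below. The class and method names are hypothetical and not part of storm-kafka or storm-kafka-client. The sketch also assumes that the topic name and the "partition_" prefix are separated by a "/" in the wildcard case, which should be checked against the old spout's code.

```java
/** Hypothetical helper; the names here are illustrative and not part of storm-kafka. */
final class OldSpoutCommitPaths {
    private OldSpoutCommitPaths() {}

    /**
     * Builds the ZooKeeper path holding the old spout's commit info for one partition.
     *
     * @param zkRoot    SpoutConfig.zkRoot of the old spout
     * @param id        SpoutConfig.id of the old spout
     * @param topicName topic of the partition, only used for wildcard subscriptions
     * @param partition partition number
     * @param wildcard  whether the old storm-kafka subscription was a wildcard subscription
     */
    static String commitPath(String zkRoot, String id, String topicName,
                             int partition, boolean wildcard) {
        String base = zkRoot + "/" + id + "/";
        if (wildcard) {
            // Assumption: a "/" separates the topic name from the partition node.
            return base + topicName + "/partition_" + partition;
        }
        return base + "partition_" + partition;
    }
}
```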

We can get topicName and partition numbers from the KafkaConsumer.assignment. 
When we run initialize, we should be able to read the old offset structure from 
Zookeeper when the strategy is UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA, and seek 
the consumer to those offsets. We can just crash if the offsets are not present.
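
To make the intended flow concrete, here is a minimal dependency-free sketch of it. Everything in it is a stand-in: Assigned plays the role of Kafka's TopicPartition, Seeker stands for the consumer's seek operation, and readOffset is a hypothetical function that reads the node at a path and extracts the offset from the old offset structure (the parsing itself is left out). It also assumes a "/" between the topic name and "partition_" for wildcard subscriptions.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of the migration step; not actual storm-kafka-client code. */
final class OffsetMigrationSketch {
    /** Stand-in for Kafka's TopicPartition so the sketch has no dependencies. */
    static final class Assigned {
        final String topic;
        final int partition;

        Assigned(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }
    }

    /** Stand-in for the single consumer operation the migration needs. */
    interface Seeker {
        void seek(Assigned partition, long offset);
    }

    /**
     * For every assigned partition, looks up the old spout's offset and seeks to it.
     * Throws if any partition has no stored offset, i.e. "just crash".
     */
    static void migrate(String zkRoot, String id, boolean wildcard,
                        List<Assigned> assignment,
                        Function<String, Optional<Long>> readOffset,
                        Seeker seeker) {
        for (Assigned tp : assignment) {
            // Assumption: "/" separates the topic name from the partition node.
            String path = zkRoot + "/" + id + "/"
                + (wildcard ? tp.topic + "/" : "")
                + "partition_" + tp.partition;
            long offset = readOffset.apply(path).orElseThrow(
                () -> new IllegalStateException("No storm-kafka offset found at " + path));
            seeker.seek(tp, offset);
        }
    }
}
```

Passing the ZooKeeper read and the seek in as functions keeps the sketch testable without a running cluster. In the real spout these would presumably be backed by a ZooKeeper client and the KafkaConsumer.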

I'd be okay with this feature not being permanent, but I think it would make 
it a lot easier for people to move off the old spout.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
