We solved this offset sync issue by making our topology idempotent (our
use case allowed that).
Our Storm topology consumes documents from Kafka, writes them to
Elasticsearch and inserts records into Cassandra.
The topology can re-consume from the beginning of the queue because the doc
IDs and primary keys are chosen so that each record is overwritten with the
same document.
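
A minimal sketch of the idea in Python, not the actual topology code. The
names doc_id, InMemoryStore and replay, and the "key"/"body" message fields,
are hypothetical; a plain dict stands in for Elasticsearch/Cassandra, and
hashing a stable source key is only one possible way to pick a
deterministic ID.

```python
import hashlib


def doc_id(source_key: str) -> str:
    """Derive a deterministic ID from a stable key carried by the message.

    The same input always yields the same ID, so a replayed message
    targets the same record instead of creating a new one.
    """
    return hashlib.sha1(source_key.encode("utf-8")).hexdigest()


class InMemoryStore:
    """Stand-in for an upsert-capable store (assumption: writes by ID overwrite)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, key: str, doc: str) -> None:
        # Writing to an existing key replaces the previous value.
        self.rows[key] = doc


def replay(store: InMemoryStore, messages) -> None:
    """Process messages; safe to run again from the start of the queue."""
    for msg in messages:
        store.upsert(doc_id(msg["key"]), msg["body"])


if __name__ == "__main__":
    store = InMemoryStore()
    messages = [{"key": "doc-1", "body": "hello"}, {"key": "doc-2", "body": "world"}]
    replay(store, messages)
    replay(store, messages)  # re-consuming from the beginning changes nothing
    print(len(store.rows))
```

With IDs derived this way, losing the committed offsets only costs
reprocessing time; it does not produce duplicate records.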

cheers, /Manish


On Thu, Dec 21, 2017 at 1:23 PM, Stig Rohde Døssing <[email protected]> wrote:

> Hi Nasron,
>
> I don't believe there's currently a tool to help you migrate. We did it
> manually by writing a small utility that looked up the commit offsets in
> Storm's Zookeeper, opened a KafkaConsumer with the new consumer group id
> and committed the offsets for the appropriate partitions. We stopped our
> topologies, used this utility and redeployed with the new spout.
>
> Assuming there isn't already a tool for migration floating around
> somewhere, I think we could probably build some migration support into the
> storm-kafka-client spout. If the path to the old offsets in Storm's
> Zookeeper is given, we might be able to extract them and start up the new
> spout from there.
>
> 2017-12-19 21:59 GMT+01:00 Nasron Cheong <[email protected]>:
>
>> Hi,
>>
>> I'm trying to determine steps for migration to the storm-kafka-client in
>> order to use the new kafka client.
>>
>> It's not quite clear to me how offsets are migrated - is there a specific
>> set of steps to ensure offsets are moved from the ZK based offsets into the
>> kafka based offsets?
>>
>> Or is the original configuration respected, and storm-kafka-client can
>> mostly be a drop in replacement?
>>
>> I want to avoid having spouts reset to the beginning of topics after
>> deployment, due to this change.
>>
>> Thanks.
>>
>> - Nasron
>>
>
>
