Ewen Cheslack-Postava created KAFKA-2894:
--------------------------------------------

             Summary: WorkerSinkTask doesn't handle rewinding offsets on rebalance
                 Key: KAFKA-2894
                 URL: https://issues.apache.org/jira/browse/KAFKA-2894
             Project: Kafka
          Issue Type: Bug
          Components: copycat
    Affects Versions: 0.9.0.0
            Reporter: Ewen Cheslack-Postava
            Assignee: Ewen Cheslack-Postava


rewind() is only invoked at the beginning of each poll(). This means that if a 
rebalance occurs during the poll, it's possible to get data that doesn't match 
a request to change offsets made during the rebalance. I think the consumer 
will hold on to consumer data across the rebalance if it is reassigned the same 
offset, so there may already be data ready to be delivered. Additionally, we 
may already have data in an incomplete messageBatch that should be discarded 
when the rewind is requested.

While connectors that care about this (i.e. ones that manage their own offsets) 
can handle this correctly by tracking the offsets they're expecting to see, 
it's a hassle, error-prone, and pretty unintuitive.
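
For illustration, here is a rough sketch of the kind of bookkeeping a 
connector would need to do today. It is not code from Kafka or Copycat: 
ExpectedOffsetTracker and Rec are hypothetical names, Rec is a stand-in for a 
sink record, and the sketch assumes offsets within a partition are contiguous 
(which would not hold for a compacted topic).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Kafka or Copycat. */
class ExpectedOffsetTracker {

    /** Minimal stand-in for a sink record; only the fields needed here. */
    static final class Rec {
        final String topic;
        final int partition;
        final long offset;

        Rec(String topic, int partition, long offset) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
        }
    }

    // Next offset the connector expects to see, keyed by "topic-partition".
    private final Map<String, Long> expected = new HashMap<>();

    private static String key(String topic, int partition) {
        return topic + "-" + partition;
    }

    /** Record the offset the connector asked the framework to rewind to. */
    void expect(String topic, int partition, long offset) {
        expected.put(key(topic, partition), offset);
    }

    /**
     * Keep only records that arrive at the expected offset and drop the rest,
     * e.g. data buffered before the rewind took effect.
     * Assumes contiguous offsets within a partition.
     */
    List<Rec> filter(List<Rec> batch) {
        List<Rec> accepted = new ArrayList<>();
        for (Rec r : batch) {
            String k = key(r.topic, r.partition);
            Long want = expected.get(k);
            // With no expectation yet, accept the record and start tracking.
            if (want == null || r.offset == want) {
                accepted.add(r);
                expected.put(k, r.offset + 1);
            }
        }
        return accepted;
    }
}
```

A connector would call expect() whenever it requests a rewind and pass every 
delivered batch through filter() before processing it. Having to repeat this 
in each connector is the hassle described above.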



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
