Damian Guy created KAFKA-5578:
---------------------------------

             Summary: Streams Task Assignor should consider the staleness of 
state directories when allocating tasks
                 Key: KAFKA-5578
                 URL: https://issues.apache.org/jira/browse/KAFKA-5578
             Project: Kafka
          Issue Type: Bug
            Reporter: Damian Guy


During task assignment we use the presence of a state directory to decide 
which instances take precedence for a task. We first choose the instances 
where the task was previously active, and then fall back to the existence of a 
state dir. Unfortunately we don't take into account how recent the data in the 
available state dirs is. So when a task has run on many instances, we may 
choose an instance that has relatively old data.

When doing task assignment we should take the age of the data in the state 
dirs into consideration. We could use the data from the checkpoint files to 
determine which instance is most up-to-date and attempt to assign accordingly 
(while making sure that tasks are still balanced across the available 
instances).
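
A rough sketch of the idea, not the actual assignor code: it assumes each 
instance reports, per task, a single checkpointed offset read from its 
checkpoint file, and it only covers the state-dir fallback step (the 
preference for previously active tasks is left out). The class name, method 
name, and the shape of the input map are hypothetical. The greedy pass below 
keeps the assignment balanced by capping each instance at ceil(tasks / 
instances), so an earlier task can take a slot that a later task would have 
preferred.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka Streams.
public class StalenessAwareAssignment {

    /**
     * @param tasks               task ids to assign
     * @param instances           available instance ids (must be non-empty)
     * @param checkpointedOffsets taskId -> (instanceId -> checkpointed offset);
     *                            an instance without a state dir for the task is absent
     * @return taskId -> chosen instanceId
     */
    public static Map<String, String> assign(List<String> tasks,
                                             List<String> instances,
                                             Map<String, Map<String, Long>> checkpointedOffsets) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("no instances available");
        }
        // Upper bound on tasks per instance, so the result stays balanced.
        int capacity = (tasks.size() + instances.size() - 1) / instances.size();

        Map<String, Integer> load = new HashMap<>();
        for (String instance : instances) {
            load.put(instance, 0);
        }

        Map<String, String> assignment = new LinkedHashMap<>();
        for (String task : tasks) {
            Map<String, Long> offsets =
                checkpointedOffsets.getOrDefault(task, Collections.emptyMap());
            String best = null;
            long bestOffset = -1L;
            for (String instance : instances) {
                if (load.get(instance) >= capacity) {
                    continue; // instance is full, skip it to keep the balance
                }
                // -1 means "no state dir for this task on this instance".
                long offset = offsets.getOrDefault(instance, -1L);
                boolean fresher = offset > bestOffset;
                boolean tieButLessLoaded = best != null && offset == bestOffset
                    && load.get(instance) < load.get(best);
                if (best == null || fresher || tieButLessLoaded) {
                    best = instance;
                    bestOffset = offset;
                }
            }
            assignment.put(task, best);
            load.merge(best, 1, Integer::sum);
        }
        return assignment;
    }
}
```

With two tasks and two instances, a task goes to the instance with the highest 
checkpointed offset unless that instance is already full, in which case it 
goes to the next most up-to-date instance that still has capacity.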



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
