Damian Guy created KAFKA-5578:
---------------------------------
Summary: Streams Task Assignor should consider the staleness of
state directories when allocating tasks
Key: KAFKA-5578
URL: https://issues.apache.org/jira/browse/KAFKA-5578
Project: Kafka
Issue Type: Bug
Reporter: Damian Guy
During task assignment we use the presence of a state directory to decide
which instances take precedence for a task. We first prefer the instance on
which the task was previously active, and then fall back to any instance that
has a state dir for the task. Unfortunately we don't take into account how
recent the data in those state dirs is. So when a task has run on many
instances, we may choose an instance that has relatively old data.
When doing task assignment we should take into consideration the age of the
data in the state dirs. We could use the data from the checkpoint files to
determine which instance is most up-to-date and attempt to assign accordingly
(while still making sure that tasks are balanced across the available
instances).
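
As a rough illustration of the idea, and not the actual Streams assignor code,
the sketch below picks, for each task, the instance with the highest
checkpointed offset, subject to a per-instance cap that keeps the assignment
balanced. The class name, the method signature, and the assumption that each
instance reports a single summed checkpoint offset per task are all
hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names exist in Kafka Streams.
final class StalenessAwareAssignor {

    /**
     * @param tasks               task ids to assign
     * @param instances           available instance ids (must be non-empty)
     * @param checkpointedOffsets taskId -> (instanceId -> summed checkpoint offset);
     *                            an instance missing from the inner map is assumed
     *                            to have no state dir for that task
     * @return taskId -> chosen instanceId
     */
    static Map<String, String> assign(List<String> tasks,
                                      List<String> instances,
                                      Map<String, Map<String, Long>> checkpointedOffsets) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("no instances available");
        }
        // Balance constraint: no instance gets more than ceil(tasks / instances).
        int cap = (tasks.size() + instances.size() - 1) / instances.size();
        Map<String, Integer> load = new HashMap<>();
        Map<String, String> assignment = new LinkedHashMap<>();

        for (String task : tasks) {
            Map<String, Long> offsets =
                checkpointedOffsets.getOrDefault(task, Collections.emptyMap());
            String best = null;
            long bestOffset = -1L;
            int bestLoad = Integer.MAX_VALUE;

            for (String instance : instances) {
                int currentLoad = load.getOrDefault(instance, 0);
                if (currentLoad >= cap) {
                    continue; // instance is full, skip it to stay balanced
                }
                // -1 marks "no state dir", so any checkpointed state wins over none.
                long offset = offsets.getOrDefault(instance, -1L);
                boolean fresher = offset > bestOffset;
                boolean sameButLighter = offset == bestOffset && currentLoad < bestLoad;
                if (best == null || fresher || sameButLighter) {
                    best = instance;
                    bestOffset = offset;
                    bestLoad = currentLoad;
                }
            }
            // best is never null here: fewer than cap * instances tasks are
            // assigned so far, so at least one instance is below the cap.
            assignment.put(task, best);
            load.merge(best, 1, Integer::sum);
        }
        return assignment;
    }
}
```

For example, with tasks 0_0 and 0_1, instances A and B, and instance B holding
the higher offset for both tasks, this sketch gives 0_0 to B and then gives
0_1 to A, because B has already reached its cap of one task. A real
implementation would also need to keep the existing preference for previous
active tasks and to handle standby tasks, which this sketch ignores.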
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)