Hunter L created HELIX-794:
------------------------------
Summary: TASK: Fix double-booking of tasks upon Participant disconnect
Key: HELIX-794
URL: https://issues.apache.org/jira/browse/HELIX-794
Project: Apache Helix
Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L
It has been observed in production that when a Participant has transient
connection issues, the Controller fails to honor the
maxNumberOfTasksPerInstance limit. That is, if the user wants only 1 task from
a job on an instance (limit is set to 1), Helix must assign at most 1 task to
that instance. After short Participant disconnects, however, we saw 2 tasks in
RUNNING at the same time.
The cause is an incorrect calculation of jobConfigLimitation in
AbstractTaskDispatcher. This change fixes it by using a Map
(assignedPartitions) to calculate the correct number of tasks to assign.
Changelist:
1. Modify an internal data structure (assignedPartitions)
2. Fix the logic that calculates the number of tasks to assign
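
The idea behind the fix can be illustrated with a minimal sketch. This is not
the actual AbstractTaskDispatcher code: the class TaskLimitSketch, the method
tasksToAssign, and the instance names are hypothetical, and the shape of
assignedPartitions (instance name mapped to the set of partition IDs currently
assigned to it) is an assumption made for illustration. The point is only that
the remaining capacity is derived from what is already assigned to the
instance, so a task still running across a short disconnect keeps counting
against the limit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; names and structure are assumptions,
// not the real Helix implementation.
final class TaskLimitSketch {

    // Returns how many more tasks may be assigned to the given instance.
    // assignedPartitions: instance name -> partition IDs already assigned there.
    // maxTasksPerInstance: the per-instance limit from the job configuration.
    public static int tasksToAssign(Map<String, Set<Integer>> assignedPartitions,
                                    String instance,
                                    int maxTasksPerInstance) {
        Set<Integer> assigned =
            assignedPartitions.getOrDefault(instance, Collections.emptySet());
        // Never return a negative number, even if the instance is over the limit.
        return Math.max(0, maxTasksPerInstance - assigned.size());
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> assignedPartitions = new HashMap<>();

        // Partition 0 is still RUNNING on instance_1 when it briefly disconnects.
        assignedPartitions.computeIfAbsent("instance_1", k -> new HashSet<>()).add(0);

        // With a limit of 1, instance_1 has no remaining capacity.
        System.out.println(tasksToAssign(assignedPartitions, "instance_1", 1)); // prints 0

        // An instance with nothing assigned can still take 1 task.
        System.out.println(tasksToAssign(assignedPartitions, "instance_2", 1)); // prints 1
    }
}
```

In this sketch the double-booking case corresponds to computing the capacity
without consulting assignedPartitions, which would report 1 free slot for
instance_1 after a reconnect even though partition 0 is still running there.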