Natea Eshetu Beshada created FLINK-39025:
--------------------------------------------
Summary: Add HashMap index for InstanceID lookups in ResourceManager
Key: FLINK-39025
URL: https://issues.apache.org/jira/browse/FLINK-39025
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Reporter: Natea Eshetu Beshada
Assignee: Natea Eshetu Beshada
ResourceManager.getWorkerByInstanceId() currently performs an O(n) linear scan
through all registered TaskExecutors:
{code:java}
protected WorkerType getWorkerByInstanceId(InstanceID instanceId) {
    WorkerType worker = null;
    // TODO: Improve performance by having an index on the instanceId
    for (Map.Entry<ResourceID, WorkerRegistration<WorkerType>> entry :
            taskExecutors.entrySet()) {
        if (entry.getValue().getInstanceID().equals(instanceId)) {
            worker = entry.getValue().getWorker();
            break;
        }
    }
    return worker;
}
{code}
There is an existing TODO in the code acknowledging this.
Proposed Change:
Add a secondary index Map<InstanceID, WorkerRegistration<WorkerType>>
maintained alongside the existing taskExecutors map:
- Populate on TaskExecutor registration
- Remove on TaskExecutor deregistration
- Replace the linear scan with an expected O(1) HashMap lookup
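
A minimal, self-contained sketch of the proposed bookkeeping follows. It is not the actual ResourceManager code: InstanceIndexSketch, Registration, register() and deregister() are hypothetical stand-ins, and plain Strings replace ResourceID and InstanceID so the example compiles without Flink on the classpath. It assumes a re-registration under the same resource ID should evict the stale index entry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the ResourceManager's TaskExecutor bookkeeping (not Flink code). */
class InstanceIndexSketch<W> {

    /** Hypothetical stand-in for WorkerRegistration: pairs an instance ID with its worker. */
    static final class Registration<W> {
        final String instanceId;
        final W worker;

        Registration(String instanceId, W worker) {
            this.instanceId = instanceId;
            this.worker = worker;
        }
    }

    // Primary map keyed by resource ID (plays the role of taskExecutors).
    private final Map<String, Registration<W>> taskExecutors = new HashMap<>();
    // Secondary index keyed by instance ID.
    private final Map<String, Registration<W>> byInstanceId = new HashMap<>();

    /** Populates both maps; a re-registration drops the previous instance ID from the index. */
    void register(String resourceId, String instanceId, W worker) {
        Registration<W> registration = new Registration<>(instanceId, worker);
        Registration<W> previous = taskExecutors.put(resourceId, registration);
        if (previous != null) {
            byInstanceId.remove(previous.instanceId);
        }
        byInstanceId.put(instanceId, registration);
    }

    /** Removes the entry from both maps so the index cannot drift from the primary map. */
    void deregister(String resourceId) {
        Registration<W> removed = taskExecutors.remove(resourceId);
        if (removed != null) {
            byInstanceId.remove(removed.instanceId);
        }
    }

    /** Hash lookup instead of a scan; returns null for an unknown ID, like the original loop. */
    W getWorkerByInstanceId(String instanceId) {
        Registration<W> registration = byInstanceId.get(instanceId);
        return registration == null ? null : registration.worker;
    }
}
```

The point of the sketch is that every mutation of the primary map updates the index in the same method, so the two maps stay consistent as long as all writes go through those methods.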
Impact:
- Improves scheduling performance in large clusters (1000+ TaskManagers)
- Minimal memory overhead (one additional map entry per registered TaskExecutor)
- No API changes, no user-facing impact