chihsuan opened a new pull request, #10633:
URL: https://github.com/apache/ozone/pull/10633

   ## What changes were proposed in this pull request?
   
   **Problem.** During a streaming write, the pipeline must be ordered so the 
nearest datanode is the primary. Today OM sends the client address to SCM on 
every `allocateBlock` and SCM does the sort, which puts extra work on SCM's 
block-allocation hot path even though OM already caches the cluster topology 
(HDDS-9343) and sorts reads locally.
   
   **Fix.** Sort the streaming-write pipeline on OM instead of SCM, mirroring 
the read path:
   
   - `OMKeyRequest.allocateBlock` now passes an empty `clientMachine` to SCM 
(so SCM skips sorting) and sorts each returned pipeline locally via a new 
`KeyManager.sortDatanodesForWrite`, reusing OM's cached cluster map and 
existing client-resolution helpers.
   - Unlike the read sort, the order is preserved (no shuffle) when the client 
is empty or unresolved, since the first node is the write primary.
   - The sorted result is cached per pipeline, so multiple blocks sharing one 
pipeline are sorted once.
   - `SCMBlockProtocolServer` is left unchanged for rolling-upgrade safety: an 
old OM still sends the address and gets SCM-side sorting, while a new OM's 
empty address is a no-op for SCM. No Protobuf/RPC change. Removing the 
now-redundant SCM-side sort is left to a follow-up.
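
The behaviour described in the bullets above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the class `WriteSortSketch`, the method `sortForWrite`, the string-based node and pipeline IDs, and the `distance` function are all hypothetical stand-ins for OM's cached cluster map and client-resolution helpers. It assumes a null or empty client string represents the "empty or unresolved" case and that the cache lives for a single request, so the client is the same for every lookup.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntBiFunction;

/** Hypothetical sketch of an OM-side write sort; not the Ozone KeyManager API. */
final class WriteSortSketch {
  // Hypothetical stand-in for topology distance: (client, datanode) -> cost.
  private final ToIntBiFunction<String, String> distance;
  // Sorted node lists keyed by pipeline ID; assumed to be scoped to one request.
  private final Map<String, List<String>> sortedByPipeline = new HashMap<>();
  // Counts real sorts so the caching effect is observable.
  private int sortCount = 0;

  WriteSortSketch(ToIntBiFunction<String, String> distance) {
    this.distance = distance;
  }

  List<String> sortForWrite(String pipelineId, List<String> nodes, String client) {
    // Empty or unresolved client: keep the original order (no shuffle),
    // because the first node is the write primary.
    if (client == null || client.isEmpty()) {
      return nodes;
    }
    // Blocks that share a pipeline reuse the first sorted result.
    return sortedByPipeline.computeIfAbsent(pipelineId, id -> {
      sortCount++;
      List<String> sorted = new ArrayList<>(nodes);
      // List.sort is stable, so equidistant nodes keep their relative order.
      sorted.sort(Comparator.comparingInt(dn -> distance.applyAsInt(client, dn)));
      return sorted;
    });
  }

  int getSortCount() {
    return sortCount;
  }
}
```

Keying the cache by pipeline ID alone is only safe here because of the single-request assumption; a longer-lived cache would also need the client in the key.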
   
   Reviewer notes:
   
   - SCM's `ALLOCATE_BLOCK` audit no longer logs the client address for 
OM-originated writes; the authoritative per-client audit stays at OM.
   - Topology-aware write ordering now requires OM's network-topology / 
dns-to-switch config (the same prerequisite reads have had since HDDS-9343). 
Without it, pipelines keep their original order rather than being misordered.
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-15059
   
   ## How was this patch tested?
   
   - Unit (`TestOMAllocateBlockRequest`): SCM receives an empty `clientMachine` 
even when sorting is requested; blocks sharing a pipeline are sorted once.
   - Integration (`TestOMSortDatanodes`): nearest datanode is first for writes; 
order is preserved (no shuffle) for an unresolved or empty client.
   - Fork `build-branch` CI: 
https://github.com/chihsuan/ozone/actions/runs/28383627826
   
   Generated-by: Claude Code (Claude Opus 4.8)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


