[jira] [Created] (HDDS-15643) ECFileChecksumHelper: redundant OM lookupKey RPC and per-file gRPC connection creation for EC checksum collection

Andrey Yarovoy (Jira) Mon, 22 Jun 2026 13:22:45 -0700

Andrey Yarovoy created HDDS-15643:
-------------------------------------

             Summary: ECFileChecksumHelper: redundant OM lookupKey RPC and 
per-file gRPC connection creation for EC checksum collection
                 Key: HDDS-15643
                 URL: https://issues.apache.org/jira/browse/HDDS-15643
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Andrey Yarovoy



*Description:*

Checksum collection for EC files has three structural inefficiencies that make 
each file's cost far higher than necessary. All three are present in the 
current code and compound under any non-trivial OM latency.

*Bug 1: Double {{lookupKey}} RPC per file ({{BaseFileChecksumHelper}})*

The 7-arg constructor (which accepts a pre-fetched {{OmKeyInfo}}) delegates 
to {{this(6-arg)}}. The 6-arg constructor calls {{fetchBlocks()}} before 
returning, and {{fetchBlocks()}} checks {{if (keyInfo == null)}} to decide 
whether to issue a {{lookupKey}} RPC. Because {{this.keyInfo = keyInfo}} 
executes only after the delegation returns, {{keyInfo}} is always null at the 
time of that check, so a redundant {{lookupKey}} is fired for every file 
regardless of whether the caller already supplied one.
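
The ordering problem can be reproduced in isolation. The following is a 
minimal sketch, not the actual Ozone code: {{HelperSketch}}, 
{{FixedHelperSketch}}, the {{String}} stand-in for {{OmKeyInfo}} and the 
lookup counters are all hypothetical names introduced for illustration. The 
"fixed" variant shows one possible restructuring (assign the field before 
fetching), which is an assumption and not necessarily the change that will be 
made for this issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the helper described above (not the Ozone class).
class HelperSketch {
  // Counts simulated lookupKey RPCs.
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  private String keyInfo; // stands in for OmKeyInfo

  // Analogue of the 6-arg constructor: fetches blocks before returning.
  HelperSketch(String keyName) {
    fetchBlocks(keyName);
  }

  // Analogue of the 7-arg constructor: accepts pre-fetched key info.
  HelperSketch(String keyName, String prefetched) {
    this(keyName);             // fetchBlocks() runs here; keyInfo is still null
    this.keyInfo = prefetched; // assigned only after the lookup already fired
  }

  private void fetchBlocks(String keyName) {
    if (keyInfo == null) {
      LOOKUPS.incrementAndGet(); // simulated redundant lookupKey RPC
      keyInfo = "looked-up:" + keyName;
    }
  }
}

// One possible restructuring (an assumption): set the field first, then fetch.
class FixedHelperSketch {
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  private String keyInfo;

  FixedHelperSketch(String keyName) {
    this(keyName, null);
  }

  FixedHelperSketch(String keyName, String prefetched) {
    this.keyInfo = prefetched; // visible to fetchBlocks() below
    fetchBlocks(keyName);
  }

  private void fetchBlocks(String keyName) {
    if (keyInfo == null) {
      LOOKUPS.incrementAndGet();
      keyInfo = "looked-up:" + keyName;
    }
  }
}

class DelegationOrderDemo {
  public static void main(String[] args) {
    new HelperSketch("vol/bucket/key", "prefetched");
    new FixedHelperSketch("vol/bucket/key", "prefetched");
    System.out.println("buggy lookups = " + HelperSketch.LOOKUPS.get());      // 1
    System.out.println("fixed lookups = " + FixedHelperSketch.LOOKUPS.get()); // 0
  }
}
```

In the buggy variant the pre-fetched value is supplied but the lookup still 
fires once per constructed helper, which matches the behaviour described 
above.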

*Bug 2: New gRPC connection opened for every file ({{ECFileChecksumHelper}})*

{{getChunkInfos()}} builds a 3-node STANDALONE pipeline to read the stripe 
checksum (replica index 1 plus the two parity nodes). It calls 
{{pipeline.toBuilder().setNodes(nodes).build()}}. 
{{Pipeline.Builder.setNodes()}} detects that the 3-node set differs from the 
5-node EC {{nodeStatus}} and unconditionally calls 
{{PipelineID.randomId()}}, generating a fresh random UUID per file. Since 
{{XceiverClientManager}} keys its gRPC connection cache on pipeline ID, the 
cache never hits and a new connection is opened for every file.
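
The cache-miss effect can be illustrated with a small model. This is a sketch 
under stated assumptions, not {{XceiverClientManager}}'s real API: 
{{ConnectionCacheSketch}}, {{acquire()}} and {{stableId()}} are hypothetical 
names, and connections are represented by plain strings. Deriving a 
deterministic ID from the node set is shown only as one conceivable way to 
make the cache key repeatable; the issue text does not say which fix is 
intended.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical model of a connection cache keyed on pipeline ID.
class ConnectionCacheSketch {
  private final Map<UUID, String> cache = new HashMap<>();
  int opened = 0; // number of simulated connections created

  // Returns the cached connection for this ID, opening one on a miss.
  String acquire(UUID pipelineId) {
    return cache.computeIfAbsent(pipelineId, id -> {
      opened++;
      return "conn-" + id;
    });
  }

  // Assumption: a name-based UUID derived from the node list, so the same
  // nodes always map to the same cache key.
  static UUID stableId(List<String> nodes) {
    byte[] bytes = String.join(",", nodes).getBytes(StandardCharsets.UTF_8);
    return UUID.nameUUIDFromBytes(bytes);
  }
}

class PipelineIdCacheDemo {
  public static void main(String[] args) {
    List<String> nodes = List.of("dn1", "dn4", "dn5"); // replica 1 + 2 parity
    int files = 100;

    // Random ID per file: every acquire() is a cache miss.
    ConnectionCacheSketch randomIds = new ConnectionCacheSketch();
    for (int i = 0; i < files; i++) {
      randomIds.acquire(UUID.randomUUID());
    }

    // Stable ID per node set: only the first acquire() opens a connection.
    ConnectionCacheSketch stableIds = new ConnectionCacheSketch();
    for (int i = 0; i < files; i++) {
      stableIds.acquire(ConnectionCacheSketch.stableId(nodes));
    }

    System.out.println("random IDs opened = " + randomIds.opened); // 100
    System.out.println("stable IDs opened = " + stableIds.opened); // 1
  }
}
```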


