Andrey Yarovoy created HDDS-15678:
-------------------------------------

             Summary: OFS isDirectory triggers a full GetFileStatus OM RPC 
including pipeline refresh and block locations
                 Key: HDDS-15678
                 URL: https://issues.apache.org/jira/browse/HDDS-15678
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Andrey Yarovoy


*Problem*

{{BasicRootedOzoneFileSystem.isDirectory(path)}} delegates to Hadoop's base 
{{FileSystem.isDirectory}}, which calls {{getFileStatus(path)}} and checks 
the result type. For OFS key paths this triggers a full OM {{GetFileStatus}} 
RPC. The OM handler does two things that are unnecessary for a type-only check:
 # For file paths: calls {{refresh(fileKeyInfo)}} — contacts SCM to update 
pipeline/block location info. This is an extra inter-service round-trip (OM → 
SCM) on every probe.
 # Returns complete {{OmKeyInfo}} with {{keyLocationVersions}} (block IDs, 
pipeline, tokens) in the gRPC response — the client deserializes all of it and 
discards it because {{isDirectory}} only inspects the {{isDirectory}} flag.
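
The call path above can be modelled in a few lines. This is a self-contained sketch, not Ozone code: {{FakeOm}}, {{KeyStatus}}, the trailing-slash rule for directories, and the counters are hypothetical stand-ins for the OM handler and {{OmKeyInfo}}. Only the shape (a type check that goes through a full status lookup) is taken from the description.

```java
import java.util.List;

/** Hypothetical stand-in for the status returned by the OM; not the real OmKeyInfo. */
final class KeyStatus {
    final boolean directory;
    final List<String> blockLocations; // models keyLocationVersions

    KeyStatus(boolean directory, List<String> blockLocations) {
        this.directory = directory;
        this.blockLocations = blockLocations;
    }
}

/** Hypothetical OM model that counts the work done per lookup. */
final class FakeOm {
    int scmRefreshCalls = 0;
    int locationsReturned = 0;

    KeyStatus getFileStatus(String path) {
        // Assumption for the sketch only: a trailing slash marks a directory.
        if (path.endsWith("/")) {
            return new KeyStatus(true, List.of());
        }
        scmRefreshCalls++; // models refresh(fileKeyInfo): OM -> SCM round-trip
        List<String> locations = List.of("block-1", "block-2");
        locationsReturned += locations.size();
        return new KeyStatus(false, locations);
    }
}

final class IsDirectoryProbeSketch {
    /** Same shape as the base-class check: fetch the full status, read one flag. */
    static boolean isDirectory(FakeOm om, String path) {
        return om.getFileStatus(path).directory;
    }

    public static void main(String[] args) {
        FakeOm om = new FakeOm();
        System.out.println("isDirectory: " + isDirectory(om, "/vol/bucket/file"));
        System.out.println("SCM refreshes: " + om.scmRefreshCalls);
        System.out.println("locations fetched and discarded: " + om.locationsReturned);
    }
}
```

In this model a single probe of a file path costs one simulated SCM refresh and returns two block locations that the caller never reads.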

*Root cause*

{{OmKeyArgs}} has an {{isHeadOp}} flag that already controls the pipeline 
refresh: when {{isHeadOp=true}}, {{refresh()}} and {{sortDatanodes()}} are 
skipped on the OM side. However:
 * {{getFileStatus}} in {{BasicRootedOzoneClientAdapterImpl}} never sets 
{{isHeadOp=true}}.
 * There is no way to tell the OM to omit block location data from the 
response, so it is always serialized and returned even when the caller has no 
use for it.
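
The two points above can be illustrated with a small model. This is a sketch under stated assumptions, not the OM implementation: {{ProbeArgs}}, {{ProbeResponse}}, and {{HeadOpGateSketch}} are hypothetical names; only the {{isHeadOp}} gate on {{refresh()}} / {{sortDatanodes()}} and the always-present location payload come from the description.

```java
import java.util.List;

/** Hypothetical stand-in for OmKeyArgs; only the isHeadOp flag is modelled. */
final class ProbeArgs {
    final String key;
    final boolean headOp;

    ProbeArgs(String key, boolean headOp) {
        this.key = key;
        this.headOp = headOp;
    }
}

/** Hypothetical response; blockLocations models keyLocationVersions. */
final class ProbeResponse {
    final List<String> blockLocations;

    ProbeResponse(List<String> blockLocations) {
        this.blockLocations = blockLocations;
    }
}

final class HeadOpGateSketch {
    static int refreshCalls = 0;

    static ProbeResponse lookup(ProbeArgs args) {
        if (!args.headOp) {
            refreshCalls++; // models refresh() and sortDatanodes() on the OM side
        }
        // No request flag exists to omit locations, so they are always returned.
        return new ProbeResponse(List.of("block-1", "block-2"));
    }

    public static void main(String[] args) {
        lookup(new ProbeArgs("file", false)); // how getFileStatus behaves today
        ProbeResponse head = lookup(new ProbeArgs("file", true));
        System.out.println("refresh calls: " + refreshCalls);
        System.out.println("locations returned with headOp: " + head.blockLocations.size());
    }
}
```

In the model, setting the flag removes the refresh but leaves the response size unchanged, which is why both bullets have to be addressed for a type-only check to become cheap.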


