Andrey Yarovoy created HDDS-15678:
-------------------------------------
Summary: OFS isDirectory triggers a full GetFileStatus OM RPC
including pipeline refresh and block locations
Key: HDDS-15678
URL: https://issues.apache.org/jira/browse/HDDS-15678
Project: Apache Ozone
Issue Type: Bug
Reporter: Andrey Yarovoy
*Problem*
{{BasicRootedOzoneFileSystem.isDirectory(path)}} delegates to Hadoop's base
{{FileSystem.isDirectory}}, which calls {{getFileStatus(path)}} and checks
the result type. For OFS key paths this triggers a full OM {{GetFileStatus}}
RPC. The OM handler does two things that are unnecessary for a type-only check:
# For file paths, it calls {{refresh(fileKeyInfo)}}, which contacts SCM to update
pipeline/block location info. This is an extra inter-service round-trip (OM →
SCM) on every probe.
# It returns the complete {{OmKeyInfo}} with {{keyLocationVersions}} (block IDs,
pipeline, tokens) in the gRPC response. The client deserializes all of it and
then discards it, because {{isDirectory}} only inspects the {{isDirectory}} flag.
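The cost described above can be illustrated with a minimal, self-contained sketch. It does not use the real Ozone or Hadoop classes: {{SketchStatus}}, {{SketchOm}}, {{SketchFs}} and the counters are hypothetical stand-ins, and the OM side is reduced to an in-memory map. It only assumes what the description states: the type check fetches a full status, and for files the handler refreshes locations and returns them.

```java
import java.io.FileNotFoundException;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the status returned by the OM (not the real OmKeyInfo).
final class SketchStatus {
    final boolean directory;
    final List<String> blockLocations; // stands in for keyLocationVersions

    SketchStatus(boolean directory, List<String> blockLocations) {
        this.directory = directory;
        this.blockLocations = blockLocations;
    }
}

// Hypothetical stand-in for the OM GetFileStatus handler.
final class SketchOm {
    private final Map<String, Boolean> namespace; // path -> isDirectory
    int scmRefreshCalls = 0;   // models OM -> SCM round-trips
    int locationsReturned = 0; // models location entries put on the wire

    SketchOm(Map<String, Boolean> namespace) {
        this.namespace = namespace;
    }

    SketchStatus getFileStatus(String path) throws FileNotFoundException {
        Boolean isDir = namespace.get(path);
        if (isDir == null) {
            throw new FileNotFoundException(path);
        }
        if (isDir) {
            return new SketchStatus(true, List.of());
        }
        scmRefreshCalls++; // models refresh(fileKeyInfo)
        List<String> locations = List.of("block-1", "block-2");
        locationsReturned += locations.size();
        return new SketchStatus(false, locations);
    }
}

// Hypothetical file system whose isDirectory fetches a full status
// and then looks at a single flag.
final class SketchFs {
    private final SketchOm om;

    SketchFs(SketchOm om) {
        this.om = om;
    }

    boolean isDirectory(String path) {
        try {
            return om.getFileStatus(path).directory; // locations are discarded
        } catch (FileNotFoundException e) {
            return false; // a missing path is not a directory
        }
    }
}

final class IsDirectoryProbeDemo {
    public static void main(String[] args) {
        SketchOm om = new SketchOm(Map.of("/vol/bucket/dir", true, "/vol/bucket/file", false));
        SketchFs fs = new SketchFs(om);
        System.out.println(fs.isDirectory("/vol/bucket/dir"));  // true
        System.out.println(fs.isDirectory("/vol/bucket/file")); // false
        System.out.println(om.scmRefreshCalls);                 // 1
    }
}
```

In this model, a single probe of a file path costs one simulated SCM refresh and two location entries, none of which the caller uses.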
*Root cause*
{{OmKeyArgs}} has an {{isHeadOp}} flag that already controls the pipeline
refresh: when {{isHeadOp=true}}, {{refresh()}} and {{sortDatanodes()}} are
skipped on the OM side. However:
* {{getFileStatus}} in {{BasicRootedOzoneClientAdapterImpl}} never sets
{{isHeadOp=true}}.
* There is no way to tell the OM to omit block location data from the
response, so it is always serialized and returned even when the caller has no
use for it.
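The two points above can be sketched as a toy model. This is not the OM implementation: {{HeadOpArgs}} and {{HeadOpOm}} are hypothetical names, and the handler is reduced to two counters and a fixed location list. It assumes only the behavior described in this issue: the head-op flag skips the refresh and datanode sort, while the location data is returned either way.

```java
import java.util.List;

// Hypothetical stand-in for the request arguments (not the real OmKeyArgs).
final class HeadOpArgs {
    final String key;
    final boolean headOp;

    HeadOpArgs(String key, boolean headOp) {
        this.key = key;
        this.headOp = headOp;
    }
}

// Hypothetical stand-in for the OM-side handler.
final class HeadOpOm {
    int refreshCalls = 0; // models refresh()
    int sortCalls = 0;    // models sortDatanodes()

    List<String> getFileStatus(HeadOpArgs args) {
        if (!args.headOp) {
            // Only non-head operations pay for the pipeline refresh and sort.
            refreshCalls++;
            sortCalls++;
        }
        // Location data is returned regardless of the flag: in this model
        // there is no request option to leave it out.
        return List.of("block-1", "block-2");
    }
}

final class HeadOpDemo {
    public static void main(String[] args) {
        HeadOpOm om = new HeadOpOm();
        List<String> full = om.getFileStatus(new HeadOpArgs("file", false));
        List<String> head = om.getFileStatus(new HeadOpArgs("file", true));
        System.out.println(om.refreshCalls); // 1
        System.out.println(full.size());     // 2
        System.out.println(head.size());     // 2
    }
}
```

Setting the flag removes the refresh cost in this model, but the response is the same size in both cases, which matches the second bullet.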