Michael Smith created IMPALA-15117:
--------------------------------------
Summary: Optimize getFirstLevelAcidDirPath
Key: IMPALA-15117
URL: https://issues.apache.org/jira/browse/IMPALA-15117
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 4.5.0
Reporter: Michael Smith
The issue described below could also apply to Hive.
When loading ACID table metadata, {{AcidUtils.getFirstLevelAcidDirPath}} checks
whether each candidate ACID subdirectory name (e.g. {{delta_*}}, {{base_*}},
{{delete_delta_*}}) exists under a partition directory by calling
{{FileSystem.isDirectory(path)}} for each candidate in a loop. On distributed
filesystems each call is a separate RPC.
This was observed via async-profiler on a production Impala catalogd connected
to Apache Ozone (OFS): 258 samples (94% blocked on RPC) were in the stack
{{AcidUtils.getFirstLevelAcidDirPath → FileSystem.isDirectory →
OzoneBucket.getFileStatus → OM gRPC call}}. For a partition with N candidate
paths, N separate {{GetFileStatus}} RPCs are issued where a single
{{listStatus}} call would suffice.
We could replace the N {{isDirectory(candidatePath)}} calls with a single
{{listStatus(partitionDir)}} call and resolve membership locally:
# Call {{fs.listStatus(partitionDir)}} once to get all immediate children with
their type flags.
# Build a local {{Set<String>}} of child directory names from the result.
# Filter candidate ACID names against the set — no additional RPCs.
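The three steps above can be sketched roughly as follows. This is only an illustration, not the actual Impala code: to keep it runnable without Hadoop on the classpath it uses {{java.nio.file}} as a stand-in for Hadoop's {{FileSystem}}, with {{Files.find}} at depth 1 playing the role of {{fs.listStatus(partitionDir)}}. The class name {{AcidDirLookupSketch}}, both method names, and the directory names in {{main}} are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: java.nio.file stands in for Hadoop's FileSystem here.
class AcidDirLookupSketch {

    // Steps 1 and 2: list the partition directory once and collect the names
    // of its immediate child directories. The directory flag comes from the
    // attributes supplied by the listing itself, not from a per-child lookup.
    static Set<String> listChildDirNames(Path partitionDir) throws IOException {
        try (Stream<Path> children = Files.find(partitionDir, 1,
                (p, attrs) -> attrs.isDirectory() && !p.equals(partitionDir))) {
            return children
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toSet());
        }
    }

    // Step 3: resolve membership locally; no further filesystem calls.
    // Returns the candidates that exist as directories, in candidate order.
    static List<String> existingAcidDirs(Path partitionDir, List<String> candidates)
            throws IOException {
        Set<String> childDirs = listChildDirNames(partitionDir);
        List<String> present = new ArrayList<>();
        for (String candidate : candidates) {
            if (childDirs.contains(candidate)) {
                present.add(candidate);
            }
        }
        return present;
    }

    public static void main(String[] args) throws IOException {
        Path partitionDir = Files.createTempDirectory("partition");
        Files.createDirectory(partitionDir.resolve("base_0000005"));
        Files.createDirectory(partitionDir.resolve("delta_0000001_0000001"));
        // A plain file with an ACID-looking name must not count as a directory.
        Files.createFile(partitionDir.resolve("delete_delta_0000002_0000002"));

        List<String> candidates = Arrays.asList(
                "base_0000005",
                "delta_0000001_0000001",
                "delete_delta_0000002_0000002",
                "delta_0000009_0000009");

        // Prints: [base_0000005, delta_0000001_0000001]
        System.out.println(existingAcidDirs(partitionDir, candidates));
    }
}
```

With the Hadoop API, the equivalent would presumably iterate over the {{FileStatus[]}} returned by {{listStatus}} and keep entries where {{isDirectory()}} is true. One trade-off to keep in mind is that the listing returns every child of the partition directory, including data files, so the benefit depends on N relative to the directory size.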
This reduces the cost from N RPCs to one per partition directory, regardless of
filesystem implementation. On object-store-backed filesystems (OFS, S3, ABFS),
where each {{isDirectory}} is a synchronous RPC, the reduction is especially
significant.