[
https://issues.apache.org/jira/browse/IMPALA-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-10117.
------------------------------------
Fix Version/s: Impala 4.0
Resolution: Fixed
> Skip calls to FsPermissionCache for blob stores
> -----------------------------------------------
>
> Key: IMPALA-10117
> URL: https://issues.apache.org/jira/browse/IMPALA-10117
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Sahil Takiar
> Assignee: Tim Armstrong
> Priority: Major
> Labels: performance
> Fix For: Impala 4.0
>
>
> The {{FsPermissionCache}} is described as:
> {code:java}
> /**
> * Simple non-thread-safe cache for resolved file permissions. This allows
> * pre-caching permissions by listing the status of all files within a directory,
> * and then using that cache to avoid round trips to the FileSystem for later
> * queries of those paths.
> */ {code}
> I confirmed that {{FsPermissionCache#precacheChildrenOf}} is in fact called
> for data stored on S3. However, {{FsPermissionCache#getPermissions}} is
> called inside {{HdfsTable#getAvailableAccessLevel}}, which is skipped for
> S3, so none of the cached metadata is ever read. Meanwhile,
> {{precacheChildrenOf}} calls {{getFileStatus}} for all files, which results
> in many unnecessary metadata operations against S3 and in cached metadata
> that is never used.
> {{precacheChildrenOf}} is actually only invoked in the specific scenario
> described below:
> {code}
> // Only preload permissions if the number of partitions to be added is
> // large (3x) relative to the number of existing partitions. This covers
> // two common cases:
> //
> // 1) initial load of a table (no existing partition metadata)
> // 2) ALTER TABLE RECOVER PARTITIONS after creating a table pointing to
> // an already-existing partition directory tree
> //
> // Without this heuristic, we would end up using a "listStatus" call to
> // potentially fetch a bunch of irrelevant information about existing
> // partitions when we only want to know about a small number of newly-added
> // partitions.
> {code}
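For readers who want the heuristic in the comment above spelled out, here is a minimal sketch. The class, method, and constant names are hypothetical, and the strict "greater than" comparison is an assumption; the quoted comment only says the number of new partitions must be large (3x) relative to the existing ones.

```java
/** Illustrative sketch only; not the actual Impala code. */
final class PrecacheHeuristic {
  // Assumed ratio, taken from the "(3x)" in the quoted comment.
  private static final int NEW_TO_EXISTING_RATIO = 3;

  /**
   * Returns true when the number of partitions being added is large
   * relative to the number of partitions that already have metadata.
   */
  static boolean shouldPrecache(int numNewPartitions, int numExistingPartitions) {
    // Use long arithmetic so the multiplication cannot overflow.
    return (long) numNewPartitions
        > (long) numExistingPartitions * NEW_TO_EXISTING_RATIO;
  }
}
```

Under these assumptions, an initial table load (no existing partitions, some new ones) passes the check, while adding a handful of partitions to a table that already has many does not.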
> Regardless, skipping the call to {{precacheChildrenOf}} for blob stores
> should (1) improve table loading time for S3-backed tables, and (2) decrease
> catalogd memory requirements when loading many tables stored on S3.
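A rough sketch of what such a guard could look like, assuming the decision is made from the table location's URI scheme. Everything here is hypothetical: the class and method names, the list of schemes treated as blob stores, and the 3x check reused from the comment quoted in the description. It is not the change that was committed for this issue.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative sketch only; not the actual Impala code. */
final class BlobStoreGuard {
  // Assumed set of blob-store URI schemes, chosen for illustration.
  private static final Set<String> BLOB_STORE_SCHEMES =
      new HashSet<>(Arrays.asList("s3a", "s3n", "abfs", "abfss", "adl"));

  /** Returns true if the location's scheme is in the assumed blob-store set. */
  static boolean isBlobStore(URI location) {
    String scheme = location.getScheme();
    return scheme != null
        && BLOB_STORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
  }

  /**
   * Skips permission pre-caching entirely for blob stores; otherwise falls
   * back to the "new partitions > 3x existing partitions" heuristic.
   */
  static boolean shouldPrecachePermissions(
      URI tableLocation, int numNewPartitions, int numExistingPartitions) {
    if (isBlobStore(tableLocation)) return false;
    return (long) numNewPartitions > 3L * numExistingPartitions;
  }
}
```

With this shape, a table at an {{s3a://}} location never triggers the pre-cache, while an {{hdfs://}} table still follows the existing heuristic.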