[
https://issues.apache.org/jira/browse/IMPALA-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang resolved IMPALA-9858.
------------------------------------
Fix Version/s: Impala 4.0
Resolution: Fixed
> Wrong partition hit/request metrics in profile of LocalCatalog
> --------------------------------------------------------------
>
> Key: IMPALA-9858
> URL: https://issues.apache.org/jira/browse/IMPALA-9858
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
> Fix For: Impala 4.0
>
>
> The LocalCatalog metrics "CatalogFetch.Partitions.Hits" and
> "CatalogFetch.Partitions.Requests" in the profile are overcounted. For the
> query "select * from functional.alltypes", where "functional.alltypes" has 24
> partitions, the partition metrics on a cold-started LocalCatalog coordinator
> are:
> {code:java}
> - CatalogFetch.Partitions.Hits: 48
> - CatalogFetch.Partitions.Misses: 24
> - CatalogFetch.Partitions.Requests: 72{code}
> Actually, only 48 requests are made on the local cache: 24 come from
> partition pruning and miss the cache, and the other 24 come from
> LocalFsTable.toThriftDescriptor() and hit the cache.
> The overcounting is due to a bug at
> [https://github.com/apache/impala/blob/f8c28f8adfd781727c311b15546a532ce65881e0/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L898]
> Code snippet:
> {code:java}
> public Map<String, PartitionMetadata> loadPartitionsByRefs(TableMetaRef
> table,
> ......
> final int numHits = refToMeta.size();
> final int numMisses = partitionRefs.size() - numHits;
> // Load the remainder from the catalogd.
> List<PartitionRef> missingRefs = new ArrayList<>();
> for (PartitionRef ref: partitionRefs) {
> if (!refToMeta.containsKey(ref)) missingRefs.add(ref);
> }
> if (!missingRefs.isEmpty()) {
> Map<PartitionRef, PartitionMetadata> fromCatalogd =
> loadPartitionsFromCatalogd(
> refImpl, hostIndex, missingRefs);
> refToMeta.putAll(fromCatalogd); // <---- refToMeta is updated here!
> // Write back to the cache.
> storePartitionsInCache(refImpl, hostIndex, fromCatalogd);
> }
> sw.stop();
> addStatsToProfile(PARTITIONS_STATS_CATEGORY, refToMeta.size(), numMisses,
> sw); // <--- Should use numHits instead of refToMeta.size()
> LOG.trace("Request for partitions of {}: hit {}/{}", table,
> refToMeta.size(),
> partitionRefs.size());
> {code}
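
The arithmetic in the description can be reproduced with a small standalone sketch. This is not Impala code: PartitionStatsSketch, Stats, loadPartitions and simulate are hypothetical names, the cache is a plain HashMap, and "Requests" is assumed to be the sum of hits and misses (which matches the 48 + 24 = 72 reported above). The "fixed" flag switches between reporting refToMeta.size() after the merge (the buggy behaviour) and the numHits value captured before it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionStatsSketch {
  // Hypothetical stand-in for the per-query profile counters.
  static final class Stats {
    int hits;
    int misses;
    // Assumption: Requests is derived as Hits + Misses.
    int requests() { return hits + misses; }
  }

  // Hypothetical local cache: partition ref -> partition metadata.
  private final Map<Integer, String> cache = new HashMap<>();

  Map<Integer, String> loadPartitions(List<Integer> refs, Stats stats, boolean fixed) {
    // Collect whatever is already cached.
    Map<Integer, String> refToMeta = new HashMap<>();
    for (Integer ref : refs) {
      String meta = cache.get(ref);
      if (meta != null) refToMeta.put(ref, meta);
    }
    // Both counts are taken before the map is topped up with fetched entries.
    final int numHits = refToMeta.size();
    final int numMisses = refs.size() - numHits;
    // "Fetch" the missing entries, merge them in, and write them back to the cache.
    for (Integer ref : refs) {
      if (!refToMeta.containsKey(ref)) {
        String meta = "partition-" + ref;
        refToMeta.put(ref, meta);
        cache.put(ref, meta);
      }
    }
    // Buggy variant: refToMeta.size() now also includes the misses.
    stats.hits += fixed ? numHits : refToMeta.size();
    stats.misses += numMisses;
    return refToMeta;
  }

  // Two passes over the same 24 partitions on a cold cache:
  // the first models partition pruning, the second toThriftDescriptor().
  static Stats simulate(boolean fixed) {
    PartitionStatsSketch provider = new PartitionStatsSketch();
    Stats stats = new Stats();
    List<Integer> refs = new ArrayList<>();
    for (int i = 0; i < 24; i++) refs.add(i);
    provider.loadPartitions(refs, stats, fixed);
    provider.loadPartitions(refs, stats, fixed);
    return stats;
  }

  public static void main(String[] args) {
    Stats buggy = simulate(false);
    Stats corrected = simulate(true);
    System.out.println("buggy: hits=" + buggy.hits + " misses=" + buggy.misses
        + " requests=" + buggy.requests());
    System.out.println("fixed: hits=" + corrected.hits + " misses=" + corrected.misses
        + " requests=" + corrected.requests());
  }
}
```

Under these assumptions the buggy variant yields 48 hits, 24 misses and 72 requests, as in the profile above, while the variant that reports numHits yields 24 hits, 24 misses and 48 requests.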