[
https://issues.apache.org/jira/browse/HIVE-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Soumyakanti Das updated HIVE-28094:
-----------------------------------
Description:
Currently we cache calls to the {{getTableInternal}} method in the HMS client
cache and the query cache. We also cache table ids in the query cache, but not
in the HMS client cache.
To cache {{getTableInternal}}, we create a CacheKey containing the
{{GetTableRequest}} object. However, we do not check whether all the necessary
fields are set in the key. This results in a lot of cache misses, especially
because we rely on {{validWriteIdList}} not being null and {{tableId}} not
being -1. The {{GetTableRequest}} object also contains {{catName}}, which is
not always set. Together, these issues create duplicate keys and prevent the
caches from being used efficiently.
Moreover, {{getTableInternal}} is called from other APIs that are cached,
e.g. {{getPartitionsByExprInternal}}, so improving its performance will
benefit those APIs too.
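The duplicate-key problem described above can be illustrated with a minimal, language-neutral sketch in Python. It is not the Hive implementation: {{TableRequestKey}}, {{normalize}}, {{unset_fields}}, and the default catalog name "hive" are assumptions made for illustration only. It shows how two requests for the same table produce different keys when {{catName}} is unset on one of them, and how filling in the field before building the key makes them collide as intended.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Assumption for this sketch: the catalog used when catName is not set.
DEFAULT_CATALOG = "hive"


@dataclass(frozen=True)
class TableRequestKey:
    """Hypothetical stand-in for a cache key wrapping a GetTableRequest."""
    db_name: str
    tbl_name: str
    cat_name: Optional[str] = None
    valid_write_id_list: Optional[str] = None
    table_id: int = -1


def unset_fields(key: TableRequestKey) -> List[str]:
    """Return the names of key fields that are still at their 'unset' value."""
    missing = []
    if key.cat_name is None:
        missing.append("cat_name")
    if key.valid_write_id_list is None:
        missing.append("valid_write_id_list")
    if key.table_id == -1:
        missing.append("table_id")
    return missing


def normalize(key: TableRequestKey,
              default_catalog: str = DEFAULT_CATALOG) -> TableRequestKey:
    """Fill in an unset catalog name so equivalent requests share one key."""
    if key.cat_name is None:
        return replace(key, cat_name=default_catalog)
    return key


# Two requests for the same table; only one of them sets the catalog name.
a = TableRequestKey("tpcds", "store_sales", None, "writeIds", 7)
b = TableRequestKey("tpcds", "store_sales", "hive", "writeIds", 7)

cache = {normalize(a): "table metadata"}
print(a == b)                 # raw keys differ, so a lookup with b would miss
print(normalize(b) in cache)  # normalized keys match, so the lookup hits
```

A check like {{unset_fields}} could be run before a key is built, so that incomplete requests are either completed or kept out of the cache instead of creating near-duplicate entries.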
RESULTS:
I ran all TPCDS explain cbo queries on my local machine, after cherry-picking
[HIVE-28083: Enable HMS client cache and HMS query cache for Explain
plans|https://github.com/apache/hive/pull/5092/commits/41a766d6a51480edb505fd53661a03c63ef3937a].
Then I analyzed the logs with a simple python script to get min, 25th
percentile, median, 75th percentile, and max for PERFLOG logs with this pattern:
{code:java}
</PERFLOG method=(\w+) start=\d+ end=\d+ duration=(\d+) from=.* HS2-cache>
{code}
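For reference, an analysis of this kind could look roughly like the sketch below. It is not the script used for this issue: the function name {{summarize}}, the sample log lines, and the choice of the "inclusive" percentile method are assumptions. It groups durations by method name using the pattern above and reports min, 25th percentile, median, 75th percentile, and max.

```python
import re
from collections import defaultdict
from statistics import quantiles

# The PERFLOG pattern from the description: group 1 is the method name,
# group 2 is the duration.
PERFLOG = re.compile(
    r"</PERFLOG method=(\w+) start=\d+ end=\d+ duration=(\d+) from=.* HS2-cache>"
)


def summarize(lines):
    """Map each method to (min, p25, median, p75, max) of its durations."""
    durations = defaultdict(list)
    for line in lines:
        match = PERFLOG.search(line)
        if match:
            durations[match.group(1)].append(int(match.group(2)))

    summary = {}
    for method, values in durations.items():
        values.sort()
        if len(values) > 1:
            # Assumption: "inclusive" quartiles; other methods give
            # slightly different percentile values.
            p25, median, p75 = quantiles(values, n=4, method="inclusive")
        else:
            p25 = median = p75 = values[0]
        summary[method] = (values[0], p25, median, p75, values[-1])
    return summary


# Synthetic log lines for illustration only; these are not measured results.
sample = [
    f"</PERFLOG method=getTableInternal start=0 end={d} duration={d} "
    f"from=example.Caller HS2-cache>"
    for d in (1, 2, 3, 4, 5)
]

for method, stats in summarize(sample).items():
    print(method, stats)
```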
Here are the results.
Without
was:
Currently we cache calls to {{getTableInternal}} method in HMS client cache and
query cache. We also cache table ids in the query cache, but not in the HMS
client cache.
To cache {{getTableInternal}}, we create a CacheKey containing the
{{GetTableRequest}} object. However, we do not check if all the necessary
fields are set in the key. This results in a lot of cache misses, especially
because we rely on {{validWriteIdList}} not being null and {{tableId}} not
being -1. {{GetTableRequest}} object also contains `catName` which is not
always set. All these things result in creating duplicate keys and not using
the caches efficiently.
Moreover, {{getTableInternal}} is called from other APIs that are getting
cached, e.g. {{getPartitionsByExprInternal}}, so improvements in its
performance will positively affect other APIs too.
> Improve HMS client cache and query cache performance for getTableInternal
> -------------------------------------------------------------------------
>
> Key: HIVE-28094
> URL: https://issues.apache.org/jira/browse/HIVE-28094
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Affects Versions: 4.0.0-beta-1
> Reporter: Soumyakanti Das
> Assignee: Soumyakanti Das
> Priority: Major
>
> Currently we cache calls to the {{getTableInternal}} method in the HMS client
> cache and the query cache. We also cache table ids in the query cache, but
> not in the HMS client cache.
>
> To cache {{getTableInternal}}, we create a CacheKey containing the
> {{GetTableRequest}} object. However, we do not check whether all the
> necessary fields are set in the key. This results in a lot of cache misses,
> especially because we rely on {{validWriteIdList}} not being null and
> {{tableId}} not being -1. The {{GetTableRequest}} object also contains
> {{catName}}, which is not always set. Together, these issues create
> duplicate keys and prevent the caches from being used efficiently.
>
> Moreover, {{getTableInternal}} is called from other APIs that are cached,
> e.g. {{getPartitionsByExprInternal}}, so improving its performance will
> benefit those APIs too.
>
> RESULTS:
> I ran all TPCDS explain cbo queries on my local machine, after cherry-picking
> [HIVE-28083: Enable HMS client cache and HMS query cache for Explain
> plans|https://github.com/apache/hive/pull/5092/commits/41a766d6a51480edb505fd53661a03c63ef3937a].
> Then I analyzed the logs with a simple python script to get min, 25th
> percentile, median, 75th percentile, and max for PERFLOG logs with this
> pattern:
> {code:java}
> </PERFLOG method=(\w+) start=\d+ end=\d+ duration=(\d+) from=.* HS2-cache>
> {code}
> Here are the results.
> Without