[ 
https://issues.apache.org/jira/browse/HIVE-18264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18264:
------------------------------------
    Description: 
Currently, partitions and partition column statistics are kept in separate
caches, so some calls have to iterate through each of them to retrieve or
update an entry. For example, modifying a partition column statistic currently
requires locking the table, partition, and partition column statistics caches,
which are all separate hash maps. We can get better performance by organizing
the caches hierarchically: each table gets its own partition, partition column
statistics, and table column statistics caches. This also improves
concurrency, since instead of locking the whole cache we can lock only the
affected table's cache and modify multiple tables in parallel.
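
The hierarchical layout described above can be illustrated with a minimal
sketch. This is not the code in the attached patches: the class names
(TableCacheSketch, HierarchicalCacheSketch), the String/Long payloads, and the
choice of a read-write lock per table are all assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical per-table wrapper: owns the table's partitions and column
// statistics, all guarded by a single lock that belongs to this table only.
class TableCacheSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // partition name -> partition payload (a String stands in for the real object)
    private final Map<String, String> partitions = new HashMap<>();
    // "partitionName/columnName" -> partition column statistic
    private final Map<String, Long> partitionColStats = new HashMap<>();
    // column name -> table-level column statistic
    private final Map<String, Long> tableColStats = new HashMap<>();

    void putPartition(String partName, String payload) {
        lock.writeLock().lock();
        try {
            partitions.put(partName, payload);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Updating a partition column stat takes only this table's lock, so
    // updates to other tables are not blocked. Returns false if the
    // partition is not cached.
    boolean updatePartitionColStat(String partName, String col, long value) {
        lock.writeLock().lock();
        try {
            if (!partitions.containsKey(partName)) {
                return false;
            }
            partitionColStats.put(partName + "/" + col, value);
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    Long getPartitionColStat(String partName, String col) {
        lock.readLock().lock();
        try {
            return partitionColStats.get(partName + "/" + col);
        } finally {
            lock.readLock().unlock();
        }
    }

    void putTableColStat(String col, long value) {
        lock.writeLock().lock();
        try {
            tableColStats.put(col, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}

// Hypothetical top-level cache: maps "db.table" to its per-table wrapper.
class HierarchicalCacheSketch {
    private final ConcurrentHashMap<String, TableCacheSketch> tables =
        new ConcurrentHashMap<>();

    TableCacheSketch table(String dbName, String tblName) {
        return tables.computeIfAbsent(dbName + "." + tblName,
            k -> new TableCacheSketch());
    }
}
```

In this sketch, two threads that update statistics for different tables
contend only on the top-level concurrent map lookup, not on a shared lock.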

In addition, the prewarm mechanism currently populates all the caches
initially (skipping tables that do not pass the whitelist/blacklist filter),
and it is a blocking call. This patch also makes prewarm non-blocking, so that
calls for tables that are already cached are served from memory and calls for
tables that are not yet cached are served from the RDBMS.
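
The non-blocking prewarm behaviour can be sketched roughly as follows. Again
this is only an illustration under stated assumptions, not the patch itself:
PrewarmSketch, prewarmAsync, getTable, and the loader function standing in for
the RDBMS-backed store are hypothetical names, and a single predicate stands
in for the whitelist/blacklist filter.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of a cache whose prewarm runs in the background.
class PrewarmSketch {
    // table name -> cached metadata (a String stands in for the real object)
    private final Map<String, String> cached = new ConcurrentHashMap<>();
    // Stand-in for a lookup against the backing RDBMS.
    private final Function<String, String> rdbmsLoader;

    PrewarmSketch(Function<String, String> rdbmsLoader) {
        this.rdbmsLoader = rdbmsLoader;
    }

    // Starts prewarm on a background thread and returns immediately.
    // Tables rejected by the filter are never cached.
    CompletableFuture<Void> prewarmAsync(List<String> tableNames,
                                         Predicate<String> filter) {
        return CompletableFuture.runAsync(() -> {
            for (String name : tableNames) {
                if (filter.test(name)) {
                    cached.put(name, rdbmsLoader.apply(name));
                }
            }
        });
    }

    // Serves from memory when the table is already cached; otherwise
    // falls through to the RDBMS without waiting for prewarm to finish.
    String getTable(String name) {
        String hit = cached.get(name);
        return hit != null ? hit : rdbmsLoader.apply(name);
    }

    boolean isCached(String name) {
        return cached.containsKey(name);
    }
}
```

A caller that arrives before prewarm has reached its table still gets an
answer, just from the slower path; in this sketch the fall-through read does
not populate the cache, which is a simplification.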

  was:Currently we have a separate cache for partitions and partition col stats 
which results in some calls iterating through each of these for 
retrieving/updating. We can get better performance by organizing 
hierarchically. We should also make prewarm non-blocking


> CachedStore: Store cached partitions/col stats within the table cache and 
> make prewarm non-blocking
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18264
>                 URL: https://issues.apache.org/jira/browse/HIVE-18264
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-18264.1.patch, HIVE-18264.2.patch, 
> HIVE-18264.3.patch, HIVE-18264.4.patch, HIVE-18264.5.patch
>
>
> Currently, partitions and partition column statistics are kept in separate
> caches, so some calls have to iterate through each of them to retrieve or
> update an entry. For example, modifying a partition column statistic
> currently requires locking the table, partition, and partition column
> statistics caches, which are all separate hash maps. We can get better
> performance by organizing the caches hierarchically: each table gets its own
> partition, partition column statistics, and table column statistics caches.
> This also improves concurrency, since instead of locking the whole cache we
> can lock only the affected table's cache and modify multiple tables in
> parallel.
>
> In addition, the prewarm mechanism currently populates all the caches
> initially (skipping tables that do not pass the whitelist/blacklist filter),
> and it is a blocking call. This patch also makes prewarm non-blocking, so
> that calls for tables that are already cached are served from memory and
> calls for tables that are not yet cached are served from the RDBMS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
