[
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Naveen Gangam reassigned HIVE-21718:
------------------------------------
> Improve performance of UpdateInputAccessTimeHook
> ------------------------------------------------
>
> Key: HIVE-21718
> URL: https://issues.apache.org/jira/browse/HIVE-21718
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Affects Versions: 2.1.1
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Priority: Major
>
> Currently, Hive does not update the lastAccessTime property for any entities
> when a query accesses them. Thus it is not possible to know when a table was
> last accessed.
> Hive does provide a configurable hook for HS2 that runs as a pre-query
> hook before the query is executed. However, this hook is inefficient
> because for each table or partition whose access time it updates, it
> executes an "alter table ... " command internally. This is bad because:
> 1) For a query touching thousands of partitions, this hook takes a very
> long time to update them.
> 2) Meanwhile, it is holding up the original query from executing.
> So even though we do not recommend using the hook, because the reward is too
> little (having lastAccessTime updated), we realize there is no other means to
> achieve this.
> However, we can improve the performance of the hook significantly by adding a
> new thrift API on HMS that updates the lastAccessTime on the database rows
> directly, instead of going through the HMS front end one entity at a time
> (leading to thousands of HMS calls, each of which leads to multiple database
> calls).
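
To illustrate the difference in round trips between the two approaches described above, here is a minimal sketch. It does not use the real Hive or HMS APIs: FakeMetastore, updateOne, and updateBatch are hypothetical names invented for this example, and the in-memory map only stands in for the metastore database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AccessTimeSketch {
    /** Hypothetical stand-in for the metastore; it only counts round trips. */
    static class FakeMetastore {
        final Map<String, Long> lastAccessTime = new HashMap<>();
        int calls = 0;

        // Models the current hook: one metastore call per table or partition.
        void updateOne(String entity, long time) {
            calls++;
            lastAccessTime.put(entity, time);
        }

        // Models the proposed API (assumed shape): one call for the whole batch.
        void updateBatch(List<String> entities, long time) {
            calls++;
            for (String entity : entities) {
                lastAccessTime.put(entity, time);
            }
        }
    }

    // Updates every entity individually and returns the number of calls made.
    static int perEntity(FakeMetastore ms, List<String> entities, long now) {
        for (String entity : entities) {
            ms.updateOne(entity, now);
        }
        return ms.calls;
    }

    // Updates all entities in a single call and returns the number of calls made.
    static int batched(FakeMetastore ms, List<String> entities, long now) {
        ms.updateBatch(entities, now);
        return ms.calls;
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            partitions.add("sales/ds=" + i); // made-up partition names
        }
        long now = System.currentTimeMillis() / 1000L;

        System.out.println("per-entity calls: "
                + perEntity(new FakeMetastore(), partitions, now)); // 1000
        System.out.println("batched calls: "
                + batched(new FakeMetastore(), partitions, now));   // 1
    }
}
```

Both paths leave the same lastAccessTime values behind; only the number of calls differs, which is the cost the proposal aims to cut.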