[ 
https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337079#comment-14337079
 ] 

Alan Gates commented on HIVE-9768:
----------------------------------

I'll leave the question of whether LLAP should cache metadata to the people 
building LLAP, though I think it only needs to cache the stats and security 
info, not the catalog data.  Stats, at least, have a lower freshness 
requirement.

As for allowing external entities to cache metadata and find out when it's 
invalid, I agree there are uses for that.  The metastore already has a listener 
interface through which it can fire events any time a DDL operation happens.  
It seems you could hook into this and build a cache notifier system that allows 
caching entities to register themselves.  A listener would inform the cache 
notifier every time there was a DDL event, and the cache notifier would then 
send out notices to the relevant caching entities.
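
To make the shape of that idea concrete, here is a rough sketch, written in 
Python for brevity rather than against the metastore's real listener API.  
Every name in it (DdlEvent, CacheNotifier, NotifyingListener, on_ddl) is 
hypothetical and only illustrates the register/notify flow described above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class DdlEvent:
    """Hypothetical stand-in for a DDL event fired by the metastore."""
    kind: str       # e.g. "ALTER_TABLE" or "DROP_TABLE"
    database: str
    table: str


# A caching entity registers a callback that receives the event.
Callback = Callable[[DdlEvent], None]


class CacheNotifier:
    """Hypothetical registry that fans DDL events out to caching entities."""

    def __init__(self) -> None:
        # Maps "database.table" to the callbacks interested in that table.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def register(self, database: str, table: str, callback: Callback) -> None:
        """Let a caching entity ask for notices about one table."""
        self._subscribers[f"{database}.{table}"].append(callback)

    def notify(self, event: DdlEvent) -> int:
        """Send the event to every registered entity; return how many were told."""
        key = f"{event.database}.{event.table}"
        callbacks = list(self._subscribers.get(key, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)


class NotifyingListener:
    """Hypothetical listener the metastore would invoke on each DDL operation."""

    def __init__(self, notifier: CacheNotifier) -> None:
        self._notifier = notifier

    def on_ddl(self, event: DdlEvent) -> int:
        # Forward the event so the notifier can reach the relevant caches.
        return self._notifier.notify(event)
```

A caching entity would call register() with a callback that drops its cached 
entry for that table, so the notifier only has to deliver events and never 
needs to know how any particular cache stores its data.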

> Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata 
> refresh/invalidate command
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9768
>                 URL: https://issues.apache.org/jira/browse/HIVE-9768
>             Project: Hive
>          Issue Type: New Feature
>          Components: HCatalog, Metastore, Query Planning, Query Processor
>    Affects Versions: llap
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>
> Feature request for Hive LLAP to preload table metadata across all running 
> nodes to reduce query latency (this is what Impala does).
> The design decision behind this in Impala was to avoid the latency overhead 
> of fetching the metadata at query time, since that's an extra database query 
> (or possibly an HBase query in future, HIVE-9452) that must be completely 
> fulfilled before the Hive LLAP query even starts to run, which would slow 
> down the response to the user if not pre-loaded. Also, any temporary outage 
> of the metadata layer would affect the speed of the LLAP layer, so 
> pre-loading and caching the metadata adds resilience against this.
> This pre-loaded metadata also requires a cluster-wide "refresh metadata" 
> operation, something Impala added later and now calls "INVALIDATE METADATA" 
> in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive 
> command instead.
> (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013)
> Regards,
> Hari Sekhon
> ex-Cloudera
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
