[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337079#comment-14337079 ]
Alan Gates commented on HIVE-9768:
----------------------------------
I'll leave the question of whether LLAP should cache metadata to the people
building LLAP, though I think it only needs to cache the stats and security
info, not catalog data. Stats at least have a lower freshness requirement.
As for allowing external entities to cache metadata and find out when it's
invalid, I agree there are uses for that. The metastore already has a listener
interface where it can fire events anytime a DDL operation happens. It seems
you could hook into this and build a cache notifier system that allows caching
entities to register themselves. A listener would inform the cache notifier
every time there was a DDL event, and the cache notifier would then send
notices to the relevant caching entities.
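A rough sketch of that idea, not a worked-out design. Everything named here
(CachingEntity, CacheNotifier, onDdlEvent, the table names and event strings)
is hypothetical and made up for illustration. In a real deployment the calls
to onDdlEvent would presumably come from a class extending the metastore's
MetaStoreEventListener, which is left out so the sketch stays self-contained.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback implemented by anything that caches metadata. */
interface CachingEntity {
    void invalidate(String tableName, String eventType);
}

/** Hypothetical notifier: caching entities register per table name. */
class CacheNotifier {
    // Table name -> entities that asked to hear about changes to that table.
    private final Map<String, List<CachingEntity>> subscribers =
            new ConcurrentHashMap<>();

    /** A caching entity registers interest in one table. */
    void register(String tableName, CachingEntity entity) {
        subscribers
                .computeIfAbsent(tableName, k -> new CopyOnWriteArrayList<>())
                .add(entity);
    }

    /** Assumed to be called by a metastore-side listener on each DDL event. */
    void onDdlEvent(String tableName, String eventType) {
        List<CachingEntity> interested =
                subscribers.getOrDefault(tableName, Collections.emptyList());
        for (CachingEntity entity : interested) {
            entity.invalidate(tableName, eventType);
        }
    }
}

class CacheNotifierDemo {
    public static void main(String[] args) {
        CacheNotifier notifier = new CacheNotifier();
        notifier.register("default.sales", (table, event) ->
                System.out.println("invalidate " + table + " after " + event));

        // The registered entity is told about this event.
        notifier.onDdlEvent("default.sales", "ALTER_TABLE");
        // Nobody registered for this table, so nothing is sent.
        notifier.onDdlEvent("default.other", "DROP_TABLE");
    }
}
```

Keying the registrations by table name means only the relevant caching
entities are notified. How a notice would actually travel to a remote
caching entity is deliberately left open.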
> Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata
> refresh/invalidate command
> -----------------------------------------------------------------------------------------------
>
> Key: HIVE-9768
> URL: https://issues.apache.org/jira/browse/HIVE-9768
> Project: Hive
> Issue Type: New Feature
> Components: HCatalog, Metastore, Query Planning, Query Processor
> Affects Versions: llap
> Environment: HDP 2.2
> Reporter: Hari Sekhon
>
> Feature request for Hive LLAP to preload table metadata across all running
> nodes to reduce query latency (this is what Impala does).
> The design decision behind this in Impala was to avoid the latency overhead
> of fetching the metadata at query time, since that's an extra database query
> (or possibly HBase query in future HIVE-9452) that must first be completely
> fulfilled before the Hive LLAP query even starts to run, which would slow
> down the response to the user if not pre-loaded. Also, any temporary outage
> of the metadata layer would affect the speed of the LLAP layer, so
> pre-loading and caching the metadata adds resilience against this.
> This pre-loaded metadata also requires a cluster-wide "refresh metadata"
> operation, something Impala added later, and now calls "INVALIDATE METADATA"
> in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive
> command instead.
> (FYI, I was in the first trio of Impala SMEs at Cloudera in early 2013.)
> Regards,
> Hari Sekhon
> ex-Cloudera
> http://www.linkedin.com/in/harisekhon