[ 
https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272004#comment-17272004
 ] 

Sebastian Liu commented on FLINK-20416:
---------------------------------------

Hi [~jark],

Thanks a lot for your further suggestions. I hadn't considered the 
is_generic example you mentioned. If some entries exist in 
HiveCatalog but not in the underlying Hive Metastore, I agree that a 
catalog-level cache is unsuitable.

I agree with approach #2, which is to use a HiveMetastoreClient-level cache, 
and I have updated the design doc accordingly. The core change is adding a 
cache in HiveMetastoreClientWrapper, as shown in the attached illustration.
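
To make the idea concrete, here is a minimal Java sketch of a client-level cache. It is not the actual patch and does not reproduce the real HiveMetastoreClientWrapper API: TableLoader, CachingTableLoader, the TTL parameter, the injected clock, and the String standing in for real table metadata are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the metastore lookup that gets wrapped. */
interface TableLoader {
    String loadTable(String dbName, String tableName);
}

/** Hypothetical client-level cache placed in front of a TableLoader. */
final class CachingTableLoader implements TableLoader {

    /** A cached value together with the time it was loaded. */
    private static final class Entry {
        final String table;
        final long loadedAtMillis;

        Entry(String table, long loadedAtMillis) {
            this.table = table;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final TableLoader delegate;
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested deterministically
    private final Map<List<String>, Entry> cache = new ConcurrentHashMap<>();

    CachingTableLoader(TableLoader delegate, long ttlMillis, LongSupplier clock) {
        this.delegate = delegate;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    @Override
    public String loadTable(String dbName, String tableName) {
        List<String> key = List.of(dbName, tableName);
        long now = clock.getAsLong();
        Entry cached = cache.get(key);
        // Serve from the cache while the entry is younger than the TTL.
        if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
            return cached.table;
        }
        // Miss or expired: ask the underlying client and remember the answer.
        // Two threads may both load the same key; the later put simply wins.
        String fresh = delegate.loadTable(dbName, tableName);
        cache.put(key, new Entry(fresh, now));
        return fresh;
    }

    /** Drops one entry, e.g. after the table is changed through this client. */
    void invalidate(String dbName, String tableName) {
        cache.remove(List.of(dbName, tableName));
    }
}
```

Because the cache sits below the catalog, entries that exist only in HiveCatalog never pass through it. A real wrapper would presumably also need to invalidate entries on its write paths (for example when a table is altered or dropped), which this sketch leaves to the caller.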

What do you think of this change? I look forward to your further suggestions.

!hms cache.jpg!

 

> Need a cached catalog for HiveCatalog
> -------------------------------------
>
>                 Key: FLINK-20416
>                 URL: https://issues.apache.org/jira/browse/FLINK-20416
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Common, Connectors / Hive, Table SQL / API, 
> Table SQL / Planner
>            Reporter: Sebastian Liu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: hms cache.jpg
>
>
> For OLAP scenarios, there are usually some analytical queries whose running 
> time is relatively short. These queries are also sensitive to latency. In the 
> current Blink SQL processing, the parse/validate/optimize stages all need 
> metadata from the catalog API, but each request to the catalog re-runs 
> the underlying metadata query. 
>  
> We may need a cached catalog that caches table schema and statistics 
> to avoid unnecessary repeated metadata requests. 
> Design doc: 
> [https://docs.google.com/document/d/1oL8HUpv2WaF6OkFvbH5iefXkOJB__Dal_bYsIZJA_Gk/edit?usp=sharing]
> I have submitted a related PR that adds a generic cached catalog, which can 
> delegate to other implementations of {{AbstractCatalog}}: 
> [https://github.com/apache/flink/pull/14260]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
