[ 
https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272769#comment-17272769
 ] 

Sebastian Liu commented on FLINK-20416:
---------------------------------------

Hi [~jark], [~lirui],

Thanks a lot for the more elegant design mentioned above. It's cleaner and well 
layered. 

I propose changing the original HiveMetastoreClientWrapper class into an 
interface that abstracts its current basic methods, and adding two 
implementations: CachingHiveMetastoreClient and DefaultHiveMetastoreClient. 
DefaultHiveMetastoreClient would hold the logic of the original 
HiveMetastoreClientWrapper class. This keeps the original client logic clean 
and increases flexibility. I have also updated the related design doc. 
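
To make the layering concrete, here is a rough sketch of how the pieces could 
fit together. It is only an illustration under simplifying assumptions, not the 
code from the PR or the design doc: the two-method interface, the String used 
in place of Hive's Table type, the remoteCalls counter, and the unbounded 
ConcurrentHashMap cache (no TTL or size limit) are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, trimmed-down version of the proposed interface. The real
// wrapper exposes many more methods and uses Hive's own types.
interface HiveMetastoreClientWrapper {
    String getTable(String databaseName, String tableName);

    void dropTable(String databaseName, String tableName);
}

// Stand-in for the existing wrapper logic: every call reaches the metastore.
final class DefaultHiveMetastoreClient implements HiveMetastoreClientWrapper {
    // Counts simulated round trips so the effect of caching is observable.
    final AtomicInteger remoteCalls = new AtomicInteger();

    @Override
    public String getTable(String databaseName, String tableName) {
        remoteCalls.incrementAndGet();
        return "schema of " + databaseName + "." + tableName;
    }

    @Override
    public void dropTable(String databaseName, String tableName) {
        remoteCalls.incrementAndGet();
    }
}

// Decorator that answers repeated reads from memory and forwards the rest.
final class CachingHiveMetastoreClient implements HiveMetastoreClientWrapper {
    private final HiveMetastoreClientWrapper delegate;
    private final Map<String, String> tableCache = new ConcurrentHashMap<>();

    CachingHiveMetastoreClient(HiveMetastoreClientWrapper delegate) {
        this.delegate = delegate;
    }

    private static String key(String databaseName, String tableName) {
        return databaseName + "." + tableName;
    }

    @Override
    public String getTable(String databaseName, String tableName) {
        // Only the first lookup for a given table hits the delegate.
        return tableCache.computeIfAbsent(
                key(databaseName, tableName),
                k -> delegate.getTable(databaseName, tableName));
    }

    @Override
    public void dropTable(String databaseName, String tableName) {
        // Writes always go through, then the stale entry is invalidated.
        delegate.dropTable(databaseName, tableName);
        tableCache.remove(key(databaseName, tableName));
    }
}
```

With this shape, callers depend only on the interface, so switching between 
the cached and uncached client is a matter of which implementation gets 
constructed. A real cache would also need an expiry policy, since tables can 
be changed by other clients of the metastore.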

What do you think? Looking forward to further discussion.

!hms cache.jpg!

> Need a cached catalog for HiveCatalog
> -------------------------------------
>
>                 Key: FLINK-20416
>                 URL: https://issues.apache.org/jira/browse/FLINK-20416
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Common, Connectors / Hive, Table SQL / API, 
> Table SQL / Planner
>            Reporter: Sebastian Liu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: hms cache.jpg, hms cache.jpg
>
>
> For OLAP scenarios, there are usually some analytical queries whose running 
> time is relatively short. These queries are also sensitive to latency. In the 
> current Blink SQL processing, the parse/validate/optimize stages all need 
> metadata from the catalog API, but each request to the catalog re-runs 
> the underlying metadata query. 
>  
> We may need a cached catalog which can cache the table schema and statistics 
> to avoid unnecessary repeated metadata requests. 
> Design 
> doc:[https://docs.google.com/document/d/1oL8HUpv2WaF6OkFvbH5iefXkOJB__Dal_bYsIZJA_Gk/edit?usp=sharing]
> I have submitted a related PR adding a generic cached catalog, which can 
> delegate to other implementations of {{AbstractCatalog}}: 
> {{[https://github.com/apache/flink/pull/14260]}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
