[jira] [Updated] (FLINK-20416) Need a cached catalog for batch SQL job

Sebastian Liu (Jira) Mon, 30 Nov 2020 00:46:34 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sebastian Liu updated FLINK-20416:
----------------------------------
    Description: 
For OLAP scenarios, there are usually some analytical queries whose running 
time is relatively short. These queries are also sensitive to latency. In the 
current Blink SQL processing, the parse/validate/optimize stages all need 
metadata from the catalog API, but each request to the catalog re-runs the 
underlying meta query. 

 

We may need a cached catalog that caches table schemas and statistics to 
avoid unnecessary repeated meta requests. 

I have submitted a related PR that adds a generic cached catalog, which can 
delegate to other implementations of {{AbstractCatalog}}.

{{[https://github.com/apache/flink/pull/14260]}}

  was:
For OLAP scenarios, there are usually some analytical queries whose running 
time is relatively short. These queries are also sensitive to latency. In the 
current Blink SQL processing, the parse/validate/optimize stages all need 
metadata from the catalog API, but each request to the catalog re-runs the 
underlying meta query. 

 

We may need a cached catalog that caches table schemas and statistics to 
avoid unnecessary repeated meta requests. 

I have submitted a related PR that adds a generic cached catalog, which can 
delegate to other implementations of __ {{AbstractCatalog}}.

{{{{https://github.com/apache/flink/pull/14260}}}}


> Need a cached catalog for batch SQL job
> ---------------------------------------
>
>                 Key: FLINK-20416
>                 URL: https://issues.apache.org/jira/browse/FLINK-20416
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API, Table SQL / Planner
>            Reporter: Sebastian Liu
>            Priority: Major
>              Labels: pull-request-available
>
> For OLAP scenarios, there are usually some analytical queries whose running 
> time is relatively short. These queries are also sensitive to latency. In the 
> current Blink SQL processing, the parse/validate/optimize stages all need 
> metadata from the catalog API, but each request to the catalog re-runs the 
> underlying meta query. 
>  
> We may need a cached catalog that caches table schemas and statistics to 
> avoid unnecessary repeated meta requests. 
> I have submitted a related PR that adds a generic cached catalog, which can 
> delegate to other implementations of {{AbstractCatalog}}.
> {{[https://github.com/apache/flink/pull/14260]}}
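
The delegating cache described above can be sketched roughly as follows. This 
is only a minimal illustration of the idea, not the code in the linked PR and 
not Flink's actual catalog API: MetaCatalog, CachingCatalog, getTableSchema 
and invalidate are hypothetical names made up for this example, and schemas 
are plain strings. A real implementation would also need some expiry or 
invalidation policy, since cached metadata can become stale.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedCatalogSketch {

    /** Hypothetical stand-in for a catalog; NOT Flink's Catalog interface. */
    interface MetaCatalog {
        String getTableSchema(String tablePath);
    }

    /** Wraps any MetaCatalog and remembers the schemas it has already fetched. */
    static final class CachingCatalog implements MetaCatalog {
        private final MetaCatalog delegate;
        private final Map<String, String> schemaCache = new ConcurrentHashMap<>();

        CachingCatalog(MetaCatalog delegate) {
            this.delegate = delegate;
        }

        @Override
        public String getTableSchema(String tablePath) {
            // Only the first lookup of a path reaches the underlying catalog.
            return schemaCache.computeIfAbsent(tablePath, delegate::getTableSchema);
        }

        /** Drops one cached entry, e.g. after the table has been altered. */
        void invalidate(String tablePath) {
            schemaCache.remove(tablePath);
        }
    }

    public static void main(String[] args) {
        AtomicInteger metaQueries = new AtomicInteger();

        // Simulated slow catalog that counts how many meta queries it receives.
        MetaCatalog slowCatalog = path -> {
            metaQueries.incrementAndGet();
            return "schema-of-" + path;
        };

        CachingCatalog cached = new CachingCatalog(slowCatalog);
        cached.getTableSchema("db.orders"); // e.g. during validation
        cached.getTableSchema("db.orders"); // e.g. during optimization
        System.out.println("meta queries issued: " + metaQueries.get()); // prints 1
    }
}
```

Because the wrapper implements the same interface as the catalog it wraps, the 
planner-side code would not need to know whether caching is enabled.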



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
