[ https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-20416:
-----------------------------------
Labels: auto-deprioritized-major auto-deprioritized-minor auto-unassigned
pull-request-available (was: auto-deprioritized-major auto-unassigned
pull-request-available stale-minor)
Priority: Not a Priority (was: Minor)
This issue was labeled "stale-minor" 7 days ago and has not received any
updates so it is being deprioritized. If this ticket is actually Minor, please
raise the priority and ask a committer to assign you the issue or revive the
public discussion.
> Need a cached catalog for HiveCatalog
> -------------------------------------
>
> Key: FLINK-20416
> URL: https://issues.apache.org/jira/browse/FLINK-20416
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Common, Connectors / Hive, Table SQL /
> Ecosystem
> Reporter: Sebastian Liu
> Priority: Not a Priority
> Labels: auto-deprioritized-major, auto-deprioritized-minor,
> auto-unassigned, pull-request-available
> Attachments: hms cache.jpg, hms cache.jpg
>
>
> For OLAP scenarios, there are usually analytical queries whose running time
> is relatively short and which are sensitive to latency. In the current Blink
> SQL processing, the parse, validate, and optimize stages all need metadata
> from the catalog API, but each request to the catalog re-runs the underlying
> metadata query.
>
> We may need a cached catalog that caches table schemas and statistics to
> avoid unnecessary repeated metadata requests.
> Design doc:
> [https://docs.google.com/document/d/1oL8HUpv2WaF6OkFvbH5iefXkOJB__Dal_bYsIZJA_Gk/edit?usp=sharing]
> I have submitted a related PR that adds a generic cached catalog, which can
> delegate to other implementations of {{AbstractCatalog}}:
> {{[https://github.com/apache/flink/pull/14260]}}
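
For readers skimming the thread, the read-through caching idea described above
can be sketched roughly as follows. This is only an illustration and not the
code from the linked PR: the class name TtlMetaCache, the loader function, the
time-to-live policy, and the injectable nanosecond clock are all assumptions
made for this example, and nothing here is Flink API. In a real cached catalog
the loader would stand in for a delegated metadata call on the wrapped catalog.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache with a time-to-live; not part of Flink. */
final class TtlMetaCache<K, V> {

    /** A cached value together with the time it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtNanos;

        Entry(V value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // stands in for the expensive metadata query
    private final long ttlNanos;
    private final LongSupplier nanoClock;  // injectable so expiry can be tested

    TtlMetaCache(Function<K, V> loader, Duration ttl, LongSupplier nanoClock) {
        this.loader = loader;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    /** Returns the cached value, reloading it if it is missing or older than the TTL. */
    V get(K key) {
        long now = nanoClock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && now - old.loadedAtNanos < ttlNanos)
                        ? old
                        : new Entry<>(loader.apply(k), now));
        return entry.value;
    }

    /** Drops a key, e.g. after the table it describes was altered or dropped. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

A wrapper catalog built this way would need to call invalidate on every write
path (create, alter, drop) so that stale schemas are not served; how long a TTL
is acceptable depends on how often the underlying metastore changes out of band.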
--
This message was sent by Atlassian Jira
(v8.20.10#820010)