[jira] [Closed] (FLINK-29756) Support materialized column to improve query performance for complex types

Jingsong Lee (Jira) Tue, 28 Mar 2023 18:52:05 -0700


     [ https://issues.apache.org/jira/browse/FLINK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jingsong Lee closed FLINK-29756.
--------------------------------
    Resolution: Fixed

https://github.com/apache/incubator-paimon/issues/735

> Support materialized column to improve query performance for complex types
> --------------------------------------------------------------------------
>
>                 Key: FLINK-29756
>                 URL: https://issues.apache.org/jira/browse/FLINK-29756
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>    Affects Versions: table-store-0.3.0
>            Reporter: Nicholas Jiang
>            Priority: Minor
>             Fix For: table-store-0.4.0
>
>
> In data warehousing, it is very common to read one or more subfields from a 
> complex-typed column such as a map, or to pack many subfields into one. These 
> operations can greatly degrade query performance because:
>  # They waste I/O. For example, if a field of type Map contains dozens of 
> subfields, the entire column must be read even when only one subfield is 
> needed, and Spark traverses the entire map to get the value of the target key.
>  # Vectorized reads cannot be used when reading nested-type columns.
>  # Filter pushdown cannot be used when reading nested columns.
> It is necessary to introduce the materialized column feature in Flink Table 
> Store, which transparently solves the above problems for arbitrary columnar 
> storage formats (not just Parquet).
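
The I/O argument in the description can be illustrated with a small, self-contained sketch. It is not the Table Store implementation or API: the function names (`materialize`, `scan_map`, `scan_materialized`), the `props` column, and the key `k3` are all hypothetical, and "cost" is simply a count of values touched, assuming the map column must be read in full for every row.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Dict[str, int]]


def materialize(rows: List[Row], map_col: str, key: str) -> List[Optional[int]]:
    """Extract map_col[key] once, at write time, into a flat column (hypothetical)."""
    return [row[map_col].get(key) for row in rows]


def scan_map(rows: List[Row], map_col: str, key: str,
             threshold: int) -> Tuple[List[int], int]:
    """Baseline: each map is read in full, then probed for a single key."""
    values_read = 0
    hits = []
    for i, row in enumerate(rows):
        m = row[map_col]
        values_read += len(m)  # assumption: the whole map is deserialized
        v = m.get(key)
        if v is not None and v > threshold:
            hits.append(i)
    return hits, values_read


def scan_materialized(column: List[Optional[int]],
                      threshold: int) -> Tuple[List[int], int]:
    """With a materialized column, only one flat value per row is read."""
    hits = [i for i, v in enumerate(column) if v is not None and v > threshold]
    return hits, len(column)


# Five rows, each carrying a 12-entry map; the query filters on props['k3'] > 6.
rows = [{"props": {f"k{j}": i * j for j in range(12)}} for i in range(5)]
props_k3 = materialize(rows, "props", "k3")

hits_map, cost_map = scan_map(rows, "props", "k3", 6)
hits_mat, cost_mat = scan_materialized(props_k3, 6)
print(hits_map, cost_map)
print(hits_mat, cost_mat)
```

Both scans select the same rows, but the flat column touches one value per row instead of the whole map. Because the materialized column is an ordinary primitive column, it is also the kind of column for which vectorized reads and filter pushdown are generally available.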



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
