[
https://issues.apache.org/jira/browse/HUDI-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit updated HUDI-3981:
------------------------------
Fix Version/s: 0.13.0
(was: 0.12.0)
> [UMBRELLA] Flink engine support for comprehensive schema evolution(RFC-33)
> --------------------------------------------------------------------------
>
> Key: HUDI-3981
> URL: https://issues.apache.org/jira/browse/HUDI-3981
> Project: Apache Hudi
> Issue Type: New Feature
> Components: flink
> Reporter: Alexander Trushev
> Assignee: Alexander Trushev
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.13.0
>
>
> h3. Context
> Currently, the Flink engine does not support the schema evolution described
> in RFC-33.
> Example 1. Assume Spark changes the type of a column:
> {code:sql}
> set hoodie.schema.on.read.enable=true
> create table t1 (id int, val int, par string) ... partitioned by (par)
> insert into t1 values (1, 10, 'p1')
> alter table t1 alter column val type string
> insert into t1 values (2, 'val20', 'p2')
> {code}
> When Flink then tries to read t1:
> {code:sql}
> create table t1 (id int, val string, par string) partitioned by (par) with
> (...)
> select * from t1
> {code}
> the following error occurs:
> {noformat}
> java.lang.IllegalArgumentException: Unexpected type: INT32
> {noformat}
> This is just one example; the errors may differ in the case of
> cow/mor/snapshot/incremental/batch/streaming/rename column/add column.
> Also, it is not yet possible to write data after the schema has been changed.
> Example 2. The sequence below leads to errors:
> {noformat}
> flink: write data
> flink: stop job
> spark: modify schema according to RFC-33
> flink: new job with modified schema
> flink: write data
> {noformat}
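> The sequence in Example 2 might look roughly like the following sketch. It
> reuses table t1 and the type change from Example 1 as an assumed
> illustration; the connector options stay elided as above, and the values
> are hypothetical.
> {code:sql}
> -- flink: first job writes with the original schema
> create table t1 (id int, val int, par string) partitioned by (par) with
> (...)
> insert into t1 values (1, 10, 'p1')
> -- flink: stop job
> -- spark: modify schema according to RFC-33 (assumed change, as in Example 1)
> set hoodie.schema.on.read.enable=true
> alter table t1 alter column val type string
> -- flink: new job declares the modified schema and writes again
> create table t1 (id int, val string, par string) partitioned by (par) with
> (...)
> insert into t1 values (2, 'val20', 'p2')
> {code}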
> h3. Proposal
> Provide full support in the Flink engine for schemas modified according to
> RFC-33 (add column, rename column, change type of column, drop column) for:
> # batch/streaming
> # mor (snapshot/incremental/optimized) read/write
> # cow (snapshot/incremental) read/write
> # mor compaction
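> For reference, the four kinds of schema change listed above could be issued
> from Spark roughly as in the sketch below. It reuses t1 from Example 1; the
> column names ext and ext_renamed are hypothetical, and the exact syntax
> accepted may vary with the Spark and Hudi versions in use.
> {code:sql}
> set hoodie.schema.on.read.enable=true
> -- add column (ext is a hypothetical name)
> alter table t1 add columns (ext string)
> -- rename column
> alter table t1 rename column ext to ext_renamed
> -- change type of column (same change as in Example 1)
> alter table t1 alter column val type string
> -- drop column
> alter table t1 drop column ext_renamed
> {code}
> After any of these changes, Flink should be able to read and write t1 in
> each of the modes listed above.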