[ 
https://issues.apache.org/jira/browse/HUDI-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-7930:
-----------------------------
    Fix Version/s: 1.0.0
                       (was: 1.1.0)

> Flink Support for Array of Row and Map of Row value
> ---------------------------------------------------
>
>                 Key: HUDI-7930
>                 URL: https://issues.apache.org/jira/browse/HUDI-7930
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>    Affects Versions: 0.14.1
>            Reporter: David Perkins
>            Assignee: Danny Chen
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.0.0, 0.15.1
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> I have run into an issue with tables that have an array of rows in Flink. I 
> am able to write data, but after compaction, reads produce this exception:
> {{java.lang.RuntimeException: Unsupported type in the list: optional binary 
> item1 (STRING)}}
> The error only occurs after a compaction has run and produced Parquet files. 
> I'm using Hudi 0.14.1 and Flink 1.17.2, writing to Azure ADLS. I have tried 
> both 'Merge on Read' and 'Copy on Write' tables.
> Steps to reproduce the error:
> 1. Create a table with an array of rows
> {{CREATE temporary TABLE TestTable (}}
> {{rowId STRING NOT NULL,}}
> {{myArray ARRAY< ROW< item1 STRING, item2 STRING > >}}
> {{) WITH (}}
> {{'connector' = 'hudi',}}
> {{'path' = 'abfs://<container>@<storage_account>.dfs.core.windows.net/hudi/testtable',}}
> {{'table.type' = 'MERGE_ON_READ',}}
> {{'write.batch.size' = '1',}}
> {{'hoodie.compact.inline' = 'true',}}
> {{'hoodie.compact.inline.max.delta.commits' = '1',}}
> {{'compaction.async.enabled' = 'false',}}
> {{'compaction.delta_commits' = '1',}}
> {{'hoodie.datasource.write.recordkey.field' = 'rowId'}}
> {{);}}
> 2. Insert some data
> {{insert into TestTable values}}
> {{('1', ARRAY[ROW('1.item1', '1.item2')]),}}
> {{('2', ARRAY[ROW('2.item1', '2.item2')]),}}
> {{('3', ARRAY[ROW('3.item1', '3.item2')]),}}
> {{('4', ARRAY[ROW('4.item1', '4.item2')]),}}
> {{('5', ARRAY[ROW('5.item1', '5.item2')]),}}
> {{('6', ARRAY[ROW('6.item1', '6.item2')]),}}
> {{('7', ARRAY[ROW('7.item1', '7.item2')]),}}
> {{('8', ARRAY[ROW('8.item1', '8.item2')]),}}
> {{('9', ARRAY[ROW('9.item1', '9.item2')]),}}
> {{('10', ARRAY[ROW('10.item1', '10.item2')])}}
> {{;}}
> 3. Query
> {{Select * from TestTable;}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
