[ 
https://issues.apache.org/jira/browse/HUDI-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated HUDI-7354:
--------------------------------
    Attachment: flink-hudi.sql

> Flink Batch Read from Hudi table does not return any rows
> ---------------------------------------------------------
>
>                 Key: HUDI-7354
>                 URL: https://issues.apache.org/jira/browse/HUDI-7354
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink-sql
>    Affects Versions: 0.14.1
>            Reporter: Prabhu Joseph
>            Priority: Major
>         Attachments: cleanup.sql, flink-hudi.sql
>
>
> A Flink batch read from a Hudi table returns no rows. The same Flink SQL 
> script returns 8 rows, as expected, on Hudi 0.14.0.
> *Repro Steps*
> 1. Use Flink 1.18.1 and Hudi 0.14.1
> 2. Start a Flink YARN session
> {code}
> flink-yarn-session -d -D execution.checkpointing.interval=10s -D 
> state.checkpoint-storage=filesystem  -D 
> state.checkpoints.dir=s3://prabhuflinks3/test-output/flink/output/20eab3b1-d58a-491c-8819-15e451a549eb
> {code}
> 3. Place CSV Input Data
> {code}
> cat > data <<EOF
> 1,Danny,23
> 2,Stephen,33
> 3,Julian,53
> 4,Fabian,31
> 5,Sophia,18
> 6,Emma,20
> 7,Bob,44
> 8,Han,56
> EOF
> hadoop fs -mkdir -p 
> s3://prabhuflinks3/test-output/flink/output/8d007d79-913d-4ed4-a6e4-9af591f24c36/csvinput/
> hadoop fs -put data 
> s3://prabhuflinks3/test-output/flink/output/8d007d79-913d-4ed4-a6e4-9af591f24c36/csvinput/
> {code}
> 4. Run the attached Flink SQL script (flink-hudi.sql)
> {code}
> /usr/lib/flink/bin/sql-client.sh -f flink-hudi.sql
> {code}
> The script creates a Flink filesystem table over the 8-row CSV data. It then 
> creates a Hudi table and inserts the rows from the filesystem table into it. 
> Finally, it runs a SELECT query against the Hudi table. The SELECT query 
> returns no rows.
> 5. Clean up the tables and databases using cleanup.sql
> *Analysis*
> The SELECT and INSERT queries run concurrently. The SELECT query finishes 
> quickly because the Hudi table has no data yet. In Hudi 0.14.0, the SELECT 
> query waits until the data is loaded and then retrieves it.
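> The race described above might be avoided along the following lines. This is 
> a minimal sketch, assuming the script is submitted with sql-client.sh -f as 
> in step 4 and runs in batch mode. Flink's table.dml-sync option makes the SQL 
> client wait for each INSERT job to finish before it executes the next 
> statement. The table names csv_source and hudi_table are hypothetical and are 
> not taken from flink-hudi.sql. Whether this restores the 0.14.0 behavior on 
> 0.14.1 has not been verified here.
> {code}
> -- Assumption: the job runs in batch mode
> SET 'execution.runtime-mode' = 'batch';
> -- Block on each DML statement until its job completes (default: false)
> SET 'table.dml-sync' = 'true';
> -- Hypothetical table names; the real ones are defined in flink-hudi.sql
> INSERT INTO hudi_table SELECT * FROM csv_source;
> -- With dml-sync enabled, this runs only after the INSERT job has finished
> SELECT * FROM hudi_table;
> {code}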
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
