bkahloon commented on issue #2172: URL: https://github.com/apache/iceberg/issues/2172#issuecomment-771296624
@openinx just a follow-up question. I ingested the data using the Flink CDC DataStream API and then used Iceberg's FlinkSink to write to the Iceberg table. However, I can't figure out this behaviour: the application reads all the rows in the source DB, but nothing shows up in the Iceberg table until I cancel the job (it's as if the data only gets committed and appears in S3 once I cancel). I checked for backpressure in the job and there was none.

From reading the IcebergFilesCommitter, it seems that Iceberg commits files on checkpoints? (Please correct me if I'm wrong, I didn't go through the entire implementation.) I then enabled checkpoints at an interval of 30 seconds and still got the same result. Will FlinkSink wait until it reaches Iceberg's default 128 MB Parquet file size before it writes out the file?
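For context, here's roughly how the job is wired up. This is a minimal sketch rather than my exact code: the table location, the `id` equality column, and the `buildCdcSource` helper are placeholders, and the 30-second checkpoint interval matches what I described above.

```java
import java.util.Collections;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.RowData;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.sink.FlinkSink;

public class CdcToIcebergJob {

  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // IcebergFilesCommitter commits data files when a checkpoint completes,
    // so checkpointing must be enabled for rows to become visible in the table.
    env.enableCheckpointing(30_000L); // 30-second checkpoint interval

    // Placeholder: in the real job this is the Flink CDC DataStream source.
    DataStream<RowData> cdcStream = buildCdcSource(env);

    // Placeholder table location; the real table lives in S3.
    TableLoader tableLoader = TableLoader.fromHadoopTable("s3://bucket/warehouse/db/table");

    FlinkSink.forRowData(cdcStream)
        .tableLoader(tableLoader)
        .equalityFieldColumns(Collections.singletonList("id")) // placeholder primary key for CDC upserts
        .append();

    env.execute("cdc-to-iceberg");
  }

  private static DataStream<RowData> buildCdcSource(StreamExecutionEnvironment env) {
    // Placeholder for the actual Flink CDC source.
    throw new UnsupportedOperationException("replace with the real CDC source");
  }
}
```

With this setup I'd expect a commit (and new data files in S3) after every successful checkpoint, rather than only when the job is cancelled.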
