SHuixo commented on issue #6104: URL: https://github.com/apache/iceberg/issues/6104#issuecomment-1322093088
> Yes, we have to wait it to be merged.

Good, I'm looking forward to this new RocksDB feature being merged.

> Had a look about your exception log. The reason is the cdc contains a delete row, but the compaction for such case that contains delete files hasn't been supported.

Does this mean that, at this stage, there is no viable compaction scheme for CDC data streamed into Iceberg when the stream contains delete operations?

To check whether the **commit exception** seen in the streaming scenario also occurs in the batch scenario, we tried the following:

```
1. Start the Flink CDC streaming job that writes to Iceberg, with a checkpoint interval of 5 minutes;
2. While the writer is running, start the compaction program and let it run until a **commit exception** occurs;
3. When the **commit exception** occurs, stop both the CDC writer and the compaction program;
4. Start the compaction program again and wait until it is up and running.
```

> The following figure shows the start and end times when Flink CDC writes data to Iceberg:

<img width="941" alt="stream-write-iceberg" src="https://user-images.githubusercontent.com/20868410/203070483-d39c3107-ca61-4d31-bb30-7d63cf821697.PNG">

> Some of the logs are as follows:

[compact-data-when-stream-write.log](https://github.com/apache/iceberg/files/10056982/compact-data-when-stream-write.log)
[compact-data-when-write-finish.log](https://github.com/apache/iceberg/files/10056985/compact-data-when-write-finish.log)

One question: is it possible to pause the writer, run compaction once, and then resume writing from the checkpoint when compaction completes? In other words, could the commit exception above be handled by cycling through suspend, compact, and resume?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
