ayush-san commented on issue #2900:
URL: https://github.com/apache/iceberg/issues/2900#issuecomment-895960999
I ran the following maintenance procedure on my streaming table and the
metadata size was reduced considerably. Checkpoint time for this table also
came down to ~700 ms from 8-9 minutes previously.
```
Actions.forTable(table).rewriteDataFiles().targetSizeInBytes(256 * 1024 * 1024).execute();
spark.sql("CALL hive.system.rewrite_manifests('db_name.table_name')").show()
spark.sql("CALL hive.system.expire_snapshots(table => 'db_name.table_name', older_than => 1628428025000, retain_last => 5)").show()
spark.sql("CALL catalog_name.system.remove_orphan_files(table => 'db_name.table_name')").show()
```
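For reference, the `older_than` argument to `expire_snapshots` is a Unix epoch timestamp in milliseconds. A minimal sketch (plain Java, no Iceberg dependency) that decodes the literal used above and shows how a rolling cutoff is typically computed:

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class ExpireCutoff {
    public static void main(String[] args) {
        // The literal passed to older_than above, in epoch milliseconds.
        long literal = 1628428025000L;
        System.out.println(Instant.ofEpochMilli(literal)); // prints 2021-08-08T13:07:05Z

        // More commonly the cutoff is computed relative to now,
        // e.g. expire snapshots older than 7 days (illustrative retention window):
        long cutoff = Instant.now().minus(7, ChronoUnit.DAYS).toEpochMilli();
        System.out.println(cutoff);
    }
}
```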
However, running the expire_snapshots action leads to a bigger problem: the
Flink job can no longer resume from its checkpoint, due to
https://github.com/apache/iceberg/issues/2482
Error: `org.apache.iceberg.exceptions.ValidationException: Cannot determine
history between starting snapshot null and current 7571686194699158451`
@stevenzwu @rdblue Is there any plan to update the expire-snapshots action
implementation?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]