This is an automated email from the ASF dual-hosted git repository.

pvary pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/main by this push:
     new e8cf33db7d Docs: Add note that snapshot expiration and cleanup orphan files could corrupt Flink job state (#9002)
e8cf33db7d is described below

commit e8cf33db7d3fc637504a51a801c055dce54474b7
Author: Rui Li <[email protected]>
AuthorDate: Wed Nov 8 19:40:02 2023 +0800

    Docs: Add note that snapshot expiration and cleanup orphan files could corrupt Flink job state (#9002)
---
 docs/flink-writes.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/docs/flink-writes.md b/docs/flink-writes.md
index 641fa09e3c..e078a82868 100644
--- a/docs/flink-writes.md
+++ b/docs/flink-writes.md
@@ -270,4 +270,13 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */
 ...
 ```
 
-Check out all the options here: [write-options](/flink-configuration#write-options)
\ No newline at end of file
+Check out all the options here: [write-options](/flink-configuration#write-options)
+
+## Notes
+
+Flink streaming write jobs rely on snapshot summary to keep the last committed checkpoint ID, and
+store uncommitted data as temporary files. Therefore, [expiring snapshots](../tables/maintenance#expire-snapshots)
+and [deleting orphan files](../tables/maintenance#delete-orphan-files) could possibly corrupt
+the state of the Flink job. To avoid that, make sure to keep the last snapshot created by the Flink
+job (which can be identified by the `flink.job-id` property in the summary), and only delete
+orphan files that are old enough.
\ No newline at end of file
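
The retention rule in the added note can be illustrated with a small, library-free sketch. It is not Iceberg's API: the `Snapshot` class and the `snapshots_to_expire` helper are hypothetical stand-ins, and the only name taken from the commit is the `flink.job-id` summary property. The sketch assumes snapshot metadata has already been read into memory and only decides which snapshot IDs would be safe to expire.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Summary property named in the note; everything else here is illustrative.
FLINK_JOB_ID = "flink.job-id"


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for the metadata of one table snapshot."""
    snapshot_id: int
    timestamp_ms: int
    summary: Dict[str, str] = field(default_factory=dict)


def snapshots_to_expire(snapshots: List[Snapshot], older_than_ms: int) -> List[int]:
    """Return IDs of snapshots older than the cutoff, except the most
    recent snapshot written by each Flink job (found via its summary)."""
    latest_per_job: Dict[str, Snapshot] = {}
    for snap in snapshots:
        job_id = snap.summary.get(FLINK_JOB_ID)
        if job_id is None:
            continue  # not written by a Flink job
        current = latest_per_job.get(job_id)
        if current is None or snap.timestamp_ms > current.timestamp_ms:
            latest_per_job[job_id] = snap

    protected = {snap.snapshot_id for snap in latest_per_job.values()}
    return [
        snap.snapshot_id
        for snap in snapshots
        if snap.timestamp_ms < older_than_ms and snap.snapshot_id not in protected
    ]


history = [
    Snapshot(1, 1_000, {FLINK_JOB_ID: "job-a"}),
    Snapshot(2, 2_000, {}),
    Snapshot(3, 3_000, {FLINK_JOB_ID: "job-a"}),
]
print(snapshots_to_expire(history, older_than_ms=10_000))  # [1, 2]
```

Snapshot 3 is kept even though it is older than the cutoff, because it is the last snapshot committed by `job-a`. A real maintenance job would apply the same idea through the table's own expiration mechanism, and would pair it with a conservative age threshold when deleting orphan files.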
