[
https://issues.apache.org/jira/browse/FLINK-26388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Pohl updated FLINK-26388:
----------------------------------
Labels: release-testing (was: )
> Release Testing: Repeatable Cleanup (FLINK-25433)
> -------------------------------------------------
>
> Key: FLINK-26388
> URL: https://issues.apache.org/jira/browse/FLINK-26388
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Matthias Pohl
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.15.0
>
>
> Repeatable cleanup was introduced with
> [FLIP-194|https://issues.apache.org/jira/projects/FLINK/issues/FLINK-26284?filter=allopenissues]
> but, from a user's point of view, should be considered a feature independent
> of the {{JobResultStore}} (JRS). The documentation efforts are finalized
> with FLINK-26296.
> Repeatable cleanup can be triggered by running into an error during cleanup.
> This can be provoked by disabling access to S3 after the job has finished,
> e.g.:
> * Set a reasonably short checkpointing interval (checkpointing must be
> enabled so that there are S3 artifacts to clean up)
> * Disable S3 access (remove permissions or shut down the S3 server)
> * Stop the job with a savepoint
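> The steps above can be sketched as a configuration excerpt plus a CLI call
> (bucket names, the savepoint path, and the job ID are placeholders, not
> taken from this ticket):
>
> ```
> # flink-conf.yaml (excerpt) -- enable checkpointing against S3:
> #   execution.checkpointing.interval: 10s
> #   state.checkpoints.dir: s3://test-bucket/checkpoints
> #
> # Once the job is running, revoke S3 access (or shut down the S3 server),
> # then stop the job with a savepoint:
> ./bin/flink stop --savepointPath s3://test-bucket/savepoints <jobId>
> ```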
> Stopping the job should succeed, but the logs should show the cleanup
> failing with repeated retries. Re-enabling S3 access should resolve the
> issue.
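> The retry behavior observed in the logs is configurable. A minimal sketch,
> assuming the cleanup-strategy option names as documented for Flink 1.15
> (by default, cleanup is retried indefinitely with exponential backoff):
>
> ```
> # flink-conf.yaml (excerpt) -- retry behavior of repeatable cleanup
> cleanup-strategy: exponential-delay
> cleanup-strategy.exponential-delay.initial-backoff: 1s
> cleanup-strategy.exponential-delay.max-backoff: 1h
> ```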
> Keep in mind that when testing this with HA enabled, you should use a
> separate bucket for the file-based JRS artifacts and only change the
> permissions of the bucket that holds the JRS-unrelated artifacts. Flink
> fails fatally if the JRS cannot access its backend storage.
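> A minimal sketch of such a split-bucket HA setup (bucket names are
> placeholders; by default the JRS stores its artifacts under
> {{high-availability.storageDir}}, and the path can be pinned explicitly via
> {{job-result-store.storage-path}}):
>
> ```
> # flink-conf.yaml (excerpt) -- keep JRS artifacts in a bucket whose
> # permissions stay untouched during the test
> high-availability: zookeeper
> high-availability.storageDir: s3://jrs-bucket/ha        # JRS artifacts live here
> state.checkpoints.dir: s3://test-bucket/checkpoints     # revoke access only here
> ```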
--
This message was sent by Atlassian Jira
(v8.20.1#820001)