Myasuka commented on a change in pull request #8820: [FLINK-12916][tests] Retry
cancelWithSavepoint on cancellation barrier in AbstractOperatorRestoreTestBase
URL: https://github.com/apache/flink/pull/8820#discussion_r303517298
##########
File path:
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
##########
@@ -66,12 +72,16 @@
private static final int NUM_TMS = 1;
private static final int NUM_SLOTS_PER_TM = 4;
private static final Duration TEST_TIMEOUT = Duration.ofSeconds(10000L);
- private static final Pattern PATTERN_CANCEL_WITH_SAVEPOINT_TOLERATED_EXCEPTIONS = Pattern
- .compile(
- "(was not running)" +
- "|(Not all required tasks are currently running)" +
- "|(Checkpoint was declined \\(tasks not ready\\))"
- );
+
+ private static final Pattern PATTERN_CANCEL_WITH_SAVEPOINT_TOLERATED_EXCEPTIONS = Pattern.compile(
+ Stream.of(
+ TRIGGER_SAVEPOINT_FAILURE.message(),
Review comment:
IMO, it's okay to trigger the savepoint again when we hit such a general
failure while triggering a savepoint. This test targets migrating and restoring
with savepoints, not verifying why a savepoint failed. My previous commit set
the timeout to 300 seconds, but after I rebased onto the latest code, the
timeout went back to the previous 10000 seconds. I would like to tolerate this
error and change the default timeout to 300 seconds. What do you think?
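
To make the proposal concrete, here is a minimal sketch of the retry idea, not
the code from this PR. The class name, `retryUntil`, and the `pause` parameter
are hypothetical. The three messages are the ones from the removed lines in the
hunk above; the text behind `TRIGGER_SAVEPOINT_FAILURE.message()` is left out
because it is not shown here.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, for illustration only. */
public class RetryOnToleratedFailure {

    // Join the tolerated messages into one alternation. Pattern.quote escapes
    // regex metacharacters such as the parentheses in the last message.
    static final Pattern TOLERATED = Pattern.compile(
        Stream.of(
                "was not running",
                "Not all required tasks are currently running",
                "Checkpoint was declined (tasks not ready)")
            .map(Pattern::quote)
            .collect(Collectors.joining("|")));

    /**
     * Runs the action until it succeeds. A failure whose message matches
     * TOLERATED is retried after a pause; any other failure, or a tolerated
     * one after the timeout has elapsed, is rethrown to the caller.
     */
    static <T> T retryUntil(Callable<T> action, Duration timeout, Duration pause)
            throws Exception {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return action.call();
            } catch (Exception e) {
                // String.valueOf guards against exceptions without a message.
                boolean tolerated =
                    TOLERATED.matcher(String.valueOf(e.getMessage())).find();
                if (!tolerated || System.nanoTime() - deadline >= 0) {
                    throw e;
                }
                Thread.sleep(pause.toMillis());
            }
        }
    }
}
```

With this shape, the test would wrap its cancel-with-savepoint call in
`retryUntil(..., Duration.ofSeconds(300), ...)`, so a tolerated failure leads to
another attempt and the 300 second bound still fails the test if the job never
becomes ready.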
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services