teamconfx commented on code in PR #27462:
URL: https://github.com/apache/flink/pull/27462#discussion_r2723443212
##########
flink-runtime/src/test/java/org/apache/flink/runtime/testutils/CommonTestUtils.java:
##########
@@ -70,6 +71,9 @@ public class CommonTestUtils {
private static final long RETRY_INTERVAL = 100L;
+ /** Default timeout for waiting on tasks to reach running state. */
+ public static final Duration DEFAULT_WAIT_FOR_TASKS_TIMEOUT = Duration.ofMinutes(5);
Review Comment:
I agree this would be a workaround. This may be quite a different case, though:
I found that if I set up a fault condition (e.g., a node restart) in some Flink
unit tests, the job gets lost and the wait hangs forever, as I described in
the JIRA.
One concrete example is this unit test:
[org.apache.flink.test.streaming.runtime.SinkMetricsITCase](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkMetricsITCase.java#L111).
If you inject a node restart (restart the TaskManager) between lines 110 and
111, the corrupted `jobID` makes the unit test hang forever.
I set the timeout to 5 minutes because I think that is a reasonable value for a
unit test: from what I see, most unit tests in the Flink suite finish within
this window.
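
To illustrate the idea of bounding the wait, here is a minimal sketch of a polling loop that gives up after a deadline instead of hanging. It is not the code from this PR: `WaitWithTimeoutSketch`, `waitUntil`, and `RETRY_INTERVAL_MS` are hypothetical names chosen for illustration, and the 100 ms interval only mirrors the `RETRY_INTERVAL` visible in the diff above.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Flink's CommonTestUtils.
final class WaitWithTimeoutSketch {

    // Assumed polling interval; mirrors the 100 ms RETRY_INTERVAL in the diff.
    private static final long RETRY_INTERVAL_MS = 100L;

    private WaitWithTimeoutSketch() {}

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Throws TimeoutException instead of waiting forever.
     */
    static void waitUntil(BooleanSupplier condition, Duration timeout)
            throws InterruptedException, TimeoutException {
        // System.nanoTime() is monotonic, so wall-clock changes do not affect the deadline.
        final long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            final long remainingNanos = deadlineNanos - System.nanoTime();
            if (remainingNanos <= 0) {
                throw new TimeoutException("Condition not met within " + timeout);
            }
            // Sleep for the retry interval, but never past the deadline.
            final long remainingMs = Math.max(1L, remainingNanos / 1_000_000L);
            Thread.sleep(Math.min(RETRY_INTERVAL_MS, remainingMs));
        }
    }
}
```

With a helper like this, a test whose job was lost after a restart would fail with a `TimeoutException` once the deadline passes (for example `Duration.ofMinutes(5)`), rather than blocking the whole test run.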
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]