[
https://issues.apache.org/jira/browse/FLINK-22368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325641#comment-17325641
]
Arvid Heise edited comment on FLINK-22368 at 4/20/21, 9:30 AM:
---------------------------------------------------------------
The test doesn't finish because checkpointing gets stuck in the last execution
attempt (5):
{noformat}
23:02:26,104 [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job Flink Streaming Job (5d70bcb288d90589845e39c2953b27c3) switched from state RESTARTING to RUNNING.
23:02:26,118 [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: source (2/20) (2d3357d530b11041d123bde87da7584b) switched from INITIALIZING to RUNNING.
... (in total all 100 tasks are RUNNING)
23:02:26,347 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - failing-map (10/20) (23870b8b94e5ea774ca3da72a7ca7251) switched from INITIALIZING to RUNNING.
...
23:02:27,165 [ Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Failed to trigger checkpoint for job 5d70bcb288d90589845e39c2953b27c3 since some tasks of job 5d70bcb288d90589845e39c2953b27c3 has been finished, abort the checkpoint Failure reason: Not all required tasks are currently running.
23:02:28,165 [ Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Failed to trigger checkpoint for job 5d70bcb288d90589845e39c2953b27c3 since some tasks of job 5d70bcb288d90589845e39c2953b27c3 has been finished, abort the checkpoint Failure reason: Not all required tasks are currently running.
... (in total 10k failed to trigger...)
01:55:56,165 [ Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Failed to trigger checkpoint for job 5d70bcb288d90589845e39c2953b27c3 since some tasks of job 5d70bcb288d90589845e39c2953b27c3 has been finished, abort the checkpoint Failure reason: Not all required tasks are currently running.
{noformat}
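As a rough sanity check on the "10k" figure, the first and last "Failed to
trigger checkpoint" timestamps in the excerpt can be compared. The sketch below
is not part of the test or of Flink; the helper name seconds_between is made up
for illustration, and the one-attempt-per-second rate is only an assumption
inferred from the two consecutive entries at 23:02:27,165 and 23:02:28,165.

```python
from datetime import datetime, timedelta


def seconds_between(first: str, last: str) -> int:
    """Whole seconds between two HH:MM:SS,mmm log timestamps.

    Assumes at most one midnight rollover between the two entries,
    since the log excerpt carries no date.
    """
    fmt = "%H:%M:%S,%f"
    start = datetime.strptime(first, fmt)
    end = datetime.strptime(last, fmt)
    if end < start:
        # The last entry is past midnight relative to the first one.
        end += timedelta(days=1)
    return int((end - start).total_seconds())


# First and last failed-trigger timestamps taken from the excerpt above.
elapsed = seconds_between("23:02:27,165", "01:55:56,165")

# Assumption: one trigger attempt per second, counting both endpoints.
attempts = elapsed + 1

print(f"elapsed seconds: {elapsed}, estimated attempts: {attempts}")
```

Under that assumption the attempt stayed stuck for just under three hours,
which is in line with the roughly 10k failed triggers noted in the log.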
> UnalignedCheckpointITCase hangs on azure
> ----------------------------------------
>
> Key: FLINK-22368
> URL: https://issues.apache.org/jira/browse/FLINK-22368
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.13.0
> Reporter: Dawid Wysakowicz
> Priority: Critical
> Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16818&view=logs&j=b0a398c0-685b-599c-eb57-c8c2a771138e&t=d13f554f-d4b9-50f8-30ee-d49c6fb0b3cc&l=10144