[ 
https://issues.apache.org/jira/browse/FLINK-20124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231425#comment-17231425
 ] 

Robert Metzger commented on FLINK-20124:
----------------------------------------

I ran a complicated, misbehaving job and found this problem with unaligned 
checkpoints: https://issues.apache.org/jira/browse/FLINK-20145

Otherwise, the job has been running fine for a few hours so far.

During cancellation, I stumbled across this log message, but I guess we cannot 
do anything about it: we are just printing the current stack of the task 
thread, and both times it happened to be inside Flink code:
{code}
2020-11-13 13:29:57,000 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Un-registering task and sending final execution state CANCELED to JobManager for task Flat Map (3/4)#0 079036e2a439aac3bed0063f3f8f6a2c.
2020-11-13 13:30:26,602 WARN  org.apache.flink.runtime.taskmanager.Task                    [] - Task 'Source: control events generator (3/4)#0' did not react to cancelling signal for 30 seconds, but is stuck in method:
 sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:146)
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:299)
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:184)
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:577)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:541)
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
java.lang.Thread.run(Thread.java:748)

2020-11-13 13:30:56,608 WARN  org.apache.flink.runtime.taskmanager.Task                    [] - Task 'Source: control events generator (3/4)#0' did not react to cancelling signal for 30 seconds, but is stuck in method:
 sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:620)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:554)
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
java.lang.Thread.run(Thread.java:748)
{code}
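
To illustrate why such a warning only reflects where the thread happened to be when it was sampled, here is a minimal sketch of a cancellation watchdog. It is not Flink's actual implementation: the class `CancellationWatchdogSketch` and the methods `sampleStack`, `warnIfStuck`, and `demo` are hypothetical names, and the 200 ms timeout stands in for the 30 seconds seen in the log. Only standard JDK calls (`Thread.getStackTrace`, `Thread.join`, `CountDownLatch`) are used.

```java
import java.util.concurrent.CountDownLatch;

public class CancellationWatchdogSketch {

    /** Formats the stack of the given thread as it is at the moment of the call. */
    static String sampleStack(Thread thread) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append(frame).append('\n');
        }
        return sb.toString();
    }

    /**
     * Waits up to timeoutMillis (must be > 0) for the thread to exit.
     * Returns null if it exited, otherwise a warning that contains the
     * thread's current stack. The stack is a point-in-time sample only.
     */
    static String warnIfStuck(Thread thread, long timeoutMillis) {
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and fall through.
            Thread.currentThread().interrupt();
        }
        if (!thread.isAlive()) {
            return null;
        }
        return "Thread '" + thread.getName() + "' did not react to cancelling signal for "
                + timeoutMillis + " ms, but is stuck in method:\n" + sampleStack(thread);
    }

    /** Starts a worker that swallows interrupts, cancels it, and returns the warning. */
    static String demo() {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            boolean released = false;
            while (!released) {
                try {
                    release.await();
                    released = true;
                } catch (InterruptedException ignored) {
                    // Swallow the interrupt, like a task that ignores cancellation.
                }
            }
        }, "stuck-worker");
        worker.setDaemon(true);
        worker.start();
        worker.interrupt();                      // the "cancelling signal"
        String warning = warnIfStuck(worker, 200);
        release.countDown();                     // let the worker finish
        return warning;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the reported frames are whatever the worker is executing when the timeout expires, so two consecutive warnings for the same thread can show different stacks, as in the log above.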

> Test pipelined region scheduler
> -------------------------------
>
>                 Key: FLINK-20124
>                 URL: https://issues.apache.org/jira/browse/FLINK-20124
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>            Priority: Critical
>             Fix For: 1.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
