[
https://issues.apache.org/jira/browse/FLINK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198227#comment-15198227
]
Ufuk Celebi commented on FLINK-3627:
------------------------------------
At line 317 of {{StreamSource.java}}, the manual watermark context tries to
acquire the {{lockingObject}} in {{collect()}}. I'm wondering which thread is
holding that lock. When this occurs again, could you gather a stack trace of
the task manager process running the stuck task via {{jstack PID}}? Maybe
[~aljoscha] has an idea who could be holding that lock.
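To illustrate the suspected mechanism, here is a minimal standalone sketch (not Flink code). It assumes the source thread is blocked entering a {{synchronized}} block on the lock: such a thread ignores {{Thread.interrupt()}}, which would match the "did not react to cancelling signal" warning. The sketch also asks the JVM which thread owns the monitor, which is the same information a {{jstack}} dump shows. The class name {{LockOwnerProbe}} and the thread names are hypothetical; {{lockingObject}} only mirrors the name used above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of Flink.
final class LockOwnerProbe {

    /** Name of the thread owning the monitor that `blocked` is waiting for, or null. */
    static String findLockOwner(Thread blocked) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(blocked.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    /** Returns { state of the blocked thread after interrupt(), name of the lock owner }. */
    static String[] demo() {
        final Object lockingObject = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Stands in for whichever thread holds the lock and does not let go.
        Thread holder = new Thread(() -> {
            synchronized (lockingObject) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException ignored) {
                    // fall through and release the monitor
                }
            }
        }, "lock-holder");

        // Stands in for the source thread calling collect().
        Thread source = new Thread(() -> {
            synchronized (lockingObject) {
                // a record would be emitted here
            }
        }, "source-thread");

        try {
            holder.start();
            held.await();
            source.start();
            while (source.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }

            // Mimics the cancel signal: interrupt does not abort monitor entry.
            source.interrupt();
            Thread.sleep(100);
            String stateAfterInterrupt = source.getState().name();
            String owner = findLockOwner(source);

            release.countDown();
            holder.join();
            source.join();
            return new String[] { stateAfterInterrupt, owner };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("state after interrupt: " + result[0]
                + ", lock owner: " + result[1]);
    }
}
```

In this sketch the blocked thread stays in the {{BLOCKED}} state after the interrupt, and the reported owner is the holder thread. In a real {{jstack}} dump the equivalent hint is the "waiting to lock <address>" line of the stuck thread paired with a "locked <address>" line in another thread's stack.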
> Task stuck on lock in StreamSource when cancelling
> --------------------------------------------------
>
> Key: FLINK-3627
> URL: https://issues.apache.org/jira/browse/FLINK-3627
> Project: Flink
> Issue Type: Bug
> Components: Core
> Reporter: Jamie Grier
> Labels: hang
>
> I've seen this occur a couple of times when the # of network buffers is set
> too low. The job fails with an appropriate message indicating that the
> user should increase the # of network buffers. However, some of the task
> threads then hang with a stack trace similar to the following.
> 2016-03-16 13:38:54,017 WARN org.apache.flink.runtime.taskmanager.Task
> - Task 'Source: EventGenerator -> (Flat Map, blah -> Filter ->
> Projection -> Flat Map -> Timestamps/Watermarks -> Map) (46/144)' did not
> react to cancelling signal, but is stuck in method:
>
> org.apache.flink.streaming.api.operators.StreamSource$ManualWatermarkContext.collect(StreamSource.java:317)
> flink.benchmark.generator.LoadGeneratorSource.run(LoadGeneratorSource.java:38)
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> java.lang.Thread.run(Thread.java:745)