[
https://issues.apache.org/jira/browse/FLINK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jiayi Liao reopened FLINK-5463:
-------------------------------
Reopen this because I met the problem on 1.9 when the disk of the machine is
broken. Here is the stack :
{code:java}
"GroupWindowAggregate -> SinkConversionToRow -> Sink_Kafka010TableSi
(1925/2000)- execution # 4" #1095 prio=5 os_prio=0 tid=0x00007f5a1db5c000
nid=0xc4b58 runnable [0x00007f5a237a2000]
java.lang.Thread.State: RUNNABLE
at org.rocksdb.RocksDB.disposeInternal(Native Method)
at org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
at
org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:57)
at org.apache.flink.util.IOUtils.closeQuietly(IOUtils.java:263)
at
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:349)
at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:371)
at
org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:372)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:618)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:517)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:733)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:539)
at java.lang.Thread.run(Thread.java:748)
{code}
> RocksDB.disposeInternal does not react to interrupts, blocks task cancellation
> ------------------------------------------------------------------------------
>
> Key: FLINK-5463
> URL: https://issues.apache.org/jira/browse/FLINK-5463
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.2.0
> Reporter: Robert Metzger
> Priority: Major
>
> I'm using Flink 699f4b0.
> My Flink job is slow while cancelling because RockDB seems to be busy with
> disposing its state:
> {code}
> 2017-01-11 18:48:23,315 INFO org.apache.flink.runtime.taskmanager.Task
> - Triggering cancellation of task code
> TriggerWindow(TumblingEventTimeWindows(4),
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071
> }, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440))
> (1/1) (2accc6ca2727c4f7ec963318fbd237e9).
> 2017-01-11 18:48:53,318 WARN org.apache.flink.runtime.taskmanager.Task
> - Task 'TriggerWindow(TumblingEventTimeWindows(4),
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
> EventTimeTrigger(), Windowed
> Stream.apply(AllWindowedStream.java:440)) (1/1)' did not react to cancelling
> signal, but is stuck in method:
> org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
> java.lang.Thread.run(Thread.java:745)
> 2017-01-11 18:48:53,319 WARN org.apache.flink.runtime.taskmanager.Task
> - Task 'TriggerWindow(TumblingEventTimeWindows(4),
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
> EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)'
> did not react to cancelling signal, but is stuck in method:
> org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
> java.lang.Thread.run(Thread.java:745)
> 2017-01-11 18:49:23,319 WARN org.apache.flink.runtime.taskmanager.Task
> - Task 'TriggerWindow(TumblingEventTimeWindows(4),
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
> EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)'
> did not react to cancelling signal, but is stuck in method:
> org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
> java.lang.Thread.run(Thread.java:745)
> 2017-01-11 18:49:50,080 INFO org.apache.flink.runtime.taskmanager.Task
> - Freeing task resources for
> TriggerWindow(TumblingEventTimeWindows(4),
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
> EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)
> (2accc6ca2727c4f7ec963318fbd237e9)
> {code}
> I'm filing this issue because I didn't see such a behavior in Flink 1.1. I
> guess Flink's code should be well behaved when it comes to cancelling tasks.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)