Till Rohrmann created FLINK-11631:
-------------------------------------
Summary: TaskExecutorITCase#testJobReExecutionAfterTaskExecutorTermination unstable on Travis
Key: FLINK-11631
URL: https://issues.apache.org/jira/browse/FLINK-11631
Project: Flink
Issue Type: Bug
Components: Distributed Coordination, Tests
Affects Versions: 1.8.0
Reporter: Till Rohrmann
The test {{TaskExecutorITCase#testJobReExecutionAfterTaskExecutorTermination}} is
unstable on Travis. It fails with:
{code}
16:12:04.644 [ERROR] testJobReExecutionAfterTaskExecutorTermination(org.apache.flink.runtime.taskexecutor.TaskExecutorITCase) Time elapsed: 1.257 s <<< ERROR!
org.apache.flink.util.FlinkException: Could not close resource.
	at org.apache.flink.runtime.taskexecutor.TaskExecutorITCase.teardown(TaskExecutorITCase.java:83)
Caused by: org.apache.flink.util.FlinkException: Error while shutting the TaskExecutor down.
Caused by: org.apache.flink.util.FlinkException: Could not properly shut down the TaskManager services.
Caused by: java.lang.IllegalStateException: NetworkBufferPool is not empty after destroying all LocalBufferPools
{code}
https://api.travis-ci.org/v3/job/493221318/log.txt
The problem seems to be caused by the {{TaskExecutor}} not properly waiting for
the termination of all running {{Tasks}}. This creates a race condition in which
some buffers have not yet been returned to the {{BufferPool}} when the shutdown
check runs.
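
The ordering that would avoid this race can be sketched roughly as follows. This is not Flink code: {{ShutdownRaceSketch}}, {{CountingPool}} and {{runAndShutDown}} are hypothetical names, and the pool only counts outstanding buffers. The sketch assumes each task returns its buffer when it terminates, so the shutdown path waits on every task's termination future before it checks that the pool is empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownRaceSketch {

    /** Hypothetical stand-in for a buffer pool; it only counts outstanding buffers. */
    static final class CountingPool {
        private final AtomicInteger outstanding = new AtomicInteger();

        void requestBuffer() { outstanding.incrementAndGet(); }

        void recycleBuffer() { outstanding.decrementAndGet(); }

        int outstanding() { return outstanding.get(); }

        /** Fails if buffers are still checked out, similar in spirit to the reported error. */
        void destroy() {
            int left = outstanding.get();
            if (left != 0) {
                throw new IllegalStateException("pool is not empty: " + left + " buffer(s) outstanding");
            }
        }
    }

    /** Starts numTasks tasks, cancels them, waits for their termination, then destroys the pool. */
    static int runAndShutDown(int numTasks) {
        CountingPool pool = new CountingPool();
        CountDownLatch cancel = new CountDownLatch(1);
        List<CompletableFuture<Void>> terminations = new ArrayList<>();

        for (int i = 0; i < numTasks; i++) {
            pool.requestBuffer(); // each task holds one buffer while it runs
            terminations.add(CompletableFuture.runAsync(() -> {
                try {
                    cancel.await(); // the task "runs" until it is cancelled
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.recycleBuffer(); // the buffer is returned only on task termination
                }
            }));
        }

        cancel.countDown(); // request cancellation of all tasks

        // The important step: wait until every task has terminated.
        // Calling pool.destroy() before this join would race with recycleBuffer().
        CompletableFuture.allOf(terminations.toArray(new CompletableFuture[0])).join();

        pool.destroy();
        return pool.outstanding();
    }

    public static void main(String[] args) {
        System.out.println("outstanding buffers after shutdown: " + runAndShutDown(4));
    }
}
```

If the {{join()}} call is removed, {{destroy()}} may run while some tasks still hold buffers and fail intermittently, which is the kind of instability described above.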