Piotr Nowojski created FLINK-8443:
-------------------------------------
Summary: YARNSessionCapacitySchedulerITCase is flaky
Key: FLINK-8443
URL: https://issues.apache.org/jira/browse/FLINK-8443
Project: Flink
Issue Type: Bug
Components: YARN
Affects Versions: 1.5.0
Reporter: Piotr Nowojski
Attachments: 35.5.tar.gz
The build logs from Travis are attached.
At least one test fails with:
{noformat}
java.lang.AssertionError: Found a file /home/travis/build/dataArtisans/flink/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-logDir-nm-1_0/application_1516120275777_0003/container_1516120275777_0003_01_000002/taskmanager.log with a prohibited string (one of [Exception, Started [email protected]:8081]). Excerpts
{noformat}
The YARN logs uploaded to transfer.sh show the following failure in taskmanager.log:
{code:java}
2018-01-16 16:32:10,553 INFO org.apache.flink.yarn.YarnTaskManager - Stopping TaskManager with final application status SUCCEEDED and diagnostics: Flink YARN Client requested shutdown
2018-01-16 16:32:10,577 INFO org.apache.flink.yarn.YarnTaskManager - Stopping TaskManager akka://flink/user/taskmanager#2122015748.
2018-01-16 16:32:10,578 INFO org.apache.flink.yarn.YarnTaskManager - Disassociating from JobManager
2018-01-16 16:32:10,588 INFO org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
2018-01-16 16:32:10,599 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
2018-01-16 16:32:10,614 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager removed spill file directory /home/travis/build/dataArtisans/flink/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/travis/appcache/application_1516120275777_0003/flink-io-356a7c21-a3cd-43cb-926c-7690f861b66c
2018-01-16 16:32:10,615 INFO org.apache.flink.runtime.io.network.NetworkEnvironment - Shutting down the network environment and its components.
2018-01-16 16:32:10,619 INFO org.apache.flink.runtime.io.network.netty.NettyClient - Successful shutdown (took 4 ms).
2018-01-16 16:32:10,623 INFO org.apache.flink.runtime.io.network.netty.NettyServer - Successful shutdown (took 4 ms).
2018-01-16 16:32:10,641 INFO org.apache.flink.yarn.YarnTaskManager - Task manager akka://flink/user/taskmanager is completely shut down.
2018-01-16 16:32:10,649 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2018-01-16 16:32:10,650 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2018-01-16 16:32:10,717 WARN org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline - An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)
    at org.apache.flink.shaded.akka.org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.SimpleChannelHandler.disconnectRequested(SimpleChannelHandler.java:320)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.SimpleChannelHandler.handleDownstream(SimpleChannelHandler.java:274)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.Channels.disconnect(Channels.java:781)
    at org.apache.flink.shaded.akka.org.jboss.netty.channel.AbstractChannel.disconnect(AbstractChannel.java:219)
    at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:241)
    at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:240)
    at scala.util.Success.foreach(Try.scala:236)
    at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:206)
    at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:206)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2018-01-16 16:32:10,755 INFO org.apache.flink.yarn.YarnTaskManagerRunner - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2018-01-16 16:32:10,762 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2018-01-16 16:32:10,794 INFO org.apache.flink.yarn.YarnTaskManager - Shutdown completed. Stopping JVM.
{code}
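To illustrate why this log trips the assertion, here is a minimal sketch of a prohibited-string scan. It is not the actual Flink test code: the class `ProhibitedStringCheck`, the method `findProhibitedLine`, and the case-sensitive substring match are assumptions made for illustration. The sample lines are copied from the log excerpt above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Flink: scans log lines for prohibited substrings.
class ProhibitedStringCheck {

    // Returns the first line containing any prohibited string (case-sensitive match, an assumption).
    static Optional<String> findProhibitedLine(List<String> logLines, List<String> prohibited) {
        for (String line : logLines) {
            for (String needle : prohibited) {
                if (line.contains(needle)) {
                    return Optional.of(line);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Lines taken from the taskmanager.log excerpt in this report.
        List<String> logLines = Arrays.asList(
            "2018-01-16 16:32:10,717 WARN org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline - An exception was thrown by an exception handler.",
            "java.util.concurrent.RejectedExecutionException: Worker has already been shutdown",
            "2018-01-16 16:32:10,794 INFO org.apache.flink.yarn.YarnTaskManager - Shutdown completed. Stopping JVM.");

        Optional<String> hit = findProhibitedLine(logLines, Arrays.asList("Exception"));

        // Report the offending line, if any, the way a test failure message might.
        System.out.println(hit.map(l -> "Prohibited string found in: " + l)
                              .orElse("No prohibited string found"));
    }
}
```

Under these assumptions the WARN line itself would pass, because it only contains lowercase "exception". The first line flagged is the `RejectedExecutionException` line of the stack trace, which is logged during an otherwise successful shutdown.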