[
https://issues.apache.org/jira/browse/FLINK-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511311#comment-15511311
]
Nagarjun Guraja commented on FLINK-4650:
----------------------------------------
[~StephanEwen] I haven't spent a lot of time debugging it on 1.2.SNAPSHOT, but
the stack traces are similar to the one below. (The node was reachable and
there were no issues with network connectivity.)
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connection unexpectedly closed by remote task manager 'titus-248496-worker-0-2/100.82.8.187:56858'. This might indicate that the remote task manager was lost.
    at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:118)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
Do you want us to look for any specific log messages to help determine the
root cause?
> Frequent task manager disconnects from JobManager
> -------------------------------------------------
>
> Key: FLINK-4650
> URL: https://issues.apache.org/jira/browse/FLINK-4650
> Project: Flink
> Issue Type: Bug
> Reporter: Nagarjun Guraja
>
> Not sure of the exact reason, but we observe more frequent task manager
> disconnects when using the 1.2 snapshot build than with the 1.1.2 release build.