[
https://issues.apache.org/jira/browse/SPARK-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002505#comment-15002505
]
Thomas Graves commented on SPARK-11701:
---------------------------------------
I'm not exactly sure if this is the same issue, but trying this on a version of 1.6
(not the latest): after running a wordcount job we get a bunch of errors and it
shuts down the SparkContext.
15/11/12 17:30:49 ERROR TransportChannelHandler: Connection to gsbl544n27.blue.ygrid.yahoo.com/10.213.42.242:33217 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
15/11/12 17:30:49 ERROR TransportResponseHandler: Still have 15 requests outstanding when connection from gsbl544n27.blue.ygrid.yahoo.com/10.213.42.242:33217 is closed
15/11/12 17:30:49 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 2 at RPC address gsbl536n11.blue.ygrid.yahoo.com:47496, but got no response. Marking as slave lost.
java.io.IOException: Connection from gsbl544n27.blue.ygrid.yahoo.com/10.213.42.242:33217 closed
    at org.apache.spark.network.client.TransportResponseHandler.channelUnregistered(TransportResponseHandler.java:104)
    at org.apache.spark.network.server.TransportChannelHandler.channelUnregistered(TransportChannelHandler.java:91)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
    at io.netty.channel.DefaultChannelPipeline.fireChannelUnregistered(DefaultChannelPipeline.java:739)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:659)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
> YARN - dynamic allocation and speculation active task accounting wrong
> ----------------------------------------------------------------------
>
> Key: SPARK-11701
> URL: https://issues.apache.org/jira/browse/SPARK-11701
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.1
> Reporter: Thomas Graves
> Priority: Critical
>
> I am using dynamic container allocation and speculation and am seeing issues
> with the active task accounting. The Executor UI still shows active tasks on
> an executor even though the job/stage has completed. I think it's also
> preventing dynamic allocation from releasing containers, because it thinks
> there are still tasks running.
> It's easy to reproduce with spark-shell: turn on dynamic allocation, then run
> just a wordcount on a decent-sized file with the speculation parameters set
> low:
> spark.dynamicAllocation.enabled true
> spark.shuffle.service.enabled true
> spark.dynamicAllocation.maxExecutors 10
> spark.dynamicAllocation.minExecutors 2
> spark.dynamicAllocation.initialExecutors 10
> spark.dynamicAllocation.executorIdleTimeout 40s
> $SPARK_HOME/bin/spark-shell --conf spark.speculation=true \
>   --conf spark.speculation.multiplier=0.2 --conf spark.speculation.quantile=0.1 \
>   --master yarn --deploy-mode client --executor-memory 4g --driver-memory 4g
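> A minimal wordcount along these lines should be enough to trigger speculative
> attempts with the quantile/multiplier set that low (a sketch, pasted into the
> spark-shell session started above; `sc` is the SparkContext the shell
> provides, and the input/output paths are placeholders for a decent-sized
> file):
> {code}
> // Paste into spark-shell; `sc` is provided by the shell.
> // "/path/to/input.txt" is a placeholder for a decent-sized file.
> val counts = sc.textFile("/path/to/input.txt")
>   .flatMap(_.split("\\s+"))   // split each line into words
>   .map(word => (word, 1))     // pair each word with a count of 1
>   .reduceByKey(_ + _)         // sum counts per word
> counts.saveAsTextFile("/path/to/output")  // action: forces the job to run
> {code}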
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]