KaiXinXIaoLei created SPARK-11298:
-------------------------------------

             Summary: When driver sends message "GetExecutorLossReason" to AM, 
the AM stops.
                 Key: SPARK-11298
                 URL: https://issues.apache.org/jira/browse/SPARK-11298
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.6.0
            Reporter: KaiXinXIaoLei
             Fix For: 1.6.0, 2+


I got the latest code from GitHub and ran "bin/spark-shell --master yarn 
--conf spark.dynamicAllocation.enabled=true --conf 
spark.dynamicAllocation.initialExecutors=1 --conf 
spark.shuffle.service.enabled=true". The following errors appear:
15/10/25 12:11:02 ERROR TransportChannelHandler: Connection to 
/9.96.1.113:35066 has been quiet for 120000 ms while there are outstanding 
requests. Assuming connection is dead; please adjust spark.network.timeout if 
this is wrong.
15/10/25 12:11:02 ERROR TransportResponseHandler: Still have 1 requests 
outstanding when connection from vm113/9.96.1.113:35066 is closed
15/10/25 12:11:02 WARN NettyRpcEndpointRef: Ignore message 
Failure(java.io.IOException: Connection from vm113/9.96.1.113:35066 closed)
15/10/25 12:11:02 ERROR YarnScheduler: Lost executor 1 on vm111: Slave lost

From the log, the error appears when the driver sends the 
"GetExecutorLossReason" message to the AM. From the code, I think that when 
the AM gets this message, it should reply.
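
A minimal sketch of the suspected failure mode, in plain Python rather than Spark code: a caller sends a request and waits for a reply, and if the receiver never answers, the caller can only give up after a timeout. The names `ask`, `silent_am`, and `replying_am` are hypothetical and exist only for this illustration; they are not Spark APIs, and the short timeouts stand in for the 120000 ms seen in the log.

```python
import queue
import threading


def ask(endpoint, message, timeout_s):
    """Send `message` to `endpoint` and wait up to `timeout_s` seconds for a reply."""
    reply_box = queue.Queue(maxsize=1)
    worker = threading.Thread(target=endpoint, args=(message, reply_box), daemon=True)
    worker.start()
    try:
        return reply_box.get(timeout=timeout_s)
    except queue.Empty:
        # No reply arrived in time; the caller has to assume the peer is gone.
        return None


def silent_am(message, reply_box):
    # Hypothetical AM that receives the message but never replies.
    pass


def replying_am(message, reply_box):
    # Hypothetical AM that always answers the request.
    reply_box.put("reason for " + message)


silent_result = ask(silent_am, "GetExecutorLossReason", 0.2)
replied_result = ask(replying_am, "GetExecutorLossReason", 2.0)
print(silent_result)
print(replied_result)
```

With the silent receiver the caller gets nothing back and times out, which would match the "outstanding requests" error above if the AM does not reply; with the replying receiver the request completes normally.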



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
