[ 
https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068814#comment-14068814
 ] 

Kousuke Saruta commented on SPARK-2583:
---------------------------------------

Hi [~pwendell], 
When I simulated a disk fault during shuffle, I saw the following two behaviors.
#1 is related to this topic and #2 is about this topic.

1) Fetching locally from the executor itself
To simulate a disk fault, I deleted a bucket file.
A FileNotFoundException was thrown and, after retry, 

2)

> ConnectionManager cannot distinguish whether error occurred or not
> ------------------------------------------------------------------
>
>                 Key: SPARK-2583
>                 URL: https://issues.apache.org/jira/browse/SPARK-2583
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Critical
>
> ConnectionManager#handleMessage sends an empty message to the peer whether or 
> not an error occurred in onReceiveCallback.
> {code}
>           val ackMessage = if (onReceiveCallback != null) {
>             logDebug("Calling back")
>             onReceiveCallback(bufferMessage, connectionManagerId)
>           } else {
>             logDebug("Not calling back as callback is null")
>             None
>           }
>           if (ackMessage.isDefined) {
>             if (!ackMessage.get.isInstanceOf[BufferMessage]) {
>               logDebug("Response to " + bufferMessage + " is not a buffer message, it is of type "
>                 + ackMessage.get.getClass)
>             } else if (!ackMessage.get.asInstanceOf[BufferMessage].hasAckId) {
>               logDebug("Response to " + bufferMessage + " does not have ack id set")
>               ackMessage.get.asInstanceOf[BufferMessage].ackId = bufferMessage.id
>             }
>           }
>           // We have no way to tell peer whether error occurred or not
>           sendMessage(connectionManagerId, ackMessage.getOrElse {
>             Message.createBufferMessage(bufferMessage.id)
>           })
>         }
> {code}
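
To make the ambiguity in the quoted description concrete, here is a minimal, self-contained sketch. It does not use Spark's classes: `Ack`, `AckSketch`, `ambiguousAck`, `flaggedAck` and the `hasError` field are all hypothetical names introduced for illustration. The exception handling is also simplified, since it happens in the handler here rather than inside the callback. The second method shows one possible way to make failures visible to the peer; it is an assumption, not necessarily the fix adopted for this issue.

```scala
// Hypothetical, simplified stand-in for an ack message; not Spark's BufferMessage.
final case class Ack(ackId: Int, payload: Option[String], hasError: Boolean = false)

object AckSketch {
  // Mirrors the behavior described in the issue: a failing callback and a
  // callback that simply has nothing to return both yield the same empty ack.
  def ambiguousAck(id: Int, callback: Int => Option[String]): Ack = {
    val result = try callback(id) catch { case _: Exception => None }
    Ack(id, result)
  }

  // One possible alternative (an assumption for illustration only):
  // mark the ack explicitly when the callback throws.
  def flaggedAck(id: Int, callback: Int => Option[String]): Ack =
    try Ack(id, callback(id))
    catch { case _: Exception => Ack(id, None, hasError = true) }

  def main(args: Array[String]): Unit = {
    val noData: Int => Option[String] = _ => None
    val failing: Int => Option[String] =
      _ => throw new java.io.FileNotFoundException("bucket file missing")

    // The peer receives identical acks in both cases.
    println(ambiguousAck(1, noData) == ambiguousAck(1, failing))
    // With the explicit flag, the failure can be detected by the peer.
    println(flaggedAck(1, failing).hasError)
  }
}
```

In this sketch the receiving side can only compare acks, so an empty payload in the first variant carries no information about whether the fetch failed.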



--
This message was sent by Atlassian JIRA
(v6.2#6252)
