[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084195#comment-14084195 ]

Apache Spark commented on SPARK-2583:
-------------------------------------

User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/1758

ConnectionManager cannot distinguish whether error occurred or not
------------------------------------------------------------------

Key: SPARK-2583
URL: https://issues.apache.org/jira/browse/SPARK-2583
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Critical

ConnectionManager#handleMessage sends an empty message to the peer whether or not an error occurred in onReceiveCallback.

{code}
val ackMessage = if (onReceiveCallback != null) {
  logDebug("Calling back")
  onReceiveCallback(bufferMessage, connectionManagerId)
} else {
  logDebug("Not calling back as callback is null")
  None
}

if (ackMessage.isDefined) {
  if (!ackMessage.get.isInstanceOf[BufferMessage]) {
    logDebug("Response to " + bufferMessage + " is not a buffer message, it is of type "
      + ackMessage.get.getClass)
  } else if (!ackMessage.get.asInstanceOf[BufferMessage].hasAckId) {
    logDebug("Response to " + bufferMessage + " does not have ack id set")
    ackMessage.get.asInstanceOf[BufferMessage].ackId = bufferMessage.id
  }
}

// We have no way to tell the peer whether an error occurred or not
sendMessage(connectionManagerId, ackMessage.getOrElse {
  Message.createBufferMessage(bufferMessage.id)
})
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
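To illustrate the problem in the description above, here is a minimal sketch of one way an ack could carry an explicit error flag, so that the peer can tell "the callback returned nothing" apart from "the callback failed". This is an assumption for illustration only, not the change made in either pull request: the names AckSketch, Ack, Callback, hasError, okCallback, emptyCallback and failingCallback are all hypothetical and are not Spark classes or APIs.

```scala
import java.io.FileNotFoundException
import scala.util.control.NonFatal

// Hypothetical, simplified stand-ins; none of these are Spark's own types.
object AckSketch {
  // An ack that records explicitly whether the receiver hit an error.
  final case class Ack(ackId: Int, hasError: Boolean, payload: Option[String])

  // A receive callback: takes a message id and may return response data.
  type Callback = Int => Option[String]

  // Always produces an ack for the peer. A callback that throws yields
  // hasError = true instead of an ack that looks like an empty success.
  def handleMessage(id: Int, onReceive: Option[Callback]): Ack =
    onReceive match {
      case None =>
        // No callback registered: empty ack, but not an error.
        Ack(id, hasError = false, payload = None)
      case Some(callback) =>
        try {
          Ack(id, hasError = false, payload = callback(id))
        } catch {
          case NonFatal(_) =>
            // The failure is reported to the peer rather than hidden.
            Ack(id, hasError = true, payload = None)
        }
    }

  // Example callbacks (hypothetical).
  val okCallback: Callback = _ => Some("block data")
  val emptyCallback: Callback = _ => None
  val failingCallback: Callback =
    _ => throw new FileNotFoundException("bucket file missing")

  def main(args: Array[String]): Unit = {
    println(handleMessage(1, Some(okCallback)))
    println(handleMessage(2, Some(emptyCallback)))
    println(handleMessage(3, Some(failingCallback)))
  }
}
```

In this sketch the second and third calls both return an ack with no payload, but only the third has hasError set. With the code quoted in the description, the peer has no comparable way to tell those two cases apart.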
[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073419#comment-14073419 ]

Kousuke Saruta commented on SPARK-2583:
---------------------------------------

I have added some test cases to my PR for this issue.

ConnectionManager cannot distinguish whether error occurred or not
------------------------------------------------------------------

Key: SPARK-2583
URL: https://issues.apache.org/jira/browse/SPARK-2583
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Critical
[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068240#comment-14068240 ]

Patrick Wendell commented on SPARK-2583:
----------------------------------------

Hey [~sarutak] - I'm curious - what is the behavior you were seeing without this patch?

ConnectionManager cannot distinguish whether error occurred or not
------------------------------------------------------------------

Key: SPARK-2583
URL: https://issues.apache.org/jira/browse/SPARK-2583
Project: Spark
Issue Type: Bug
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Critical
[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068814#comment-14068814 ]

Kousuke Saruta commented on SPARK-2583:
---------------------------------------

Hi [~pwendell],

When I simulated a disk fault during shuffle, I saw the following two behaviors. #1 is related to this topic and #2 is about this topic.

1) Fetching from executor locally itself

To simulate a disk fault, I deleted a bucket file. FileNotFoundException was thrown and after retry,

2)

ConnectionManager cannot distinguish whether error occurred or not
------------------------------------------------------------------

Key: SPARK-2583
URL: https://issues.apache.org/jira/browse/SPARK-2583
Project: Spark
Issue Type: Bug
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Critical
[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067309#comment-14067309 ]

Kousuke Saruta commented on SPARK-2583:
---------------------------------------

PR: https://github.com/apache/spark/pull/1490

ConnectionManager cannot distinguish whether error occurred or not
------------------------------------------------------------------

Key: SPARK-2583
URL: https://issues.apache.org/jira/browse/SPARK-2583
Project: Spark
Issue Type: Bug
Reporter: Kousuke Saruta