Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/21346#discussion_r191941503
--- Diff:
common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java
---
@@ -38,15 +38,24 @@
*
* This method will not be called in parallel for a single TransportClient (i.e., channel).
*
+ * The rpc *might* included a data stream in <code>streamData</code> (eg. for uploading a large
+ * amount of data which should not be buffered in memory here). Any errors while handling the
+ * streamData will lead to failing this entire connection -- all other in-flight rpcs will fail.
--- End diff --
Perhaps a naive question: what are the implications of this? Is this
referring to a scenario where we've multiplexed multiple asynchronous requests
/ responses over a single network connection? I think I understand _why_ the
failure mode is as stated (we're worried about leaving unconsumed data in the
channel), but I just wanted to ask about the implications of failing other
in-flight RPCs.
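
To make the scenario concrete, here is a rough sketch of what I have in mind. It is not Spark's actual code: `InFlightRpcs`, `register`, `onResponse`, and `failAll` are hypothetical names, and I'm assuming each request on a connection is tracked by id until its response arrives. If the connection is torn down because stream handling failed, every request still waiting would be failed together:

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of requests multiplexed over one connection.
// None of these names come from Spark.
class InFlightRpcs {
  // Requests sent on this connection that have not yet received a response.
  private final Map<Long, CompletableFuture<String>> outstanding = new ConcurrentHashMap<>();

  // Record a request as in flight and return a future for its response.
  CompletableFuture<String> register(long requestId) {
    CompletableFuture<String> response = new CompletableFuture<>();
    outstanding.put(requestId, response);
    return response;
  }

  // A response arrived for one request; only that request completes.
  void onResponse(long requestId, String body) {
    CompletableFuture<String> response = outstanding.remove(requestId);
    if (response != null) {
      response.complete(body);
    }
  }

  // The connection failed: fail every request that is still waiting.
  // Returns how many requests were failed.
  int failAll(Throwable cause) {
    int failed = 0;
    for (Long requestId : outstanding.keySet()) {
      CompletableFuture<String> response = outstanding.remove(requestId);
      if (response != null && response.completeExceptionally(cause)) {
        failed++;
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    InFlightRpcs rpcs = new InFlightRpcs();
    CompletableFuture<String> first = rpcs.register(1L);
    CompletableFuture<String> second = rpcs.register(2L);
    rpcs.onResponse(1L, "ok");
    // Request 2 never got a response, so it fails with the connection.
    int failed = rpcs.failAll(new IOException("error while handling streamData"));
    System.out.println(failed + " in-flight request(s) failed; second failed = "
        + second.isCompletedExceptionally() + ", first done = " + first.isDone());
  }
}
```

In this sketch a request that already received its response is unaffected, while unrelated requests that happen to share the connection fail and would need to be retried by their callers. That cost is what I'm asking about.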
---