Github user jose-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/21428#discussion_r194962850
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/ContinuousShuffleReadRDD.scala ---
@@ -34,8 +34,10 @@ case class ContinuousShuffleReadPartition(
// Initialized only on the executor, and only once even as we call compute() multiple times.
lazy val (reader: ContinuousShuffleReader, endpoint) = {
val env = SparkEnv.get.rpcEnv
- val receiver = new UnsafeRowReceiver(queueSize, numShuffleWriters, epochIntervalMs, env)
- val endpoint = env.setupEndpoint(s"UnsafeRowReceiver-${UUID.randomUUID()}", receiver)
+ val receiver = new RPCContinuousShuffleReader(
+ queueSize, numShuffleWriters, epochIntervalMs, env)
+ val endpoint = env.setupEndpoint(s"RPCContinuousShuffleReader-${UUID.randomUUID()}", receiver)
--- End diff --
It requires a reasonable amount of extra code. As mentioned, this is not
the final shuffle mechanism (and I intend to have the TCP-based shuffle ready
to go in the next Spark release).
---