GitHub user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/22085#discussion_r213032992
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -180,7 +188,73 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
dataOut.writeInt(partitionIndex)
// Python version of driver
PythonRDD.writeUTF(pythonVer, dataOut)
+ // Init a ServerSocket to accept method calls from Python side.
+ val isBarrier = context.isInstanceOf[BarrierTaskContext]
+ if (isBarrier) {
+ serverSocket = Some(new ServerSocket(/* port */ 0,
+ /* backlog */ 1,
+ InetAddress.getByName("localhost")))
+ // A call to accept() for ServerSocket shall block infinitely.
+ serverSocket.map(_.setSoTimeout(0))
+ new Thread("accept-connections") {
+ setDaemon(true)
+
+ override def run(): Unit = {
+ while (!serverSocket.get.isClosed()) {
+ var sock: Socket = null
+ try {
+ sock = serverSocket.get.accept()
+ // Wait for function call from python side.
+ sock.setSoTimeout(10000)
+ val input = new DataInputStream(sock.getInputStream())
--- End diff --
Why isn't authentication the first thing that happens on this connection? I
don't think anything bad can happen in this case, but skipping it makes it more
likely that we leave a security hole here later on.
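
To make the suggestion concrete, here is a minimal sketch of an "authenticate first" handler. It is not the PR's code and not Spark's API: `AuthFirstSketch`, `authenticate`, `handle`, `demo`, the length-prefixed secret framing, and the 1024-byte cap are all assumptions made up for illustration. The only point is the ordering: nothing else is read from the socket until the shared secret has been checked.

```scala
import java.io.{DataInputStream, DataOutputStream}
import java.net.{InetAddress, ServerSocket, Socket}
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

// Hypothetical sketch; names and wire format are assumptions, not Spark's.
object AuthFirstSketch {

  // Read a length-prefixed secret and compare it with the expected one.
  def authenticate(in: DataInputStream, expected: String): Boolean = {
    val len = in.readInt()
    if (len < 0 || len > 1024) {
      false // refuse implausible lengths before allocating a buffer
    } else {
      val buf = new Array[Byte](len)
      in.readFully(buf)
      MessageDigest.isEqual(buf, expected.getBytes(UTF_8))
    }
  }

  // Handle one accepted connection. The function id is read only after
  // authentication succeeds; otherwise the socket is closed untouched.
  def handle(sock: Socket, secret: String): Option[Int] = {
    try {
      sock.setSoTimeout(10000)
      val in = new DataInputStream(sock.getInputStream)
      if (authenticate(in, secret)) Some(in.readInt()) else None
    } finally {
      sock.close()
    }
  }

  // Loopback demo: a client sends its secret followed by function id 1.
  def demo(clientSecret: String, serverSecret: String): Option[Int] = {
    val server = new ServerSocket(0, 1, InetAddress.getByName("localhost"))
    try {
      val client = new Socket(server.getInetAddress, server.getLocalPort)
      try {
        val out = new DataOutputStream(client.getOutputStream)
        val bytes = clientSecret.getBytes(UTF_8)
        out.writeInt(bytes.length)
        out.write(bytes)
        out.writeInt(1) // hypothetical function id
        out.flush()
        handle(server.accept(), serverSecret)
      } finally {
        client.close()
      }
    } finally {
      server.close()
    }
  }
}
```

With this shape, any later change that adds more reads to the handler lands after the auth check by construction, which is the property I'm asking for.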
---