Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/20424#discussion_r165528385
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
daemon = pb.start()
val in = new DataInputStream(daemon.getInputStream)
- daemonPort = in.readInt()
+ try {
+ daemonPort = in.readInt()
+ } catch {
+ case exc: EOFException =>
+ throw new IOException(s"No port number in $daemonModule's stdout")
+ }
+
+ // test that the returned port number is within a valid range.
+ // note: this does not cover the case where the port number
+ // is arbitrary data but is also coincidentally within range
+ if (daemonPort < 1 || daemonPort > 0xffff) {
--- End diff --
> Still -- it is clearly safer to instead have the port written to some
> other file, or (another) socket, so that we wouldn't have to worry about
> the details of this error handling.
Let's go that route. I will have a PR in the coming days.
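
For reference, here is a minimal standalone sketch of the stdout-based check discussed in the diff above, which is the approach the follow-up PR would replace. It is only an illustration: `PortCheckSketch`, `readPort`, and `streamOf` are hypothetical names that do not appear in the PR, and the message for the out-of-range case is an assumption because the diff is cut off before that branch's body.

```scala
import java.io.{ByteArrayInputStream, DataInputStream, EOFException, IOException}
import java.nio.ByteBuffer

object PortCheckSketch {
  // Hypothetical helper (not part of the PR): reads a 4-byte big-endian int
  // from the stream and validates it as a TCP port, mirroring the diff.
  def readPort(in: DataInputStream, source: String): Int = {
    val port = try {
      in.readInt()
    } catch {
      case _: EOFException =>
        // Fewer than 4 bytes were available before end of stream.
        throw new IOException(s"No port number in $source's stdout")
    }
    if (port < 1 || port > 0xffff) {
      // Assumed message: the diff excerpt ends before this branch's body.
      throw new IOException(
        s"Bad port number in $source's stdout: 0x${port.toHexString}")
    }
    port
  }

  // Wraps raw bytes in a DataInputStream to stand in for the daemon's stdout.
  def streamOf(bytes: Array[Byte]): DataInputStream =
    new DataInputStream(new ByteArrayInputStream(bytes))

  def main(args: Array[String]): Unit = {
    // A well-formed port written as a 4-byte int.
    val wellFormed = ByteBuffer.allocate(4).putInt(50123).array()
    println(readPort(streamOf(wellFormed), "daemon"))

    // Stray text on stdout decodes to an int far outside the port range.
    val strayText = "abcd".getBytes("US-ASCII")
    try {
      readPort(streamOf(strayText), "daemon")
    } catch {
      case e: IOException => println(s"rejected: ${e.getMessage}")
    }

    // The limitation noted in the diff: arbitrary bytes that happen to
    // decode to an in-range value are accepted as if they were a port.
    val coincidental = Array[Byte](0, 0, 'h'.toByte, 'i'.toByte)
    println(readPort(streamOf(coincidental), "daemon"))
  }
}
```

The last case shows why a dedicated file or socket is the safer channel: on shared stdout, any other output that lands before the port can pass the range check by coincidence.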