Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/20424#discussion_r164765247
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
daemon = pb.start()
val in = new DataInputStream(daemon.getInputStream)
- daemonPort = in.readInt()
+ try {
+ daemonPort = in.readInt()
+ } catch {
+ case exc: EOFException =>
+ throw new IOException(s"No port number in $daemonModule's stdout")
+ }
+
+ // test that the returned port number is within a valid range.
+ // note: this does not cover the case where the port number
+ // is arbitrary data but is also coincidentally within range
+ if (daemonPort < 1 || daemonPort > 0xffff) {
--- End diff --
Yeah, this is kind of what I was getting at below: what value are we
adding with this extra handling, over the original exception?
Another possibility is to change this to not use stdout, but that adds more
complexity. You could use sockets, or write the port to some dedicated
temporary file.
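
To make the temporary-file option concrete, here is a rough sketch of what the JVM side might look like. Everything in it is an assumption for illustration: `PortFile` and `readPort` are hypothetical names that do not exist in Spark, and it presumes the daemon writes the port as decimal text to a path both sides agree on.

```scala
import java.io.IOException
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.util.Try

// Hypothetical helper, not part of PythonWorkerFactory.
// Assumes the daemon wrote its port as decimal text to `portFile`.
object PortFile {
  def readPort(portFile: Path): Int = {
    // A missing file surfaces as NoSuchFileException, which is an IOException.
    val text = new String(Files.readAllBytes(portFile), StandardCharsets.UTF_8).trim

    // Reject anything that is not an integer, quoting a short prefix of it.
    val port = Try(text.toInt).getOrElse(
      throw new IOException(
        s"Expected a port number in $portFile, found: '${text.take(32)}'"))

    // Same range check as in the diff above.
    if (port < 1 || port > 0xffff) {
      throw new IOException(s"Port $port in $portFile is outside the range 1-65535")
    }
    port
  }
}
```

This keeps the port off stdout entirely, but it is where the extra complexity comes in: the daemon would need to be told the path, write the file atomically, and the JVM would need to wait for it to appear and clean it up afterwards. None of that is shown here.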
---