Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20424#discussion_r165255299

    --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
    @@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
           daemon = pb.start()

           val in = new DataInputStream(daemon.getInputStream)
    -      daemonPort = in.readInt()
    +      try {
    +        daemonPort = in.readInt()
    +      } catch {
    +        case exc: EOFException =>
    +          throw new IOException(s"No port number in $daemonModule's stdout")
    +      }
    +
    +      // test that the returned port number is within a valid range.
    +      // note: this does not cover the case where the port number
    +      // is arbitrary data but is also coincidentally within range
    +      if (daemonPort < 1 || daemonPort > 0xffff) {
    --- End diff --

ah I see, I think you are worried about something other than what bruce and I thought. Your concern is that we might throw an exception for some values that are actually perfectly legitimate.

Port 0 being special is a pretty standard thing -- it's mentioned in the constructor for ServerSocket:

https://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29

which implies that you shouldn't ever open a Socket on port 0, though I don't see that officially documented. At least on my laptop, I get different errors if I try to connect to port 0 vs. just connecting to a bogus port:

```scala
scala> val s2 = new Socket("localhost", 1234)
java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:589)
  at java.net.Socket.connect(Socket.java:538)
  at java.net.Socket.<init>(Socket.java:434)
  at java.net.Socket.<init>(Socket.java:211)
  ... 29 elided

scala> val s3 = new Socket("localhost", 0)
java.net.NoRouteToHostException: Can't assign requested address
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:589)
  at java.net.Socket.connect(Socket.java:538)
  at java.net.Socket.<init>(Socket.java:434)
  at java.net.Socket.<init>(Socket.java:211)
  ... 29 elided
```

so I think it's pretty safe to say that daemon.py (or whatever) shouldn't be passing back `0` as the port to bind to.

Still -- it is *clearly* safer to instead have the port written to some other file, or (another) socket, so that we wouldn't have to worry about the details of this error handling.
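
For reference, here is a rough standalone sketch of the check being discussed, so the boundary cases (empty stdout, `0`, values above `0xffff`) are easy to try in isolation. It is not the PR's code: `DaemonPortCheck`, `readDaemonPort`, and `intBytes` are hypothetical names, and the error messages are only illustrative.

```scala
import java.io.{ByteArrayInputStream, DataInputStream, EOFException, IOException, InputStream}
import java.nio.ByteBuffer

object DaemonPortCheck {
  // Hypothetical helper: reads one 4-byte big-endian int from the stream and
  // rejects anything that cannot be a connectable TCP port. Port 0 is
  // excluded on purpose, matching the `daemonPort < 1` test in the diff.
  def readDaemonPort(in: InputStream): Int = {
    val port = try {
      new DataInputStream(in).readInt()
    } catch {
      case _: EOFException =>
        // Fewer than 4 bytes were available before end of stream.
        throw new IOException("No port number in daemon's stdout")
    }
    if (port < 1 || port > 0xffff) {
      throw new IOException(s"Bad port number in daemon's stdout: $port")
    }
    port
  }

  // Test helper: encodes an int the way DataInputStream.readInt expects it.
  def intBytes(i: Int): Array[Byte] = ByteBuffer.allocate(4).putInt(i).array()
}
```

Feeding it `new ByteArrayInputStream(DaemonPortCheck.intBytes(0))` or an empty stream should raise an `IOException`, while an in-range value such as 4321 is returned unchanged. As the comment in the diff notes, this cannot catch arbitrary bytes that happen to decode to an in-range value.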