Github user bersprockets commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20424#discussion_r165528385
  
    --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
    @@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
             daemon = pb.start()
     
             val in = new DataInputStream(daemon.getInputStream)
    -        daemonPort = in.readInt()
    +        try {
    +          daemonPort = in.readInt()
    +        } catch {
    +          case exc: EOFException =>
    +            throw new IOException(s"No port number in $daemonModule's stdout")
    +        }
    +
    +        // test that the returned port number is within a valid range.
    +        // note: this does not cover the case where the port number
    +        // is arbitrary data but is also coincidentally within range
    +        if (daemonPort < 1 || daemonPort > 0xffff) {
    --- End diff --
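
For readers without the full file at hand, the check in this hunk can be sketched as a standalone helper. This is only an illustration: `PortCheck`, `readPort`, `bytesOf`, and the out-of-range error message are hypothetical names and wording, not code from the PR (the diff above is cut off before the body of the range check).

```scala
import java.io.{ByteArrayInputStream, DataInputStream, EOFException, IOException, InputStream}
import java.nio.ByteBuffer

object PortCheck {
  // Hypothetical helper, not part of Spark: reads one 4-byte big-endian int
  // from the stream and rejects anything outside the valid TCP port range.
  def readPort(stream: InputStream, source: String): Int = {
    val in = new DataInputStream(stream)
    val port = try {
      in.readInt()
    } catch {
      // readInt() throws EOFException when fewer than 4 bytes are available.
      case _: EOFException =>
        throw new IOException(s"No port number in $source's stdout")
    }
    // Assumed message wording; the real one is not visible in the diff.
    if (port < 1 || port > 0xffff) {
      throw new IOException(s"Bad port number in $source's stdout: $port")
    }
    port
  }

  // Test convenience: an in-memory stream holding `n` as 4 big-endian bytes.
  def bytesOf(n: Int): InputStream =
    new ByteArrayInputStream(ByteBuffer.allocate(4).putInt(n).array())
}
```

With this sketch, `PortCheck.readPort(PortCheck.bytesOf(4040), "daemon")` returns the port, while a stream that starts with stray text such as `hello` decodes its first four bytes to a value far above `0xffff` and is rejected. Stray bytes that happen to decode to a value in range would still pass, which is the limitation the comment in the diff points out.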
    
    > Still -- it is clearly safer to instead have the port written to some other file, or (another) socket, so that we wouldn't have to worry about the details of this error handling.
    
    Let's go that route. I will have a PR in the coming days.
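
One possible shape for the "(another) socket" option, offered as a sketch rather than as what the follow-up PR will contain: the parent listens on an ephemeral loopback port, hands that port to the child, and reads the daemon's port from the connection the child opens, so nothing the daemon prints to stdout can be mistaken for a port. `PortHandshake`, `awaitPort`, `reportPort`, and the `launch` callback are all hypothetical names; here the child is simulated by a thread instead of a real Python process.

```scala
import java.io.{DataInputStream, DataOutputStream}
import java.net.{InetAddress, ServerSocket, Socket}

object PortHandshake {
  // Child side (simulated here in Scala): connect back to the parent's
  // handshake port and report the port the daemon is listening on.
  def reportPort(handshakePort: Int, daemonPort: Int): Unit = {
    val sock = new Socket(InetAddress.getLoopbackAddress, handshakePort)
    try {
      val out = new DataOutputStream(sock.getOutputStream)
      out.writeInt(daemonPort)
      out.flush()
    } finally {
      sock.close()
    }
  }

  // Parent side: open a loopback-only listener on an ephemeral port, start
  // the child via `launch` (which receives the handshake port), then read
  // the single int the child sends. accept() gives up after `timeoutMs`.
  def awaitPort(launch: Int => Unit, timeoutMs: Int = 10000): Int = {
    val server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress)
    try {
      server.setSoTimeout(timeoutMs)
      launch(server.getLocalPort)
      val conn = server.accept()
      try {
        new DataInputStream(conn.getInputStream).readInt()
      } finally {
        conn.close()
      }
    } finally {
      server.close()
    }
  }
}

object PortHandshakeDemo {
  // Usage example: a thread stands in for the daemon process.
  def run(): Int = PortHandshake.awaitPort { handshakePort =>
    new Thread(new Runnable {
      def run(): Unit = PortHandshake.reportPort(handshakePort, 50123)
    }).start()
  }
}
```

A range check on the received value would still be reasonable, but the failure modes shrink to "the child never connected" (a timeout) and "the child sent garbage on a channel only it writes to".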
    


