Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1845#discussion_r17213316
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1421,3 +1421,24 @@ private[spark] object Utils extends Logging {
       }
     
     }
    +
    +/**
    + * A utility class to redirect the child process's stdout or stderr.
    + */
    +private[spark] class RedirectThread(in: InputStream, out: OutputStream, 
name: String)
    +  extends Thread(name) {
    +
    +  setDaemon(true)
    +  override def run() {
    +    scala.util.control.Exception.ignoring(classOf[IOException]) {
    +      // FIXME: We copy the stream on the level of bytes to avoid encoding 
problems.
    --- End diff --
    
    I'm guessing the original author was reading Strings() before, which 
requires that you define how to interpret the bytes into strings (encodings 
like ASCII/UTF8, etc).  There were probably some byte sequences coming through 
that weren't characters in UTF8 so exceptions were being thrown.  Since this is 
just reading from an InputStream and writing to an OutputStream, copying bytes 
would be much more efficient than reading bytes, interpreting as characters, 
converting back to bytes, then sending those out the other side.
    
    Conveniently, Apache has an ```IOUtils.copy(inputStream, outputStream, 
bufferSize)``` 
[method](http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/IOUtils.html#copy(java.io.InputStream,
 java.io.OutputStream, int)) that would do exactly this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to