squito commented on a change in pull request #23453: [SPARK-26089][CORE] Handle
corruption in large shuffle blocks
URL: https://github.com/apache/spark/pull/23453#discussion_r264268057
##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -337,6 +338,44 @@ private[spark] object Utils extends Logging {
}
}
+ /**
+ * Copy all data from an InputStream to an OutputStream upto maxSize and
+ * close the input stream if all data is read.
+ * @return A tuple of boolean, which is whether the stream was fully copied, and an InputStream,
+ * which is a combined stream of read data and any remaining data
Review comment:
this doc needs updating now. Something like
```
Copy the first `maxSize` bytes of data from the InputStream to an in-memory
buffer, while still exposing the entire original input stream, primarily to
check for corruption.

This returns a new InputStream which contains the same data as the original
input stream. It may be entirely in an in-memory buffer, or it may be a
combination of in-memory data followed by the remaining data read from the
original stream. The only real use of this is if the original input stream
will potentially detect corruption while the data is being read (e.g. from
compression). This allows for an eager check of corruption in the first
`maxSize` bytes of data.

@return A tuple of a boolean, which is whether the stream was fully copied,
and an InputStream which includes all data from the original stream
(combining buffered data and remaining data in the original stream)
```
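To make the suggested wording concrete, here is a minimal sketch of the behavior it describes. It is not the code in this PR: the object name `CopyStreamSketch`, the `Int` type of `maxSize`, and the use of plain `java.io` buffers are assumptions for illustration only.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, SequenceInputStream}

// Hypothetical illustration only; names and buffer types are assumptions,
// not the implementation under review.
object CopyStreamSketch {

  /**
   * Eagerly read up to `maxSize` bytes from `in` into memory. Any exception
   * thrown while reading (e.g. by a decompressing stream that hits corrupt
   * data) propagates to the caller, which is the point of the eager read.
   *
   * @return (fullyCopied, stream) where `stream` yields the same bytes as `in`.
   */
  def copyStreamUpTo(in: InputStream, maxSize: Int): (Boolean, InputStream) = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    var total = 0
    var eof = false
    while (!eof && total < maxSize) {
      // Never ask for more than the remaining budget of maxSize bytes.
      val n = in.read(buf, 0, math.min(buf.length, maxSize - total))
      if (n == -1) {
        eof = true
      } else {
        out.write(buf, 0, n)
        total += n
      }
    }
    val buffered = new ByteArrayInputStream(out.toByteArray)
    if (eof) {
      // Everything fit in memory, so the original stream is no longer needed.
      in.close()
      (true, buffered)
    } else {
      // Budget exhausted before end of stream was observed: serve the
      // buffered bytes first, then continue from the original stream.
      (false, new SequenceInputStream(buffered, in))
    }
  }
}
```

Note that in this sketch a stream holding exactly `maxSize` bytes is reported as not fully copied, because end of stream is never observed within the budget; the returned stream still yields all of the data.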