vanzin opened a new pull request #27084: [SPARK-30225][core] Correct read() behavior past EOF in NioBufferedFileInputStream URL: https://github.com/apache/spark/pull/27084 This bug manifested itself when another stream would potentially make a call to NioBufferedFileInputStream.read() after it had reached EOF in the wrapped stream. In that case, the refill() code would clear the output buffer the first time EOF was found, leaving it in a readable state for subsequent read() calls. If any of those calls were made, bad data would be returned. By flipping the buffer before returning, even in the EOF case, you get the correct behavior in subsequent calls. I picked that approach to avoid keeping more state in this class, although it means calling the underlying stream even after EOF (which is fine, but perhaps a little more expensive). This showed up (at least) when using encryption, because the commons-crypto StreamInput class does not track EOF internally, leaving it for the wrapped stream to behave correctly. Tested with added unit test + slightly modified test case attached to SPARK-18105.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
