wgtmac opened a new pull request #695:
URL: https://github.com/apache/orc/pull/695


   The current implementation of ZlibDecompressionStream::seek and
   BlockDecompressionStream::seek resets the state of the decompressor
   and the underlying file reader and throws away their buffers.
   
   This commit introduces two optimizations that reuse buffers which
   still contain useful data, reducing the time spent re-reading and
   decompressing the same data.
   
   The first case is when the seek target has already been read
   and decompressed into the output stream.
   
   The second case is when the seek target has already been read from
   the input stream but has not been decompressed yet, i.e. it is
   not in the output stream.
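
The two cases can be illustrated with a minimal sketch. It is not the code in this pull request: `StreamState`, `SeekPath`, `classifySeek` and all field names are hypothetical, and it assumes a seek target is given as the compressed offset of a block header plus an offset inside the decompressed block.

```cpp
#include <cstdint>

// Hypothetical bookkeeping for a decompression stream. The field names are
// illustrative only and are not taken from the ORC sources.
struct StreamState {
  uint64_t inputBufferStart;   // compressed offset of the first byte in the input buffer
  uint64_t inputBufferSize;    // number of compressed bytes held in the input buffer
  uint64_t currentBlockStart;  // compressed offset of the block currently decompressed
  uint64_t outputBufferSize;   // decompressed bytes available for that block
  bool hasDecompressedBlock;   // false until a block has been decompressed
};

enum class SeekPath {
  ReuseOutput,  // first case: target is already in the output buffer
  ReuseInput,   // second case: target is in the input buffer, not yet decompressed
  FullReset     // fallback: drop the buffers and re-read, as before
};

// Decide which path a seek can take without touching the underlying file.
SeekPath classifySeek(const StreamState& s, uint64_t blockStart,
                      uint64_t offsetInBlock) {
  // First case: same block, and the offset lies inside the decompressed data.
  if (s.hasDecompressedBlock && blockStart == s.currentBlockStart &&
      offsetInBlock < s.outputBufferSize) {
    return SeekPath::ReuseOutput;
  }
  // Second case: the block header is within the compressed bytes already read.
  if (blockStart >= s.inputBufferStart &&
      blockStart < s.inputBufferStart + s.inputBufferSize) {
    return SeekPath::ReuseInput;
  }
  // Otherwise nothing buffered is useful.
  return SeekPath::FullReset;
}
```

In this sketch only `FullReset` would require resetting the file reader. `ReuseOutput` amounts to moving a pointer inside the output buffer, and `ReuseInput` to resetting the decompressor and resuming from a position inside the input buffer.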
   
   Tests:
   - Run the ORC tests, and the Impala tests working on ORC tables.
   - The regression that #476 would cause is not present anymore.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

