Felix Schmalzel created PARQUET-1983:
----------------------------------------
Summary: Pool SeekableInputStreams in ParquetFileReader
Key: PARQUET-1983
URL: https://issues.apache.org/jira/browse/PARQUET-1983
Project: Parquet
Issue Type: New Feature
Components: parquet-mr
Reporter: Felix Schmalzel
If https://issues.apache.org/jira/browse/PARQUET-1982 is accepted, then
ParquetFileReader could read row groups in parallel by drawing on a pool of
SeekableInputStreams. This should significantly improve performance for
applications that read data at random positions in a large file.
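To make the idea concrete, here is a minimal sketch of a bounded stream pool.
It is not the patch mentioned below: StreamPool, borrow, giveBack and
openedCount are hypothetical names, and the pool is generic over Closeable so
the sketch compiles without parquet-mr on the classpath.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical bounded pool of streams; not part of parquet-mr. */
final class StreamPool<S extends Closeable> implements Closeable {
    private final BlockingQueue<S> idle;   // streams currently not in use
    private final Supplier<S> opener;      // assumed factory that opens a new stream
    private final AtomicInteger opened = new AtomicInteger();
    private final int maxStreams;

    StreamPool(int maxStreams, Supplier<S> opener) {
        if (maxStreams <= 0) {
            throw new IllegalArgumentException("maxStreams must be positive");
        }
        this.maxStreams = maxStreams;
        this.opener = opener;
        this.idle = new ArrayBlockingQueue<>(maxStreams);
    }

    /** Returns an idle stream, opens a new one if under the limit, or waits. */
    S borrow() {
        S stream = idle.poll();
        if (stream != null) {
            return stream;
        }
        while (true) {
            int n = opened.get();
            if (n >= maxStreams) {
                // Limit reached: block until another thread gives a stream back.
                try {
                    return idle.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            // Reserve a slot first so concurrent callers never exceed maxStreams.
            if (opened.compareAndSet(n, n + 1)) {
                try {
                    return opener.get();
                } catch (RuntimeException e) {
                    opened.decrementAndGet(); // release the slot if opening failed
                    throw e;
                }
            }
        }
    }

    /** Hands a borrowed stream back; capacity equals maxStreams, so offer succeeds. */
    void giveBack(S stream) {
        idle.offer(stream);
    }

    /** Number of streams opened so far. */
    int openedCount() {
        return opened.get();
    }

    /** Closes all idle streams; borrowed streams must be given back first. */
    @Override
    public void close() throws IOException {
        S stream;
        while ((stream = idle.poll()) != null) {
            stream.close();
        }
    }
}
```

With parquet-mr, the opener would presumably wrap InputFile.newStream(), and
each worker thread would borrow a stream, seek to the start of its row group,
read, and give the stream back. How this would be wired into
ParquetFileReader is left open here.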
I've already developed a patch that would enable this functionality. I will
link the merge request in the next few days.
Is there a related ticket that I have overlooked?