Felix Schmalzel created PARQUET-1983:
----------------------------------------

             Summary: Pool SeekableInputStreams in ParquetFileReader
                 Key: PARQUET-1983
                 URL: https://issues.apache.org/jira/browse/PARQUET-1983
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-mr
            Reporter: Felix Schmalzel


 

If https://issues.apache.org/jira/browse/PARQUET-1982 goes through, then we 
could allow parallel reading of row groups with a pool of SeekableInputStreams. 
This would significantly boost performance for applications that read data at 
random positions from a large file.
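
To make the idea concrete, here is a minimal sketch of such a pool. It is not 
the patch mentioned below and not part of parquet-mr: the class name 
StreamPool and its methods borrow/giveBack/available are hypothetical, and the 
pool is written against java.io.Closeable so that it stands alone. In 
parquet-mr the opener would presumably be something like 
() -> inputFile.newStream(), yielding one SeekableInputStream per slot.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;

/** Hypothetical fixed-size pool of streams; illustration only. */
final class StreamPool<S extends Closeable> implements Closeable {
    // Streams that are currently not in use by any reader thread.
    private final BlockingQueue<S> idle;

    /** Eagerly opens {@code size} streams using the supplied opener. */
    StreamPool(int size, Callable<S> opener) throws Exception {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(opener.call());
        }
    }

    /** Blocks until a stream is free, then hands it to the caller. */
    S borrow() throws InterruptedException {
        return idle.take();
    }

    /** Returns a previously borrowed stream to the pool. */
    void giveBack(S stream) {
        idle.add(stream);
    }

    /** Number of streams currently idle. */
    int available() {
        return idle.size();
    }

    /** Closes all idle streams; borrowed streams must be returned first. */
    @Override
    public void close() throws IOException {
        S stream;
        while ((stream = idle.poll()) != null) {
            stream.close();
        }
    }
}
```

Each thread that reads a row group would call borrow(), seek and read on its 
own stream, and call giveBack() in a finally block, so no two threads ever 
share a stream position.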

I've already developed a patch that would enable this functionality. I will 
link the merge request in the next few days.

Is there a related ticket that I have overlooked?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
