[ https://issues.apache.org/jira/browse/PARQUET-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Blue resolved PARQUET-674.
-------------------------------
Resolution: Fixed
Assignee: Ryan Blue
Fix Version/s: 1.9.0
Merged #368. Thanks for reviewing, [~julienledem]!
> Add an abstraction to get the length of a stream
> ------------------------------------------------
>
> Key: PARQUET-674
> URL: https://issues.apache.org/jira/browse/PARQUET-674
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Fix For: 1.9.0
>
>
> PARQUET-400 introduces {{SeekableInputStream}} to wrap Hadoop v1 and v2
> streams and provide ByteBuffer access transparently. This can also be used as
> an abstraction to allow Parquet to work without the Hadoop API. The missing
> component is an abstraction that knows the length of the file stream, which
> is needed to read the footer. This could be done by adding a {{getLength}} method to
> the new stream interface, but I think there is value in adding a higher-level
> abstraction that carries information about the file and can open streams for
> it. This abstraction could be passed to a PageReadStore, which could have
> more complicated logic including parallel streams to read column chunks.
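
The higher-level abstraction described above might look roughly like the following Java sketch. It is illustrative only and is not the code merged in #368: the names InputFile, SeekableStream, LocalInputFile, readFooterLength and writeFakeFile are assumptions made for this example, and SeekableStream is a simplified stand-in for {{SeekableInputStream}}. The sketch relies on one fact about the format: a Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes "PAR1", so the file length is needed to find the footer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class InputFileSketch {

  /** Hypothetical seekable stream; a simplified stand-in for SeekableInputStream. */
  interface SeekableStream extends AutoCloseable {
    void seek(long pos) throws IOException;
    void readFully(byte[] dst) throws IOException;
    @Override void close() throws IOException;
  }

  /** Hypothetical file-level abstraction: knows the length and can open streams. */
  interface InputFile {
    long getLength() throws IOException;
    SeekableStream newStream() throws IOException;
  }

  /** Local-file implementation with no Hadoop dependency. */
  static final class LocalInputFile implements InputFile {
    private final Path path;

    LocalInputFile(Path path) { this.path = path; }

    @Override public long getLength() throws IOException {
      return Files.size(path);
    }

    @Override public SeekableStream newStream() throws IOException {
      final RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r");
      return new SeekableStream() {
        @Override public void seek(long pos) throws IOException { raf.seek(pos); }
        @Override public void readFully(byte[] dst) throws IOException { raf.readFully(dst); }
        @Override public void close() throws IOException { raf.close(); }
      };
    }
  }

  /** Uses the file length to read the 8-byte tail and return the footer length. */
  static int readFooterLength(InputFile file) throws IOException {
    long length = file.getLength();
    if (length < 8) {
      throw new IOException("File too short to contain a footer tail");
    }
    byte[] tail = new byte[8];
    try (SeekableStream in = file.newStream()) {
      in.seek(length - 8);
      in.readFully(tail);
    }
    String magic = new String(tail, 4, 4, StandardCharsets.US_ASCII);
    if (!"PAR1".equals(magic)) {
      throw new IOException("Bad magic at end of file: " + magic);
    }
    return ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }

  /** Writes a fake file: "PAR1", zero-filled dummy footer bytes, footer length, "PAR1". */
  static void writeFakeFile(Path path, int footerLength) throws IOException {
    byte[] magic = "PAR1".getBytes(StandardCharsets.US_ASCII);
    ByteBuffer buf = ByteBuffer.allocate(4 + footerLength + 4 + 4);
    buf.put(magic);
    buf.position(4 + footerLength); // skip over the dummy footer bytes
    buf.order(ByteOrder.LITTLE_ENDIAN).putInt(footerLength);
    buf.put(magic);
    Files.write(path, buf.array());
  }

  public static void main(String[] args) throws Exception {
    Path tmp = Files.createTempFile("input-file-sketch", ".bin");
    try {
      writeFakeFile(tmp, 10);
      InputFile file = new LocalInputFile(tmp);
      System.out.println("length = " + file.getLength());
      System.out.println("footer length = " + readFooterLength(file));
    } finally {
      Files.delete(tmp);
    }
  }
}
```

Keeping getLength on the file-level object rather than on the stream means a reader can open several independent streams from one InputFile, which is the property the description wants for reading column chunks in parallel.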