[
https://issues.apache.org/jira/browse/PARQUET-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903224#comment-16903224
]
Zherui Cao commented on PARQUET-1636:
-------------------------------------
The raw file position changes in columnReader::NextPage() because of the
call to BufferedInputStream->Peek().
Do you think it would work to change the type of BufferedInputStream::raw_() to
RandomAccessFile, so that it can use ReadAt?
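
The idea behind this suggestion can be sketched with a toy model. This is not
Arrow or Parquet code: ToyRandomAccessFile and ToyPositionalReader are
hypothetical names made up for illustration, and real buffering is left out.
The sketch only assumes that a positional read (ReadAt) leaves the file's
sequential cursor alone, so a reader that tracks its own offset can peek
without moving the raw file position.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a random-access file. It has a sequential cursor
// (Tell/Read) and a positional read (ReadAt) that does not touch the cursor.
// Offsets and sizes are assumed to be non-negative.
class ToyRandomAccessFile {
 public:
  explicit ToyRandomAccessFile(std::string data) : data_(std::move(data)) {}

  int64_t Tell() const { return pos_; }

  // Sequential read: advances the cursor by the number of bytes returned.
  std::string Read(int64_t nbytes) {
    std::string out = ReadAt(pos_, nbytes);
    pos_ += static_cast<int64_t>(out.size());
    return out;
  }

  // Positional read: returns up to nbytes starting at offset, cursor unchanged.
  std::string ReadAt(int64_t offset, int64_t nbytes) const {
    if (offset >= static_cast<int64_t>(data_.size())) return "";
    return data_.substr(static_cast<size_t>(offset),
                        static_cast<size_t>(nbytes));
  }

 private:
  std::string data_;
  int64_t pos_ = 0;
};

// Hypothetical reader that keeps its own logical offset and only ever calls
// ReadAt on the raw file (buffering omitted to keep the sketch short).
class ToyPositionalReader {
 public:
  ToyPositionalReader(const ToyRandomAccessFile* raw, int64_t start)
      : raw_(raw), offset_(start) {}

  // Peek returns the upcoming bytes without moving any position.
  std::string Peek(int64_t nbytes) const {
    return raw_->ReadAt(offset_, nbytes);
  }

  // Read consumes bytes by advancing only the reader's own offset.
  std::string Read(int64_t nbytes) {
    std::string out = raw_->ReadAt(offset_, nbytes);
    offset_ += static_cast<int64_t>(out.size());
    return out;
  }

  int64_t Tell() const { return offset_; }

 private:
  const ToyRandomAccessFile* raw_;
  int64_t offset_;
};
```

In this model the raw file's Tell() is the same before and after a Peek() or
Read() on the reader, because only the reader's own offset moves.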
> [C++] Incompatibility due to moving from Parquet to Arrow IO interfaces
> -----------------------------------------------------------------------
>
> Key: PARQUET-1636
> URL: https://issues.apache.org/jira/browse/PARQUET-1636
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Deepak Majeti
> Assignee: Wes McKinney
> Priority: Major
>
> We moved to the Arrow IO interfaces as part of
> https://issues.apache.org/jira/browse/PARQUET-1422
> However, the BufferedInputStream implementations in Parquet and Arrow
> differ.
> Parquet's BufferedInputStream used to take a RandomAccessSource. Arrow's
> implementation takes an InputStream. As a result, the
> {{::arrow::io::BufferedInputStream::Peek()}} implementation (which invokes
> Read()) causes the raw source (the input to {{BufferedInputStream}}) to
> change its offset on Peek(). This did not happen in Parquet's
> BufferedInputStream implementation.
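
The effect described in the issue can be reproduced with a toy model. This is
not the Arrow implementation: ToyInputStream and ToyBufferedStream are
hypothetical names used only for illustration. The sketch assumes a buffered
wrapper whose Peek() fills its buffer through the raw stream's sequential
Read(), which is enough to move the raw offset even though the wrapper's own
logical position stays put.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical sequential-only stream: the only way to get bytes is Read(),
// which always advances the stream's offset.
class ToyInputStream {
 public:
  explicit ToyInputStream(std::string data) : data_(std::move(data)) {}

  int64_t Tell() const { return static_cast<int64_t>(pos_); }

  // Returns up to nbytes (assumed non-negative) and advances the offset.
  std::string Read(int64_t nbytes) {
    std::string out = data_.substr(pos_, static_cast<size_t>(nbytes));
    pos_ += out.size();
    return out;
  }

 private:
  std::string data_;
  size_t pos_ = 0;
};

// Hypothetical buffered wrapper over a sequential stream. Peek() has to pull
// bytes from the raw stream via Read(), so the raw offset moves.
class ToyBufferedStream {
 public:
  explicit ToyBufferedStream(ToyInputStream* raw) : raw_(raw) {}

  // Returns the next nbytes without consuming them from this wrapper.
  std::string Peek(int64_t nbytes) {
    Fill(nbytes);
    return buffer_.substr(0, static_cast<size_t>(nbytes));
  }

  // Consumes nbytes from the wrapper and advances its logical position.
  std::string Read(int64_t nbytes) {
    Fill(nbytes);
    std::string out = buffer_.substr(0, static_cast<size_t>(nbytes));
    buffer_.erase(0, out.size());
    logical_pos_ += static_cast<int64_t>(out.size());
    return out;
  }

  int64_t Tell() const { return logical_pos_; }

 private:
  // Tops up the buffer from the raw stream until it holds nbytes (or EOF).
  void Fill(int64_t nbytes) {
    const size_t want = static_cast<size_t>(nbytes);
    if (buffer_.size() < want) {
      buffer_ += raw_->Read(static_cast<int64_t>(want - buffer_.size()));
    }
  }

  ToyInputStream* raw_;
  std::string buffer_;
  int64_t logical_pos_ = 0;
};
```

In this model, after a Peek(4) the wrapper still reports position 0 while the
raw stream reports position 4, which is the kind of offset change on the raw
source that the issue describes.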