[
https://issues.apache.org/jira/browse/PARQUET-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712575#comment-17712575
]
Atul Mohan commented on PARQUET-2276:
-------------------------------------
Our community is working on moving to Hadoop 3.3.x, but this is challenging
because several of our users are still on 2.8.x and use Hadoop MapReduce
tasks to ingest Parquet data.
Is there any Hadoop 2.x version that works with Parquet 1.13.0? If not, it
would be important to call out in your docs that support for Hadoop 2.x has
been dropped.
> ParquetReader reads do not work with Hadoop version 2.8.5
> ---------------------------------------------------------
>
> Key: PARQUET-2276
> URL: https://issues.apache.org/jira/browse/PARQUET-2276
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.13.0
> Reporter: Atul Mohan
> Priority: Major
>
> {{ParquetReader.read()}} fails with the following exception on parquet-mr
> version 1.13.0 when using Hadoop version 2.8.5:
> {code:java}
> java.lang.NoSuchMethodError: 'boolean org.apache.hadoop.fs.FSDataInputStream.hasCapability(java.lang.String)'
> at org.apache.parquet.hadoop.util.HadoopStreams.isWrappedStreamByteBufferReadable(HadoopStreams.java:74)
> at org.apache.parquet.hadoop.util.HadoopStreams.wrap(HadoopStreams.java:49)
> at org.apache.parquet.hadoop.util.HadoopInputFile.newStream(HadoopInputFile.java:69)
> at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:787)
> at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:657)
> at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:162)
> at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
> {code}
>
> From an initial investigation, it looks like HadoopStreams has started using
> [FSDataInputStream.hasCapability|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java#L74]
> but _FSDataInputStream_ does not have the _hasCapability_ API in
> [hadoop 2.8.x|https://hadoop.apache.org/docs/r2.8.3/api/org/apache/hadoop/fs/FSDataInputStream.html].
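
The failure mode described above (a direct call to a method that is missing from the class on the runtime classpath) can be illustrated with a minimal sketch. It does not use Hadoop and is not meant to represent how parquet-mr handles this: OldStream, NewStream, CapabilityProbe and the capability string "example:capability" are all hypothetical stand-ins. The idea is that looking the method up reflectively turns a missing method into an ordinary false result instead of a NoSuchMethodError at link time.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for a stream class from an older library version
// that does not declare hasCapability(String).
class OldStream {
}

// Hypothetical stand-in for a newer stream class that declares the method.
class NewStream {
    public boolean hasCapability(String capability) {
        return "example:capability".equals(capability);
    }
}

class CapabilityProbe {
    // Returns the stream's own answer when hasCapability(String) exists on
    // its runtime class, and false when the method is absent.
    static boolean hasCapability(Object stream, String capability) {
        try {
            Method m = stream.getClass().getMethod("hasCapability", String.class);
            return (Boolean) m.invoke(stream, capability);
        } catch (NoSuchMethodException e) {
            // Older class without the method: treat the capability as unsupported.
            return false;
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasCapability(new OldStream(), "example:capability")); // false
        System.out.println(hasCapability(new NewStream(), "example:capability")); // true
    }
}
```

Whether a guard like this would be acceptable in HadoopStreams is a separate question. The sketch only shows why the same code can work against one library version and fail against another.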