[
https://issues.apache.org/jira/browse/PARQUET-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated PARQUET-2151:
------------------------------------
Summary: Drop Hadoop 1 input stream support from parquet-hadoop (was:
parquet-hadoop to drop Hadoop 1 input stream support)
> Drop Hadoop 1 input stream support from parquet-hadoop
> -------------------------------------------------------
>
> Key: PARQUET-2151
> URL: https://issues.apache.org/jira/browse/PARQUET-2151
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.13.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Parquet uses reflection to load a Hadoop 2 input stream, falling back to a
> Hadoop 1 compatible client if it is not found.
> All Hadoop 2.0.2+ releases work with H2SeekableInputStream, so
> H1SeekableInputStream can be cut and the binding to H2SeekableInputStream
> reworked to avoid needing reflection. This would make it a lot easier to
> probe for and use ByteBuffer input, and line the code up for more recent
> Hadoop releases.
> One thing H1SeekableInputStream does do is read into a temp array if the
> FSDataInputStream doesn't support ByteBuffer reads, that is, doesn't
> implement ByteBufferReadable.
> But FSDataInputStream simply forwards that call to the inner stream, if it
> too implements ByteBufferReadable. Filesystems whose streams don't (the
> cloud stores) can't be read through H2SeekableInputStream.read(ByteBuffer).
> If this is desired, H2SeekableInputStream will need to dynamically downgrade
> to DelegatingSeekableInputStream's base methods if a call to
> FSDataInputStream.read(ByteBuffer) fails.
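
A minimal sketch of the dynamic downgrade described above, written without any
Hadoop dependency so it can run standalone. All names here are hypothetical
stand-ins, not Parquet or Hadoop classes: BufferReadable plays the role of
ByteBufferReadable, ForwardingStream imitates a wrapper that forwards
read(ByteBuffer) to its inner stream, and DowngradingReader shows the fallback
to a temp array. It assumes the failed call surfaces as an
UnsupportedOperationException and consumes nothing from the buffer before
failing.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ByteBufferReadFallback {

  /** Hypothetical stand-in for a "can read into a ByteBuffer" contract. */
  interface BufferReadable {
    int read(ByteBuffer buf) throws IOException;
  }

  /**
   * Hypothetical wrapper: always exposes read(ByteBuffer), but can only
   * serve it when the wrapped stream supports it too.
   */
  static class ForwardingStream extends FilterInputStream implements BufferReadable {
    ForwardingStream(InputStream inner) {
      super(inner);
    }

    @Override
    public int read(ByteBuffer buf) throws IOException {
      if (in instanceof BufferReadable) {
        return ((BufferReadable) in).read(buf);
      }
      // Assumption: lack of support is reported with this exception type.
      throw new UnsupportedOperationException("Byte-buffer read unsupported");
    }
  }

  /** Tries the ByteBuffer path first; downgrades for good after one failure. */
  static class DowngradingReader {
    private final ForwardingStream stream;
    private final byte[] temp = new byte[8192];
    private boolean bufferReads = true;

    DowngradingReader(ForwardingStream stream) {
      this.stream = stream;
    }

    boolean usingBufferReads() {
      return bufferReads;
    }

    int read(ByteBuffer buf) throws IOException {
      if (bufferReads) {
        try {
          return stream.read(buf);
        } catch (UnsupportedOperationException e) {
          // Remember the failure so later reads skip the probe entirely.
          bufferReads = false;
        }
      }
      return readViaTempArray(buf);
    }

    /** Fallback path: read into a heap array, then copy into the buffer. */
    private int readViaTempArray(ByteBuffer buf) throws IOException {
      if (!buf.hasRemaining()) {
        return 0;
      }
      int n = stream.read(temp, 0, Math.min(temp.length, buf.remaining()));
      if (n > 0) {
        buf.put(temp, 0, n);
      }
      return n;
    }
  }
}
```

Caching the outcome in a flag means the exception cost is paid at most once per
stream. Whether a real implementation should key on that exception type, or
probe the wrapped stream up front instead, is left open here.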
--
This message was sent by Atlassian Jira
(v8.20.7#820007)