[
https://issues.apache.org/jira/browse/PARQUET-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633758#comment-17633758
]
ASF GitHub Bot commented on PARQUET-2213:
-----------------------------------------
steveloughran commented on code in PR #1010:
URL: https://github.com/apache/parquet-mr/pull/1010#discussion_r1021472465
##########
parquet-common/src/main/java/org/apache/parquet/io/InputFile.java:
##########
@@ -41,4 +41,16 @@ public interface InputFile {
*/
SeekableInputStream newStream() throws IOException;
+ /**
+ * Open a new {@link SeekableInputStream} for the underlying data file,
+ * in the range of '[offset, offset + length)'
+ *
+ * @param offset the offset in the file to read from
+ * @param length the total number of bytes to read
+ * @return a new {@link SeekableInputStream} to read the file
+ * @throws IOException if the stream cannot be opened
+ */
+ default SeekableInputStream newStream(long offset, long length) throws IOException {
Review Comment:
This is what the vectored IO API of Hadoop 3.3.5 is for; we intend to provide
that API in a compatibility library for apps running against Hadoop 3.2.x+.
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md#default-void-readvectoredlist-extends-filerange-ranges-intfunctionbytebuffer-allocate
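For illustration only, here is a minimal sketch of the half-open range semantics '[offset, offset + length)' described in the Javadoc above, written against plain JDK NIO. The class `RangeReadSketch` and the method `readRange` are hypothetical names invented for this example; they are not part of parquet-mr or Hadoop, and this is not the default implementation proposed in the PR.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class RangeReadSketch {

  /**
   * Hypothetical helper: reads exactly the bytes in [offset, offset + length)
   * from the given file, failing if the range runs past the end of the file.
   */
  public static byte[] readRange(Path file, long offset, int length) throws IOException {
    if (offset < 0 || length < 0) {
      throw new IllegalArgumentException("offset and length must be non-negative");
    }
    ByteBuffer buf = ByteBuffer.allocate(length);
    try (SeekableByteChannel ch = Files.newByteChannel(file, StandardOpenOption.READ)) {
      // Seek to the start of the range, then read until the buffer is full.
      ch.position(offset);
      while (buf.hasRemaining()) {
        if (ch.read(buf) < 0) {
          throw new EOFException("range extends past end of file");
        }
      }
    }
    return buf.array();
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("range", ".bin");
    try {
      Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
      // offset 3, length 4 covers byte positions 3, 4, 5 and 6; position 7 is excluded.
      System.out.println(Arrays.toString(readRange(tmp, 3, 4))); // prints [3, 4, 5, 6]
    } finally {
      Files.delete(tmp);
    }
  }
}
```

A caller that knows up front which byte range it needs can pass that range to the storage layer, which is the kind of hint a ranged or vectored read API can exploit.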
> Add an alternative InputFile.newStream that allow an input range
> ----------------------------------------------------------------
>
> Key: PARQUET-2213
> URL: https://issues.apache.org/jira/browse/PARQUET-2213
> Project: Parquet
> Issue Type: Improvement
> Reporter: Chao Sun
> Priority: Minor
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)