[ 
https://issues.apache.org/jira/browse/PARQUET-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632431#comment-17632431
 ] 

ASF GitHub Bot commented on PARQUET-2213:
-----------------------------------------

wgtmac commented on code in PR #1010:
URL: https://github.com/apache/parquet-mr/pull/1010#discussion_r1020335511


##########
parquet-common/src/main/java/org/apache/parquet/io/InputFile.java:
##########
@@ -41,4 +41,16 @@ public interface InputFile {
    */
   SeekableInputStream newStream() throws IOException;
 
+  /**
+   * Open a new {@link SeekableInputStream} for the underlying data file,
+   * in the range of '[offset, offset + length)'
+   *
+   * @param offset the offset in the file to read from
+   * @param length the total number of bytes to read
+   * @return a new {@link SeekableInputStream} to read the file
+   * @throws IOException if the stream cannot be opened
+   */
+  default SeekableInputStream newStream(long offset, long length) throws IOException {

Review Comment:
   If we need to read multiple parts of a Parquet file (e.g. different row
groups, the page index, the footer), should we call this method once for each
individual part?
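
   To make the question concrete, here is a minimal sketch of what per-part
calls could look like. `RangedFile`, `Range`, `inMemory` and `readParts` are
hypothetical stand-ins written for this illustration; they are not part of
parquet-mr, and a plain `InputStream` is used in place of
`SeekableInputStream` so the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class RangedReadSketch {
  /** Hypothetical stand-in for InputFile; not the real parquet-mr interface. */
  interface RangedFile {
    InputStream newStream(long offset, long length) throws IOException;
  }

  /** A byte range [offset, offset + length). */
  static final class Range {
    final long offset;
    final long length;

    Range(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** In-memory implementation, used only for illustration. */
  static RangedFile inMemory(byte[] data) {
    return (offset, length) ->
        new ByteArrayInputStream(data, (int) offset, (int) length);
  }

  /** Opens one stream per range and reads that range fully. */
  static List<byte[]> readParts(RangedFile file, List<Range> ranges)
      throws IOException {
    List<byte[]> parts = new ArrayList<>();
    for (Range r : ranges) {
      // One newStream call per part: this is the pattern being asked about.
      try (InputStream in = file.newStream(r.offset, r.length)) {
        parts.add(in.readAllBytes());
      }
    }
    return parts;
  }
}
```

   With this shape, a reader that needs the footer plus two row groups would
open three streams. Whether that is acceptable, or whether a single call
taking several ranges would be preferable, is the open question.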





> Add an alternative InputFile.newStream that allow an input range
> ----------------------------------------------------------------
>
>                 Key: PARQUET-2213
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2213
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Chao Sun
>            Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
