[I] No need to read full RowGroup when using RowRanges [parquet-java]

via GitHub Fri, 30 May 2025 01:35:01 -0700


olivbrau opened a new issue, #3231:
URL: https://github.com/apache/parquet-java/issues/3231


   ### Describe the enhancement requested
   
   I'm using ParquetFileReader and would like to read the first few rows of a 
parquet file to show a preview to the user.
   I'm doing this:
   PageReadStore rowGroup = reader.readFilteredRowGroup(0, RowRanges.createSingle(10));
   and then:
   ColumnReadStore colReadStore = new ColumnReadStoreImpl(rowGroup,
           new GroupRecordConverter(schema).getRootConverter(),
           schemaLecture, ...);
   etc.
   
   The problem is that calling readFilteredRowGroup() reads the full row 
group into memory. In my case this is slow, because the parquet file is on a 
network drive and has quite large row groups. It also consumes RAM 
needlessly, since only the first 10 rows are used.
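
To illustrate the behaviour being asked for, here is a minimal, self-contained sketch. It is not parquet-java's implementation and uses no parquet-java classes. It assumes each page's first row index is known (the Parquet page index records this per page) and picks only the pages that overlap the requested row range. The class name, method name, and page boundaries are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: selects the pages that overlap a row range,
// given the first row index of each page within a row group.
class PageSelectionSketch {

    // Returns the indices of pages containing at least one row in [fromRow, toRow].
    static List<Integer> pagesOverlapping(long[] firstRowIndex, long rowGroupRowCount,
                                          long fromRow, long toRow) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < firstRowIndex.length; i++) {
            long pageFirst = firstRowIndex[i];
            // A page ends just before the next page starts; the last page ends
            // at the end of the row group.
            long nextStart = (i + 1 < firstRowIndex.length) ? firstRowIndex[i + 1] : rowGroupRowCount;
            long pageLast = nextStart - 1;
            if (pageFirst <= toRow && pageLast >= fromRow) {
                selected.add(i);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up layout: 4 pages of 20,000 rows each in an 80,000-row row group.
        long[] firstRowIndex = {0, 20_000, 40_000, 60_000};
        // Rows 0..9 (a 10-row preview) only touch the first page.
        System.out.println(pagesOverlapping(firstRowIndex, 80_000, 0, 9)); // prints [0]
    }
}
```

With a layout like this, a 10-row preview would ideally need only one page per column instead of the whole row group.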
   
   Is this an issue, or could it be improved?
   
   Thanks in advance.
   Olivier
   
   ### Component(s)
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

