sadikovi commented on a change in pull request #34149:
URL: https://github.com/apache/spark/pull/34149#discussion_r719062009
##########
File path:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java
##########
@@ -320,7 +331,7 @@ private void initializeInternal() throws IOException, UnsupportedOperationExcept
private void checkEndOfRowGroup() throws IOException {
if (rowsReturned != totalCountLoadedSoFar) return;
- PageReadStore pages = reader.readNextFilteredRowGroup();
+ PageReadStore pages = reader.readNextRowGroup();
Review comment:
Could you elaborate on this change? Does `readNextRowGroup()` handle filtered
row groups and column indexes, i.e. would it call `readNextFilteredRowGroup()`
if the column index is enabled?
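To make the question concrete, here is a minimal sketch of the delegation I have in mind. It uses hypothetical stand-in types (`FakeFileReader`, `RowGroupReader`, `DelegatingRowGroupReader`) with strings in place of `PageReadStore`, so it only illustrates the wrapper pattern and says nothing about how the real `ParquetFileReader` decides which row groups to return:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class RowGroupDelegationSketch {
  /** Hypothetical stand-in for the file reader; NOT the real ParquetFileReader. */
  static class FakeFileReader implements Closeable {
    // Row groups assumed to have already survived filtering.
    private final Deque<String> filtered;
    boolean closed = false;

    FakeFileReader(List<String> filteredRowGroups) {
      this.filtered = new ArrayDeque<>(filteredRowGroups);
    }

    /** Returns the next filtered row group, or null when none are left. */
    String readNextFilteredRowGroup() {
      return filtered.poll();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical narrow interface exposing a single read method. */
  interface RowGroupReader extends Closeable {
    String readNextRowGroup() throws IOException;
  }

  /** The wrapper keeps the generic name but forwards to the filtered read. */
  static class DelegatingRowGroupReader implements RowGroupReader {
    private final FakeFileReader reader;

    DelegatingRowGroupReader(FakeFileReader reader) {
      this.reader = reader;
    }

    @Override
    public String readNextRowGroup() {
      return reader.readNextFilteredRowGroup();
    }

    @Override
    public void close() {
      if (reader != null) {
        reader.close();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FakeFileReader file = new FakeFileReader(List.of("rg-0", "rg-2"));
    try (RowGroupReader rows = new DelegatingRowGroupReader(file)) {
      String rowGroup;
      // Loop until the wrapper signals exhaustion with null.
      while ((rowGroup = rows.readNextRowGroup()) != null) {
        System.out.println(rowGroup);
      }
    }
  }
}
```

If the new `readNextRowGroup()` forwards like this, the call site keeps the filtered behavior even though the method name no longer says so.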
##########
File path:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
##########
@@ -222,4 +243,32 @@ public void close() throws IOException {
throw new BadConfigurationException("could not instantiate read support class", e);
}
}
+
+ interface ParquetRowGroupReader extends Closeable {
+ /**
+ * Reads the next row group from this reader. Returns null if there is no more row group.
+ */
+ PageReadStore readNextRowGroup() throws IOException;
+ }
+
+ private static class ParquetRowGroupReaderImpl implements ParquetRowGroupReader {
+ private final ParquetFileReader reader;
+
+ ParquetRowGroupReaderImpl(ParquetFileReader reader) {
+ this.reader = reader;
+ }
+
+ @Override
+ public PageReadStore readNextRowGroup() throws IOException {
+ return reader.readNextFilteredRowGroup();
+ }
+
+ @Override
+ public void close() throws IOException {
+ if (reader != null) {
+ reader.close();
+ }
+ }
+ }
+
Review comment:
Nit: would it be possible to remove this trailing blank line? 😄
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]