yihua opened a new pull request, #10020: URL: https://github.com/apache/hudi/pull/10020
### Change Logs

This PR adds support for reading file groups that contain only log files in the file group reader-based Spark parquet file format (`HoodieFileGroupReaderBasedParquetFileFormat`).

- In `HoodieFileGroupReaderBasedParquetFileFormat#buildReaderWithPartitionValues`, the record iterator from the new file group reader is returned when a file group contains only log files.
- Fixes the log file iterator in `SparkFileFormatInternalRowReaderContext` so that it properly projects the data based on the required / reader schema.
- Adds new tests for reading log files only in `TestHoodieFileGroupReaderBase`.

### Impact

As above, improves functionality.

### Risk level

low

### Documentation Update

N/A

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
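
For readers unfamiliar with the log-only case, the behavior described in the change log can be modeled roughly as below. This is an illustrative Python sketch, not Hudi code: `FileGroup`, `project`, and `read_file_group` are hypothetical names, records are plain dictionaries, and the "latest log record wins" merge is a simplifying assumption that ignores Hudi's actual record merging, ordering fields, and deletes. It only shows that a file group with no base file can still be read from its log files, and that every returned record is projected onto the required (reader) schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional

Record = Dict[str, object]


@dataclass
class FileGroup:
    """Hypothetical stand-in for a file group: an optional base file plus log files."""
    base_records: Optional[List[Record]] = None  # None models "no base file"
    log_records: List[List[Record]] = field(default_factory=list)  # oldest log first


def project(record: Record, required_fields: List[str]) -> Record:
    """Keep only the fields named in the required schema, in schema order."""
    return {name: record.get(name) for name in required_fields}


def read_file_group(
    group: FileGroup, key_field: str, required_fields: List[str]
) -> Iterator[Record]:
    """Merge base and log records by key, then project each merged record.

    Assumption: a later record for the same key simply replaces the earlier one.
    """
    merged: Dict[object, Record] = {}
    for rec in group.base_records or []:  # empty when the group is log-only
        merged[rec[key_field]] = rec
    for log in group.log_records:
        for rec in log:
            merged[rec[key_field]] = rec
    for rec in merged.values():
        yield project(rec, required_fields)


# A log-only file group: no base file, two log files.
log_only = FileGroup(
    base_records=None,
    log_records=[
        [{"id": 1, "name": "a", "ts": 10}],
        [{"id": 1, "name": "b", "ts": 20}, {"id": 2, "name": "c", "ts": 20}],
    ],
)
rows = list(read_file_group(log_only, key_field="id", required_fields=["id", "name"]))
print(rows)  # [{'id': 1, 'name': 'b'}, {'id': 2, 'name': 'c'}]
```

In this model the `ts` column is dropped because it is not part of the required schema, which mirrors the projection fix mentioned for the log file iterator.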
