jonvex opened a new pull request, #10954: URL: https://github.com/apache/hudi/pull/10954
### Change Logs

Subtask of https://issues.apache.org/jira/browse/HUDI-7045

Extracted from https://github.com/apache/hudi/pull/10278

Spark creates its parquet readers per partition, but we want to create a reader for each file. This PR ports over the Spark parquet readers for each Spark version and removes the partition iterator.

To help verify the ported code, the javadoc for `readParquetFile` lists the Spark version it was ported from. You can use the following link and switch between tags to see the code for that Spark version:
https://github.com/apache/spark/blob/v2.4.8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala

### Impact

Subtask for schema evolution support in the new file group (fg) reader.

### Risk level (write none, low medium or high below)

low

### Documentation Update

N/A

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
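For reviewers, the difference between a per-partition reader and a per-file reader described under Change Logs can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `PerFileReaderSketch`, `readFile`, `readPartition`, and the in-memory `FILES` map are illustration-only names and are not the Spark or Hudi API. The map simply stands in for parquet files on storage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

class PerFileReaderSketch {
    // Hypothetical stand-in for parquet files: file path -> rows in that file.
    static final Map<String, List<String>> FILES = new HashMap<>();
    static {
        FILES.put("f1.parquet", Arrays.asList("a", "b"));
        FILES.put("f2.parquet", Arrays.asList("c"));
    }

    // Per-file shape: one file in, only that file's rows out.
    static Iterator<String> readFile(String path) {
        return FILES.get(path).iterator();
    }

    // Per-partition shape (assumed here to resemble a partition iterator):
    // the caller hands over every file in the partition and gets one
    // concatenated stream back, with no control over individual files.
    static Iterator<String> readPartition(List<String> paths) {
        final Iterator<String> files = paths.iterator();
        return new Iterator<String>() {
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next file once the current one is exhausted.
                while (!current.hasNext() && files.hasNext()) {
                    current = readFile(files.next());
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Helper that collects an iterator into a list for printing.
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        it.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        // Per-file: the caller decides which file to read.
        System.out.println(drain(readFile("f2.parquet"))); // prints [c]
        // Per-partition: all files are read as one stream.
        System.out.println(drain(readPartition(Arrays.asList("f1.parquet", "f2.parquet")))); // prints [a, b, c]
    }
}
```

With the per-file shape, a caller that needs to handle each file differently (for example, applying a file-specific schema) can do so before combining results. That is the kind of flexibility a per-file reader allows, under the assumptions stated above.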
