mbutrovich opened a new issue, #3481: URL: https://github.com/apache/datafusion-comet/issues/3481
### What is the problem the feature request solves?

While working on #3446, I tested implementing a DataSource V2 compatible `native_datafusion` scan. I got tests passing, but then realized that Spark's DataSource V2 Parquet scan has fewer features than V1, such as [not supporting DPP](https://github.com/apache/spark/pull/52180). Maybe Spark implemented the V2 Parquet reader to dogfood the V2 Data Source API without external dependencies, as a way to test an API that was really created for things like Iceberg, Delta, etc. However, I think newer catalog implementations might return V2 DataSource API Parquet table references, so we should probably still support it.

### Describe the potential solution

Implement a `CometNativeBatchScanExec` operator that converts Spark `BatchScanExec` (with `ParquetScan`). This should hopefully serialize down to the same proto as `CometNativeScanExec` and be handled transparently on the native side in `planner.rs`.

### Additional context

_No response_
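
To illustrate the proposal, here is a minimal toy sketch of the idea that a V1 scan and a V2 `BatchScanExec` wrapping a Parquet scan could lower to the same native scan message. Every type and function here (`SparkScan`, `V2Scan`, `NativeScanProto`, `to_native_scan`) is a hypothetical stand-in invented for illustration; none of it is Comet's or Spark's actual API, and the real conversion would likely live on the JVM side rather than in Rust.

```rust
// Toy model only: all names below are hypothetical stand-ins,
// not Comet or Spark APIs.

/// Which kind of scan a V2 `BatchScanExec` wraps (assumed shape).
#[derive(Debug, Clone, PartialEq)]
enum V2Scan {
    Parquet,
    Other(String),
}

/// The two Spark physical scan shapes discussed in the issue.
#[derive(Debug, Clone, PartialEq)]
enum SparkScan {
    /// V1 file source scan (assumed to be the existing native scan path).
    FileSourceV1 { paths: Vec<String>, columns: Vec<String> },
    /// V2 batch scan wrapping some scan implementation.
    BatchV2 { scan: V2Scan, paths: Vec<String>, columns: Vec<String> },
}

/// Stand-in for the serialized native scan message.
#[derive(Debug, Clone, PartialEq)]
struct NativeScanProto {
    paths: Vec<String>,
    columns: Vec<String>,
}

/// Lower a Spark scan to the shared native scan message.
/// Returns `None` when the scan is not a Parquet scan, which models
/// falling back to Spark for unsupported V2 sources.
fn to_native_scan(plan: &SparkScan) -> Option<NativeScanProto> {
    match plan {
        SparkScan::FileSourceV1 { paths, columns } => Some(NativeScanProto {
            paths: paths.clone(),
            columns: columns.clone(),
        }),
        SparkScan::BatchV2 { scan: V2Scan::Parquet, paths, columns } => Some(NativeScanProto {
            paths: paths.clone(),
            columns: columns.clone(),
        }),
        SparkScan::BatchV2 { .. } => None,
    }
}
```

In this sketch, a `FileSourceV1` scan and a `BatchV2` scan with `V2Scan::Parquet` over the same paths and columns produce equal `NativeScanProto` values, so a native planner consuming that message would not need to know which Spark API produced it. Any other `BatchV2` scan yields `None`.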
