mbutrovich opened a new issue, #3481:
URL: https://github.com/apache/datafusion-comet/issues/3481

   ### What is the problem the feature request solves?
   
   While working on #3446 I tested implementing a DataSource V2 compatible 
`native_datafusion` scan. I got tests passing, but then realized that Spark's 
DataSource V2 Parquet scan has fewer features than V1, such as [not supporting 
DPP](https://github.com/apache/spark/pull/52180). Maybe Spark implemented the 
V2 Parquet reader to dogfood the V2 Data Source API without external 
dependencies to test an API that was really created for things like Iceberg, 
Delta, etc. 
   
   However, I think newer catalog implementations might return V2 DataSource API 
Parquet table references, so we should probably still support it.
   
   ### Describe the potential solution
   
   Implement a `CometNativeBatchScanExec` operator that converts Spark's 
`BatchScanExec` (with `ParquetScan`) into a native scan.
   
   This should hopefully serialize down to the same proto as 
`CometNativeScanExec` and be handled transparently on the native side in 
`planner.rs`.
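
   To illustrate the "same proto" idea, here is a minimal, self-contained sketch. Every type and function in it (`SparkScan`, `NativeScanProto`, `to_native_scan`, `plan_native`) is a hypothetical stand-in invented for this example, not Comet's or Spark's actual definitions. It only models the intent: the V1 and V2 Parquet scans serialize to one shared message, so the native planner would not need a V2-specific code path.

```rust
// Hypothetical stand-ins; none of these are Comet's or Spark's real types.

/// Simplified picture of the Spark-side scan operators discussed above.
#[derive(Debug, Clone)]
enum SparkScan {
    /// V1 path: the kind of scan `CometNativeScanExec` covers today.
    V1FileSource { paths: Vec<String>, columns: Vec<String> },
    /// V2 path: a `BatchScanExec` whose scan is a Parquet scan.
    V2BatchParquet { paths: Vec<String>, columns: Vec<String> },
    /// V2 path with a non-Parquet scan, outside the scope of this proposal.
    V2BatchOther,
}

/// Stand-in for the serialized scan message shared by both paths.
#[derive(Debug, Clone, PartialEq)]
struct NativeScanProto {
    paths: Vec<String>,
    columns: Vec<String>,
}

/// Serialize either supported scan into the same message. Returns `None`
/// when the scan is unsupported and should be left for Spark to execute.
fn to_native_scan(scan: &SparkScan) -> Option<NativeScanProto> {
    match scan {
        SparkScan::V1FileSource { paths, columns }
        | SparkScan::V2BatchParquet { paths, columns } => Some(NativeScanProto {
            paths: paths.clone(),
            columns: columns.clone(),
        }),
        SparkScan::V2BatchOther => None,
    }
}

/// Stand-in for the native planner. It sees only the message, so it cannot
/// tell (and does not need to know) which Spark API produced it.
fn plan_native(proto: &NativeScanProto) -> String {
    format!(
        "ParquetScan(files={}, columns={})",
        proto.paths.len(),
        proto.columns.len()
    )
}
```

   In this model, calling `to_native_scan` on a `V1FileSource` and on a `V2BatchParquet` with the same paths and columns yields equal messages, and `plan_native` produces the same plan for both. Whether the real V2 operator can reuse the existing proto unchanged is an assumption of the sketch.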
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

