andygrove opened a new pull request, #3831: URL: https://github.com/apache/datafusion-comet/pull/3831
## Which issue does this PR close?

Closes #3312, closes #3313, closes #3314, closes #3315, closes #3320, closes #3401.

## Rationale for this change

Several tests in the 3.5.8 Spark SQL test diff were tagged with `IgnoreCometNativeDataFusion` but actually pass when run with `COMET_PARQUET_SCAN_IMPL=native_datafusion`. These tags were only present in the 3.5.8 diff and not in the 3.4.3 or 4.0.1 diffs, suggesting they were unnecessarily restrictive.

## What changes are included in this PR?

**Removed `IgnoreCometNativeDataFusion` from tests that pass** (verified with `COMET_PARQUET_SCAN_IMPL=native_datafusion`):

- `ColumnExpressionSuite`: `input_file_name, input_file_block_start, input_file_block_length - FileScanRDD` (#3312)
- `UDFSuite`: `SPARK-8005 input_file_name` (#3312)
- `HiveUDFSuite`: `SPARK-11522 select input_file_name from non-parquet table` (#3312)
- `ExplainSuite`: `explain formatted - check presence of subquery in case of DPP` (#3313)
- `SQLViewSuite`: `alter temporary view should follow current storeAnalyzedPlanForView config` (#3314)
- `FileDataSourceV2FallBackSuite`: `Fallback Parquet V2 to V1` (#3315)
- `StreamingQuerySuite`: `SPARK-41198` and `SPARK-41199` (#3315)
- `ParquetFilterSuite`: `SPARK-31026` and `Filters should be pushed down for Parquet readers at row group level` (#3320)
- `StreamingSelfUnionSuite`: 2 self-union DSv1 tests (#3401)

**Fixed `ExtractPythonUDFsSuite`** to match `CometNativeScanExec` in plan node pattern matches, allowing the Python UDF column pruning/filter pushdown test to pass with native DataFusion scan (#3312).

**Updated `DynamicPartitionPruningSuite`** issue reference from #3313 to #3442 for consistency with the 3.4.3 and 4.0.1 diffs.

**Kept `IgnoreCometNativeDataFusion`** on bucketed read/scan tests (#3319) as they require `getFileScan`/`getBucketScan` helper updates to support `CometNativeScanExec`.

## How are these changes tested?

Each test was run individually with `ENABLE_COMET=true ENABLE_COMET_ONHEAP=true COMET_PARQUET_SCAN_IMPL=native_datafusion` against Apache Spark 3.5.8 with the updated diff applied. All tests listed as removed passed successfully.
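
As a rough illustration of running one test at a time with those environment variables, the sketch below prints a command line per test instead of executing it. The three environment variables come from the description above; the `comet_test_cmd` helper is hypothetical, and the `build/sbt "<project>/testOnly ... -- -z <name>"` form is an assumption based on the usual Spark sbt workflow, not the exact command used for this PR. The `sql` project name in the example is likewise an assumption.

```shell
# Environment variables quoted in the PR description.
COMET_ENV="ENABLE_COMET=true ENABLE_COMET_ONHEAP=true COMET_PARQUET_SCAN_IMPL=native_datafusion"

# Hypothetical helper: print the command for a single test (dry run only).
#   $1 = sbt project (assumed, e.g. "sql")
#   $2 = suite name, matched with a leading wildcard
#   $3 = test-name substring passed to ScalaTest's -z filter
comet_test_cmd() {
  local project="$1" suite="$2" name="$3"
  printf "%s build/sbt '%s/testOnly *%s -- -z \"%s\"'\n" \
    "$COMET_ENV" "$project" "$suite" "$name"
}

# Example: one of the tests listed above.
comet_test_cmd sql UDFSuite "SPARK-8005 input_file_name"
```

Printing the command rather than running it keeps the sketch safe to try outside a Spark checkout; from a Spark 3.5.8 tree with the updated diff applied, the printed line could be pasted into a shell.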
