andygrove opened a new pull request, #3687:
URL: https://github.com/apache/datafusion-comet/pull/3687

   ## Which issue does this PR close?
   
   Related to #3682.
   
   ## Rationale for this change
   
   When `spark.sql.caseSensitive=false` (the default) and a Parquet schema 
contains field names that collide after lowercasing (e.g., `Name` and `name`), 
DataFusion produces different error messages than Spark. This causes the 
`SPARK-25207: exception when duplicate fields in case-insensitive mode` 
spark-sql test to fail when using `native_datafusion`.
   
   ## What changes are included in this PR?
   
   Adds a guard in `nativeDataFusionScan()` that detects field names in the 
required schema that become duplicates after lowercasing, when case-insensitive 
analysis is enabled. When duplicates are found, the scan falls back instead of 
using `native_datafusion`, so the query does not surface DataFusion's differing 
error message.
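
The duplicate check described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the code in this PR; the function names `find_case_insensitive_duplicates` and `should_fall_back` are hypothetical, and the schema is assumed to be a flat list of field names.

```python
from collections import defaultdict


def find_case_insensitive_duplicates(field_names):
    """Return {lowercased name: [original names]} for names that collide."""
    groups = defaultdict(list)
    for name in field_names:
        # Group the original spellings under their lowercased form.
        groups[name.lower()].append(name)
    # Keep only the groups where more than one spelling maps to the same key.
    return {key: names for key, names in groups.items() if len(names) > 1}


def should_fall_back(field_names, case_sensitive):
    """Hypothetical guard: fall back only in case-insensitive mode with collisions."""
    if case_sensitive:
        # In case-sensitive mode `Name` and `name` are distinct fields.
        return False
    return bool(find_case_insensitive_duplicates(field_names))
```

Under these assumptions, a schema with `Name` and `name` triggers the fallback only when case sensitivity is off, which mirrors the condition the guard is described as checking.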
   
   ## How are these changes tested?
   
   Covered by the existing `SPARK-25207` test in the spark-sql test suite, 
which verifies the correct error behavior for duplicate fields in 
case-insensitive mode.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

