kosiew opened a new issue, #19842:
URL: https://github.com/apache/datafusion/issues/19842

   
   ## Problem
   
   This is a **design discussion** issue regarding whether DataFusion should 
adopt **case-insensitive field matching** when [casting between 
structs.](https://github.com/apache/datafusion/pull/19674)
   
   ### Current Behavior
   DataFusion uses **case-sensitive** field name matching. For example:
   - Field `x` and field `X` are treated as different fields
   - A cast from `struct<x int, y int>` to `struct<X int, Y int>` would fail 
(no name overlap)
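
To make the current behavior concrete, here is a minimal sketch of case-sensitive name matching. It is not DataFusion's actual cast code; the function names `match_fields_case_sensitive` and `has_name_overlap` are hypothetical, and field lists are simplified to plain name slices.

```rust
/// For each target field name, find the index of the source field whose
/// name is exactly equal (case-sensitive comparison).
/// Hypothetical helper, not part of DataFusion.
fn match_fields_case_sensitive(source: &[&str], target: &[&str]) -> Vec<Option<usize>> {
    target
        .iter()
        .map(|t| source.iter().position(|s| s == t))
        .collect()
}

/// Assumed rule for this sketch: the cast is rejected when no target
/// field matched any source field by name.
fn has_name_overlap(matches: &[Option<usize>]) -> bool {
    matches.iter().any(|m| m.is_some())
}
```

Under this rule, matching source names `["x", "y"]` against target names `["X", "Y"]` yields no matches at all, which corresponds to the failing cast described above.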
   
   ### DuckDB Behavior
   DuckDB uses **case-insensitive** field name matching:
   - Field `x` and field `X` are treated as the same field
   - The same cast would succeed, with fields matched case-insensitively
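
For comparison, a case-insensitive matcher could look roughly like the sketch below. It assumes ASCII-only case folding via `str::eq_ignore_ascii_case`; DuckDB's exact folding rules are not described in this issue, and the function name `match_fields_case_insensitive` is hypothetical.

```rust
/// For each target field name, find the index of the first source field
/// whose name is equal when ASCII case is ignored.
/// Hypothetical helper; ASCII-only folding is an assumption of this sketch.
fn match_fields_case_insensitive(source: &[&str], target: &[&str]) -> Vec<Option<usize>> {
    target
        .iter()
        .map(|t| source.iter().position(|s| s.eq_ignore_ascii_case(t)))
        .collect()
}
```

With this matcher, source names `["x", "y"]` line up with target names `["X", "Y"]`, so the same cast would find a match for every field.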
   
   ## Motivation for Case-Insensitive Matching
   
   **Pros:**
   - ✅ **Aligns with DuckDB** — improves compatibility with a major SQL database
   - ✅ **More forgiving** — handles common casing variations (e.g., JSON 
sources with inconsistent field names)
   - ✅ **Follows SQL conventions** — SQL generally treats identifiers as 
case-insensitive when they are unquoted
   - ✅ **User-friendly** — reduces friction when working with data from 
different sources
   
   ## Arguments for Keeping Case-Sensitive Matching
   
   **Pros:**
   - ✅ **Arrow foundation** — DataFusion is built on Apache Arrow, which is 
case-sensitive
   - ✅ **Language consistency** — matches Rust and JSON conventions 
(case-sensitive)
   - ✅ **Prevents ambiguity** — avoids edge cases where source has both `x` and 
`X` (rare but possible)
   - ✅ **Predictable behavior** — case-sensitive matching is more explicit and 
easier to reason about in programmatic contexts
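
The ambiguity concern can be illustrated with a small sketch: when a source struct contains both `x` and `X`, a case-insensitive lookup finds more than one candidate. The `FieldMatch` enum and `find_case_insensitive` function are hypothetical names for illustration, and ASCII-only folding is assumed.

```rust
/// Outcome of looking up one target field name in the source field list.
/// Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
enum FieldMatch {
    NotFound,
    Unique(usize),
    Ambiguous(Vec<usize>),
}

/// Collect every source index whose name equals `target` ignoring ASCII
/// case, then classify the result by how many candidates were found.
fn find_case_insensitive(source: &[&str], target: &str) -> FieldMatch {
    let candidates: Vec<usize> = source
        .iter()
        .enumerate()
        .filter(|(_, s)| s.eq_ignore_ascii_case(target))
        .map(|(i, _)| i)
        .collect();
    match candidates.len() {
        0 => FieldMatch::NotFound,
        1 => FieldMatch::Unique(candidates[0]),
        _ => FieldMatch::Ambiguous(candidates),
    }
}
```

A case-insensitive design would need to decide what to do in the `Ambiguous` case (for example, return an error), whereas case-sensitive matching never produces it as long as the exact names are distinct.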
   
   ## Question
   
   **Should DataFusion follow SQL's case-insensitivity or remain aligned with 
Arrow's case-sensitive semantics?**
   
   
   ## Next Steps
   
   This issue is intended to **surface the design question** and gather 
community feedback. 
   

