xabriel opened a new pull request #82: Make expression binding case insensitive
URL: https://github.com/apache/incubator-iceberg/pull/82
 
 
   Iceberg currently resolves column names case-sensitively, which hinders 
usability, as most SQL users expect case insensitivity by default. While a 
query like the following succeeds with other Spark readers, it fails on 
Iceberg:
   
   ```sql
   SELECT COUNT(*)
   FROM iceTable
   WHERE year = 2017
     AND MONTH = 11 -- Notice how MONTH has different casing than other predicates
     AND day = 01
   ```
   
   This will fail with a stack trace similar to:
   ```
   com.google.common.util.concurrent.UncheckedExecutionException: com.netflix.iceberg.exceptions.ValidationException: Cannot find field 'MONTH' in struct: struct<...>
   ...
   ```
   In this PR, we solve this by making iceberg-api case-insensitive when 
binding expressions.
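
To illustrate the idea, here is a minimal sketch of case-insensitive name resolution. It is not the code in this PR: `FieldResolver`, its constructor, and `find` are hypothetical names, and field ids are simply list positions. The sketch assumes lookup is done through a second map keyed by the lowercased column name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of iceberg-api: resolves a column name
// to a field id (here just the column's position), optionally ignoring case.
class FieldResolver {
  private final Map<String, Integer> byName = new HashMap<>();
  private final Map<String, Integer> byLowerName = new HashMap<>();

  FieldResolver(List<String> columnNames) {
    for (int i = 0; i < columnNames.size(); i++) {
      String name = columnNames.get(i);
      byName.put(name, i);
      // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
      byLowerName.put(name.toLowerCase(Locale.ROOT), i);
    }
  }

  // Returns the field id, or null if no column matches.
  Integer find(String name, boolean caseSensitive) {
    if (caseSensitive) {
      return byName.get(name);
    }
    return byLowerName.get(name.toLowerCase(Locale.ROOT));
  }
}
```

With columns `year`, `month`, `day`, a case-sensitive `find("MONTH", true)` returns null, which corresponds to the "Cannot find field" failure above, while `find("MONTH", false)` resolves to the `month` column.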
   
   
   Some further notes:
   
   - We could also solve this in iceberg-spark; however, we would then have 
to solve it again in every other engine that supports Iceberg (Presto, Pig, etc.).
   
   - Enabling case insensitivity implies that a table cannot have two columns 
`a` and `b` where `LOWER(a) == LOWER(b)`. I have left a `TODO` in the PR, as 
this change may make more sense behind a feature flag.
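
The column-name constraint in the last note could be checked along these lines. This is only a sketch under assumptions: `CaseCollisions` and `find` are hypothetical names, not part of this PR or of iceberg-api, and the schema is reduced to a flat list of column names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical check: groups column names that become identical once lowercased.
class CaseCollisions {
  // Returns lowercased name -> original names, only for names that collide.
  static Map<String, List<String>> find(List<String> columnNames) {
    Map<String, List<String>> groups = new TreeMap<>();
    for (String name : columnNames) {
      groups.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>()).add(name);
    }
    // Drop groups with a single member; those names are unambiguous.
    groups.values().removeIf(names -> names.size() < 2);
    return groups;
  }
}
```

A table whose columns are `id`, `ID`, and `ts` would yield one colliding group under the key `id`, so case-insensitive binding could not pick a unique field for it.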
