[ 
https://issues.apache.org/jira/browse/IMPALA-13463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891035#comment-17891035
 ] 

ASF subversion and git services commented on IMPALA-13463:
----------------------------------------------------------

Commit 88cb9c19083ae8c2bc70d373a1a70384d476b9cd in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=88cb9c190 ]

IMPALA-13463: Impala should ignore case of Iceberg schema elements

Impala treats schema element names as case insensitive. Via Spark,
however, it's possible to create schema elements with upper-case or
mixed-case names and store them in the metadata JSON files of Iceberg, e.g.:
   "schemas" : [ {
     "type" : "struct",
     "schema-id" : 0,
     "fields" : [ {
       "id" : 1,
       "name" : "ID",
       "required" : false,
       "type" : "string"
     }, {
       "id" : 2,
       "name" : "OWNERID",
       "required" : false,
       "type" : "string"
     } ]
   } ],

This can cause problems in Impala during predicate pushdown: Impala
pushes down predicates with lower-case column names while Iceberg sees
upper-case names, so the Iceberg library can throw a
ValidationException.

With this patch Impala invokes Scan.caseSensitive(boolean caseSensitive)
on the TableScan object, passing false, so that Iceberg resolves column
names case-insensitively.
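
To illustrate the mismatch, here is a minimal, self-contained Java sketch.
It does not use the Iceberg library: the class name and the resolve()
helper are hypothetical stand-ins that only mimic how a lower-case column
name from a pushed-down predicate is matched against the field names in
the metadata shown above, with and without case sensitivity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical toy model; not Impala or Iceberg code.
public class CaseInsensitiveBindingSketch {

    // Field name -> field id, mirroring the "fields" array in the metadata JSON.
    static Map<String, Integer> schema() {
        Map<String, Integer> fields = new LinkedHashMap<>();
        fields.put("ID", 1);
        fields.put("OWNERID", 2);
        return fields;
    }

    // Resolves a predicate's column name to a field id.
    // With caseSensitive == true only an exact match is accepted.
    static Optional<Integer> resolve(Map<String, Integer> fields,
                                     String column,
                                     boolean caseSensitive) {
        if (caseSensitive) {
            return Optional.ofNullable(fields.get(column));
        }
        for (Map.Entry<String, Integer> e : fields.entrySet()) {
            if (e.getKey().equalsIgnoreCase(column)) {
                return Optional.of(e.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Impala pushes down the lower-case name "id".
        System.out.println(resolve(schema(), "id", true));   // Optional.empty
        System.out.println(resolve(schema(), "id", false));  // Optional[1]
    }
}
```

In this toy model the case-sensitive lookup finds no field for "id", which
corresponds to the situation where the library rejects the predicate; the
case-insensitive lookup finds field 1.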

Testing:
 * added e2e test

Change-Id: Iedaf152d8a0c02a124c3dcf8acb59b4ba4e81cf4
Reviewed-on: http://gerrit.cloudera.org:8080/21950
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>
Reviewed-by: Daniel Becker <[email protected]>


> Impala should ignore case of Iceberg schema elements
> ----------------------------------------------------
>
>                 Key: IMPALA-13463
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13463
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> Impala treats schema element names as case insensitive.
> Via Spark, however, it's possible to create schema elements with upper-case
> or mixed-case names and store them in the metadata JSON files of Iceberg, e.g.:
> {noformat}
>    "schemas" : [ {
>      "type" : "struct",
>      "schema-id" : 0,
>      "fields" : [ {
>        "id" : 1,
>        "name" : "ID",
>        "required" : false,
>        "type" : "string"
>      }, {
>        "id" : 2,
>        "name" : "OWNERID",
>        "required" : false,
>        "type" : "string"
>      } ]
>    } ],
> {noformat}
> This can cause problems in Impala during predicate pushdown: Impala pushes
> down predicates with lower-case column names while Iceberg sees upper-case
> names, so the Iceberg library can throw a ValidationException.
> We should invoke Scan.caseSensitive(boolean caseSensitive) on the TableScan 
> object to set case insensitivity.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

