[
https://issues.apache.org/jira/browse/IMPALA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687048#comment-16687048
]
Tim Armstrong commented on IMPALA-7850:
---------------------------------------
I'm not sure this is actually a bug (or, if it is a bug, that fixing it would
lead to the hoped-for results). Impala treats VALUES statements more or less
as an inline table, so for typing purposes this is equivalent to putting CHAR
literals of different types into a single column. I think Impala's approach
here is generally reasonable. It's possible in theory to have different
per-row types, but doing that only in this special case would be
inconsistent.
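To illustrate the typing rule, here is a toy Python model. It is not Impala's
implementation, and pad_char and unify_char_widths are hypothetical names. It
assumes that every CHAR(n) value in a column is widened to the largest n in
that column and space-padded to that width, which is consistent with the
length of 7 reported for 'I' in the issue description.

```python
def pad_char(value: str, width: int) -> str:
    """Space-pad value on the right to a fixed width, like a CHAR(width) slot."""
    return value.ljust(width)


def unify_char_widths(column):
    """Widen a column of (value, declared_width) pairs to one common CHAR width.

    Assumption for this sketch: the common type is the widest CHAR in the column.
    Returns (common_width, list_of_padded_values).
    """
    common = max(width for _, width in column)
    return common, [pad_char(value, common) for value, _ in column]


# col_second from the report: CAST('AWESOME' AS CHAR(7)) and CAST('I' AS CHAR(1))
width, values = unify_char_widths([("AWESOME", 7), ("I", 1)])
print(width, [len(v) for v in values])
```

Under that assumption the one-character literal ends up seven characters wide
once it shares a column with the CHAR(7) literal.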
I don't expect this pattern to give correct results in general: if we changed
the behaviour to do something else, it would just cause a different set of
issues. The underlying problem is that CHAR doesn't distinguish between its
own padding and actual trailing spaces, so there's no way to do a
STRING->CHAR->STRING round trip that reliably preserves the presence or
absence of trailing spaces.
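The round-trip problem can be sketched with a small Python model. Again this
is an illustration under stated assumptions, not Impala code: to_char,
to_string_keep and to_string_strip are hypothetical helpers, and to_char
assumes a CHAR(n) value is truncated and then space-padded to n characters.

```python
def to_char(value: str, width: int) -> str:
    """Model STRING -> CHAR(width): truncate, then right-pad with spaces."""
    return value[:width].ljust(width)


def to_string_keep(char_value: str) -> str:
    """Model CHAR -> STRING that keeps the padding."""
    return char_value


def to_string_strip(char_value: str) -> str:
    """Model CHAR -> STRING that strips all trailing spaces."""
    return char_value.rstrip(" ")


without_spaces = "I"
with_spaces = "I  "  # two genuine trailing spaces

# Both strings become the same stored CHAR(7) value, so no conversion back
# to STRING can recover both originals.
same_stored = to_char(without_spaces, 7) == to_char(with_spaces, 7)

keep_ok = to_string_keep(to_char(without_spaces, 7)) == without_spaces
strip_ok = to_string_strip(to_char(with_spaces, 7)) == with_spaces
print(same_stored, keep_ok, strip_ok)
```

Keeping the padding adds spaces to 'I', while stripping it drops the genuine
trailing spaces from 'I  ', so either choice is wrong for one of the inputs.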
> INSERT using VALUES with "CAST" can cause trailing spaces.
> ----------------------------------------------------------
>
> Key: IMPALA-7850
> URL: https://issues.apache.org/jira/browse/IMPALA-7850
> Project: IMPALA
> Issue Type: Bug
> Reporter: Sudarshan
> Priority: Major
>
> INSERT using VALUES with "CAST" can cause trailing spaces. Please see below.
>
> Schema :-
> ============
>
> {code:java}
> CREATE DATABASE tmp;
> CREATE TABLE tmp.tablename (col_id INT, col_second STRING, col_third STRING);
> {code}
>
> Insert statement :-
> =====================
>
> {code:java}
> INSERT INTO tmp.tablename (col_id, col_second, col_third) VALUES
>   (100, CAST('AWESOME' AS CHAR(7)), CAST('TEST' AS CHAR(4))),
>   (1, CAST('I' AS CHAR(1)), CAST('AI' AS CHAR(2)));
> {code}
>
>
> File on HDFS :-
> ================
>
> {noformat}
> [admin@host-10-17-101-151 ~]$ cat
> 9d42419642cbf42e-ffb7c99c00000000_1661109707_data.0.
> 100,AWESOME,TEST
> 1,I ,AI <== Trailing space
> [admin@host-10-17-101-151 ~]${noformat}
>
> Query showing length of "I" as 7
>
> {noformat}
> Query: select col_id, length(col_second), col_second from tmp.tablename
> +--------+--------------------+------------+
> | col_id | length(col_second) | col_second |
> +--------+--------------------+------------+
> | 100    | 7                  | AWESOME    |
> | 1      | 7                  | I          |
> +--------+--------------------+------------+
> [host-10-17-102-128.coe.cloudera.com:21000] >{noformat}
>
> Workaround :-
> =============
> The workaround is to remove the CASTs from the INSERT statement above.
>
>
>