[ 
https://issues.apache.org/jira/browse/IMPALA-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043097#comment-17043097
 ] 

Quanlong Huang commented on IMPALA-7784:
----------------------------------------

I also found another bug: string values that have already been unescaped are 
unescaped a second time in coordinators when they are deserialized from Thrift 
objects, because needsUnescaping_ is set to true again. For example, after 
creating a partition with value = "\\\"", the coordinator ends up with the 
value "\"" (a lone double quote instead of a backslash followed by a double 
quote):
{code:sql}
hive> create table tpart (i int) partitioned by (p string);
hive> insert into tpart partition (p="\"") values (1);
hive> insert into tpart partition (p='\'') values (2);
hive> insert into tpart partition (p="\\\"") values (3);
hive> insert into tpart partition (p='\\\'') values (4);
hive> select * from tpart;
+----------+----------+
| tpart.i  | tpart.p  |
+----------+----------+
| 1        | "        |
| 2        | '        |
| 3        | \"       |
| 4        | \"       |
+----------+----------+

impala> invalidate metadata tpart;
impala> show partitions tpart;
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------------+
| p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                             |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------------+
| "     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/tpart/p=%22    |
| "     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/tpart/p=%5C%22 |
| '     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/tpart/p=%27    |
| '     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/tpart/p=%5C%27 |
| Total | -1    | 4      | 8B   | 0B           |                   |        |                   |                                                      |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------------+
impala> select * from tpart;
+---+---+
| i | p |
+---+---+
| 3 | " |
| 2 | ' |
| 1 | " |
| 4 | " |
+---+---+
{code}
The cause is that 
[LiteralExpr#fromThrift()|https://github.com/apache/impala/blob/2c54dbe22507661664b39cb76849f794cf4743d6/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java#L147]
calls 
[LiteralExpr#create()|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java#L90],
which always sets the StringLiteral's needsUnescaping_ to true. As a result, 
string values that were already unescaped are unescaped again when they are 
used in coordinators.
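
To make the double unescaping concrete, here is a minimal Python sketch. It is 
not Impala code: unescape() is a simplified stand-in that only strips one level 
of backslash escaping (it does not handle sequences such as \n), and 
StringLiteralSketch with its needs_unescaping field is a hypothetical class 
that only mirrors the role of StringLiteral and needsUnescaping_.

```python
from dataclasses import dataclass


def unescape(s: str) -> str:
    """Strip one level of backslash escaping: a backslash followed by any
    character becomes that character (simplified, assumption)."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            out.append(s[i + 1])  # keep the escaped character, drop the backslash
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


@dataclass
class StringLiteralSketch:
    """Hypothetical stand-in for a string literal carrying an unescape flag."""
    value: str
    needs_unescaping: bool

    def unescaped_value(self) -> str:
        return unescape(self.value) if self.needs_unescaping else self.value


# Literal as written in the SQL text: the four characters between the quotes
# of p="\\\"", i.e. backslash, backslash, backslash, double quote.
parsed = StringLiteralSketch(r'\\\"', needs_unescaping=True)
stored = parsed.unescaped_value()  # backslash + double quote, as Hive shows it

# Rebuilding the literal from the already-unescaped value with the flag set
# again strips a second level of escaping and loses the backslash.
buggy = StringLiteralSketch(stored, needs_unescaping=True)

# Keeping the flag off for an already-unescaped value preserves it.
fixed = StringLiteralSketch(stored, needs_unescaping=False)

print(repr(stored), repr(buggy.unescaped_value()), repr(fixed.unescaped_value()))
```

The fixed variant only shows one possible way to avoid the second pass 
(leaving the flag off for values that are already unescaped); it is not 
necessarily how the actual patch does it.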

I'm working on a patch to fix these together.

> Partition pruning handles escaped strings incorrectly
> -----------------------------------------------------
>
>                 Key: IMPALA-7784
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7784
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Csaba Ringhofer
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: correctness
>
> Repro:
> {code}
> create table tpart (i int) partitioned by (p string);
> insert into tpart partition (p="\"") values (1);
> select  * from tpart where p = "\"";
> Result:
> Fetched 0 row(s)
> select  * from tpart where p = '"';
> Result:
> 1,""""
> {code}
> Hive returns the row for both queries.


