[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869325#comment-15869325
 ] 

Zoltan Haindrich commented on HIVE-6590:
----------------------------------------

Partition keys are parsed here:
https://github.com/apache/hive/blob/d357f38521ae583007ff96ed7090ac41f56b78b2/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java#L257

I was expecting a whole myriad of qtests to fail when I submitted the patch the 
first time... but only a few did. How much does the fact that something is 
"under-tested" correlate with its being "under-used"? :)

Anyway, it's of course possible to apply local changes to the partition key 
parsing code... but I think there is an alternative path:

* parse empty strings and "false" as false (optionally ignoring casing)
* all others as true

This would be a minimal change to the serde code, and it would be enough to fix 
the partition parsing.
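
A minimal sketch of the rule above, in Java. The class name, method name and the {{ignoreCase}} flag are hypothetical and only illustrate the proposal; this is not the actual serde code, and treating null like an empty string is my assumption:

```java
// Hypothetical helper, not part of Hive: it only illustrates the proposed rule.
final class LenientBooleanParser {
    private LenientBooleanParser() {}

    /**
     * Empty strings and "false" parse as false; every other value parses as true.
     * Assumption: null is treated like the empty string.
     *
     * @param value      the raw partition key value
     * @param ignoreCase whether "FALSE", "False", etc. also count as false
     */
    static boolean parse(String value, boolean ignoreCase) {
        if (value == null || value.isEmpty()) {
            return false;
        }
        boolean isFalseLiteral = ignoreCase
                ? "false".equalsIgnoreCase(value)
                : "false".equals(value);
        return !isFalseLiteral;
    }
}
```

With {{ignoreCase}} set, the spellings FALSE, false and False from the issue description all parse to the same value; without it, only the lowercase spelling is treated as false.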

About cast: I haven't looked into the SQL:2011 spec on this aspect, but I'm 
pretty confident it will say that {{cast('false' as boolean)}} should be 
false...

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6590
>                 URL: https://issues.apache.org/jira/browse/HIVE-6590
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema, Metastore
>    Affects Versions: 0.10.0
>            Reporter: Lenni Kuff
>            Assignee: Zoltan Haindrich
>         Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch, HIVE-6590.3.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column Hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/



