[
https://issues.apache.org/jira/browse/DRILL-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001479#comment-15001479
]
Rahul Challapalli commented on DRILL-4076:
------------------------------------------
Marked it as critical since it is a data correctness issue.
Below is the Hive DDL:
{code}
DROP TABLE IF EXISTS empty_lengthy;
CREATE EXTERNAL TABLE empty_lengthy (
int_col INT,
varchar_col STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "|"
STORED AS TEXTFILE LOCATION
"/drill/testdata/partition_pruning/hive/empty_lengthy_partitions.tbl";
DROP TABLE IF EXISTS empty_lengthy_p;
CREATE TABLE empty_lengthy_p (
int_col INT
)
PARTITIONED BY (varchar_col STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "|"
STORED AS TEXTFILE LOCATION
"/drill/testdata/partition_pruning/hive/empty_lengthy_partitions_p";
SET hive.exec.dynamic.partition.mode=true;
insert overwrite table empty_lengthy_p partition (varchar_col)
select int_col, case when varchar_col='dfg' then null else varchar_col END from
empty_lengthy;
{code}
Data File :
{code}
1|dhfawriuueiq dshfjklhfiue eiufhwelfhleiruhj ejfwekjlf hsjdkgfhsdjk hjd hdfkh sdhg dkj hsdhg jds gsdlgd sd hjk sdjhkjdhgsdhg
2|jkdshgf jhg sdgj dlsg jsdgjg jkdhgiergergd fgjgioug8945u irjfoiej0930j pofkqpgogogj dogj09g djvkldsjgjgirewoie dkflvsd vkdvskgjiwegjwe;sdkvjsdgfdgksdjgkdjkdjgksjg sdkjgdsjg skdjggj;sdgjd sk;gjsd
3|dfg
4|sdjklhkhjdfgjhdfgkjhdfkjldfsgjdsfkjhdfmnb,cv
5|dfg
6|
7|jkdshgf jhg sdgj dlsg jsdgjg jkdhgiergergd fgjgioug8945u irjfoiej0930j pofkqpgogogj dogj09g djvkldsjgjgirewoie dkflvsd vkdvskgjiwegjwe;sdkvjsdgfdgksdjgkdjkdjgksjg sdkjgdsjg skdjggj;sdgjd sk;gjsd
{code}
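For anyone following along, here is a rough Python sketch of what the DDL above does to this data file and why the two engines disagree on the IS NULL query from the issue description. It is only a model of the reported behaviour, not Hive or Drill code. The helper names (case_expr, partition_name) are made up for illustration, the long column values are cut down to their first token, and the constant assumes hive.exec.default.partition.name is left at its default.

```python
# Sketch only: models the behaviour reported in this issue, not Hive or Drill internals.

# Assumes hive.exec.default.partition.name is left at its default value.
DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"

# (int_col, varchar_col) pairs from the data file.
# Long values are truncated to their first token for brevity (illustrative stand-ins).
ROWS = [
    (1, "dhfawriuueiq"),
    (2, "jkdshgf"),
    (3, "dfg"),
    (4, "sdjklhkhjdfgjhdfgkjhdfkjldfsgjdsfkjhdfmnb,cv"),
    (5, "dfg"),
    (6, ""),
    (7, "jkdshgf"),
]


def case_expr(value):
    """Hypothetical helper: CASE WHEN varchar_col = 'dfg' THEN NULL ELSE varchar_col END."""
    return None if value == "dfg" else value


def partition_name(value):
    """Hypothetical helper: NULL or empty partition values go to the default partition."""
    return DEFAULT_PARTITION if value is None or value == "" else value


# Rows as they end up in empty_lengthy_p: (int_col, partition name).
stored = [(i, partition_name(case_expr(v))) for i, v in ROWS]

# Hive-style reading: the partition name is an ordinary string, so IS NULL matches nothing.
hive_is_null = [i for i, p in stored if p is None]

# Reading reported for Drill in this issue: the default partition name is surfaced as NULL.
drill_is_null = [i for i, p in stored if p == DEFAULT_PARTITION]

print(hive_is_null)   # []
print(drill_is_null)  # [3, 5, 6]
```

Rows 3 and 5 land in the default partition because the CASE expression turns 'dfg' into NULL, and row 6 lands there because its value is empty. These are exactly the three rows on which the two result sets differ.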
> Unlike Hive, Drill treats __HIVE_DEFAULT_PARTITION__ as a null value
> --------------------------------------------------------------------
>
> Key: DRILL-4076
> URL: https://issues.apache.org/jira/browse/DRILL-4076
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Hive
> Affects Versions: 1.3.0
> Reporter: Rahul Challapalli
> Priority: Critical
>
> git.commit.id.abbrev=e78e286
> Query from Drill:
> {code}
> select * from hive.empty_lengthy_p where varchar_col is null;
> +----------+--------------+
> | int_col  | varchar_col  |
> +----------+--------------+
> | 3        | null         |
> | 5        | null         |
> | 6        | null         |
> +----------+--------------+
> {code}
> The same query from Hive returns an empty result set.
> Dump of the whole table from Hive:
> {code}
> select * from empty_lengthy_p;
> OK
> 3 __HIVE_DEFAULT_PARTITION__
> 5 __HIVE_DEFAULT_PARTITION__
> 6 __HIVE_DEFAULT_PARTITION__
> 1 dhfawriuueiq dshfjklhfiue eiufhwelfhleiruhj ejfwekjlf hsjdkgfhsdjk hjd hdfkh sdhg dkj hsdhg jds gsdlgd sd hjk sdjhkjdhgsdhg
> 2 jkdshgf jhg sdgj dlsg jsdgjg jkdhgiergergd fgjgioug8945u irjfoiej0930j pofkqpgogogj dogj09g djvkldsjgjgirewoie dkflvsd vkdvskgjiwegjwe;sdkvjsdgfdgksdjgkdjkdjgksjg sdkjgdsjg skdjggj;sdgjd sk;gjsd
> 7 jkdshgf jhg sdgj dlsg jsdgjg jkdhgiergergd fgjgioug8945u irjfoiej0930j pofkqpgogogj dogj09g djvkldsjgjgirewoie dkflvsd vkdvskgjiwegjwe;sdkvjsdgfdgksdjgkdjkdjgksjg sdkjgdsjg skdjggj;sdgjd sk;gjsd
> 4 sdjklhkhjdfgjhdfgkjhdfkjldfsgjdsfkjhdfmnb,cv
> {code}