Aleksey Vovchenko created HIVE-16741:
----------------------------------------

             Summary: Counting number of records in hive and hbase are different for NULL fields in hive
                 Key: HIVE-16741
                 URL: https://issues.apache.org/jira/browse/HIVE-16741
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.1.0, 1.2.0
            Reporter: Aleksey Vovchenko
            Assignee: Aleksey Vovchenko


Steps to reproduce:

STEP 1.
hbase> create 'testTable',{NAME=>'cf'}

STEP 2.
put 'testTable','10','cf:Address','My Address 411002'
put 'testTable','10','cf:contactId','653638'
put 'testTable','10','cf:currentStatus','Awaiting'
put 'testTable','10','cf:createdAt','1452815193'
put 'testTable','10','cf:Id','10'

put 'testTable','15','cf:contactId','653638'
put 'testTable','15','cf:currentStatus','Awaiting'
put 'testTable','15','cf:createdAt','1452815193'
put 'testTable','15','cf:Id','15'

(Note: the Address column is not provided for row 15, so it is NULL.)

put 'testTable','20','cf:Address','My Address 411003'
put 'testTable','20','cf:contactId','653638'
put 'testTable','20','cf:currentStatus','Awaiting'
put 'testTable','20','cf:createdAt','1452815193'
put 'testTable','20','cf:Id','20'

put 'testTable','17','cf:Address','My Address 411003'
put 'testTable','17','cf:currentStatus','Awaiting'
put 'testTable','17','cf:createdAt','1452815193'
put 'testTable','17','cf:Id','17'

(Note: the contactId column is not provided for row 17, so it is NULL.)

STEP 3.
hive> CREATE external TABLE hh_testTable(Id string,Address string,contactId string,currentStatus string,createdAt string)
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES ("hbase.columns.mapping"=":key,cf:Address,cf:contactId,cf:currentStatus,cf:createdAt")
      TBLPROPERTIES ("hbase.table.name"="testTable");

STEP 4.
hive> select count(*),contactid from hh_testTable group by contactid;

Actual result:

OK
3 653638

Expected result:

OK
1 NULL
3 653638
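
For reference, the grouping that STEP 4 is expected to perform can be sketched outside Hive. The snippet below is only an illustrative model, not Hive or HBase code: the `rows` dictionary and the `count_by_contact_id` helper are hypothetical names, and a qualifier that was never written with `put` is assumed to surface as NULL (here `None`). Three rows carry contactId 653638 and row 17 has none, so a correct GROUP BY should return two groups.

```python
from collections import Counter

# Rows as written in STEP 2, keyed by HBase row key.
# A qualifier that was never "put" is simply absent, which Hive should read as NULL.
rows = {
    "10": {"Address": "My Address 411002", "contactId": "653638",
           "currentStatus": "Awaiting", "createdAt": "1452815193", "Id": "10"},
    "15": {"contactId": "653638",
           "currentStatus": "Awaiting", "createdAt": "1452815193", "Id": "15"},
    "20": {"Address": "My Address 411003", "contactId": "653638",
           "currentStatus": "Awaiting", "createdAt": "1452815193", "Id": "20"},
    "17": {"Address": "My Address 411003",
           "currentStatus": "Awaiting", "createdAt": "1452815193", "Id": "17"},
}


def count_by_contact_id(table):
    """Model of: select count(*), contactid ... group by contactid.

    Every row contributes to exactly one group; a missing contactId
    falls into the None (NULL) group instead of being dropped.
    """
    return dict(Counter(cells.get("contactId") for cells in table.values()))


if __name__ == "__main__":
    counts = count_by_contact_id(rows)
    # Print the NULL group first, then the remaining groups in key order.
    for key, n in sorted(counts.items(), key=lambda kv: (kv[0] is not None, kv[0] or "")):
        print(n, "NULL" if key is None else key)
```

The group counts sum to 4, the number of row keys in the table, whereas the actual Hive result accounts for only 3 rows.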