Ning Zhang created HIVE-7207:
--------------------------------

             Summary: support partial rowkey scan in HBase filter pushdown
                 Key: HIVE-7207
                 URL: https://issues.apache.org/jira/browse/HIVE-7207
             Project: Hive
          Issue Type: Improvement
            Reporter: Ning Zhang
            Priority: Minor


One of our Hive tables is backed by HBase (HBaseStorageHandler). To simulate 
a Hive table partitioned by "DataDate", we use a composite rowkey in HBase, 
e.g. DataDate_Userid_Actionid_Timestamp. Example rowkeys are as follows.

rowkey:
20140601_784353454593233274_20123282_1401632522132
20140601_784353454_20123282_1401632522132
20140601_784470763593179377_20485247_1401632520825
20140601_784470763593233227_20485222_1401632520821
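
For illustration only, a key of this shape could be assembled as sketched 
below. The helper name and parameter names are hypothetical and are not part 
of Hive or HBase; the only thing taken from the layout above is the field 
order and the underscore separator.

```python
# Hypothetical helper (not a Hive or HBase API): joins the four fields of the
# DataDate_Userid_Actionid_Timestamp layout with underscores.
def make_rowkey(data_date: str, user_id: str, action_id: str, timestamp_ms: str) -> str:
    return "_".join([data_date, user_id, action_id, timestamp_ms])


key = make_rowkey("20140601", "784353454", "20123282", "1401632522132")
print(key)  # 20140601_784353454_20123282_1401632522132
```

Because DataDate is the leading field, all rows for one day share the same 
key prefix.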

However, it seems Hive does not support "partial rowkey scan". For example, I 
want to get all data generated on 06/01/2014, so I issue the following Hive 
query, but Hive returns nothing.

select * from table where DataDate="20140601";

After several attempts, I found that I have to give the exact rowkey (e.g. 
20140601_784353454_20123282_1401632522132) for Hive to find that record.

The reason I want the "partial rowkey scan" feature is that, in HBase, a 
partial table scan should perform better than a full table scan.
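
To make the idea concrete: HBase keeps rows sorted by rowkey bytes, so a 
prefix match can be expressed as a bounded range (start row = prefix, stop 
row = prefix with its last byte incremented) rather than a scan of every row. 
The sketch below only simulates this over a sorted Python list; it does not 
call Hive or HBase, the function names are hypothetical, and the 20140531 and 
20140602 rows are made up for the example.

```python
from bisect import bisect_left


def prefix_to_range(prefix: bytes):
    """Return (start_row, stop_row) covering every key that begins with prefix."""
    # Drop trailing 0xFF bytes, then increment the last remaining byte.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, None  # no upper bound: the range runs to the end
    stop = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop


def prefix_scan(sorted_keys, prefix: bytes):
    """Slice out the matching rows by binary search instead of testing each row."""
    start, stop = prefix_to_range(prefix)
    lo = bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if stop is None else bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


rows = sorted([
    b"20140531_784353454_20123282_1401546122132",  # made-up row
    b"20140601_784353454593233274_20123282_1401632522132",
    b"20140601_784353454_20123282_1401632522132",
    b"20140601_784470763593179377_20485247_1401632520825",
    b"20140601_784470763593233227_20485222_1401632520821",
    b"20140602_784353454_20123282_1401718922132",  # made-up row
])

print(prefix_to_range(b"20140601"))         # (b'20140601', b'20140602')
print(len(prefix_scan(rows, b"20140601")))  # 4
```

With this mapping, the query for DataDate 20140601 would only need to touch 
the rows between 20140601 and 20140602.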

Is there any plan in the Hive community to support "partial rowkey scan" in 
the near future?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
