[ https://issues.apache.org/jira/browse/HIVE-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ning Zhang updated HIVE-7207:
-----------------------------
    Description:
One of our Hive tables is backed by HBase (HBaseStorageHandler). To simulate a Hive table partitioned by "DataDate", we use a composite rowkey in HBase, e.g. DataDate_Userid_Actionid_Timestamp. Example rowkeys are as follows.

rowkey:
20140601_784353454593233274_20123282_1401632522132
20140601_784353454_20123282_1401632522132
20140601_784470763593179377_20485247_1401632520825
20140601_784470763593233227_20485222_1401632520821

However, it seems Hive does not support "partial rowkey scan". For example, I want to get all data that was generated on 06/01/2014, so I issue the following Hive query, but Hive returns nothing.

select * from table where DataDate="20140601";

After several attempts, I found that I have to give the exact rowkey (e.g. 20140601_784353454_20123282_1401632522132) for Hive to find that record.

The reason I want the "partial rowkey scan" feature is that in HBase a partial table scan should perform better than a full table scan.

Is there any plan in the Hive community to support "partial rowkey scan" in the near future?

  was:
One of our Hive tables is backed up by Hbase (HBaseStorageHandler), to simulate the partitioned Hive Table by "DataDate", we use composite rowkey in Hbase, e.g. DataDate_Userid_Actionid_Timestamp. The example rowkey is as follow.

rowkey:
20140601_784353454593233274_20123282_1401632522132
20140601_784353454_20123282_1401632522132
20140601_784470763593179377_20485247_1401632520825
20140601_784470763593233227_20485222_1401632520821

However, it seems Hive does not support "partial rowkey scan". For example I want to get all data that were generated on 06/01/2014, so I issue the following Hive query, but Hive returns nothing.

select * from table where DataDate="20140601";

After several attempts, I found that I have to give exact row key (e.g. 20140601_784353454_20123282_1401632522132) so that Hive can find that record.

The reason I want to see the "partial rowkey scan" feature is because: in Hbase, partial table scan should have better performance than full table scan. Given

Is there any plan in Hive community to support "partial rowkey scan" in near future?


> support partial rowkey scan in HBase filter pushdown
> ----------------------------------------------------
>
>                 Key: HIVE-7207
>                 URL: https://issues.apache.org/jira/browse/HIVE-7207
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Priority: Minor
>              Labels: Hbase, HbaseStorageHandler
>

--
This message was sent by Atlassian JIRA
(v6.2#6252)
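
To make the requested "partial rowkey scan" concrete: HBase keeps rows sorted by the bytes of the rowkey, so every key sharing a prefix such as 20140601_ sits in one contiguous range. The sketch below is only an illustration in plain Python, not Hive or HBase code. The names prefix_scan_range and scan are hypothetical, the sorted list stands in for a table, and the 20140602 key is made up to show a row that falls outside the range.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

# Example rowkeys from the report, plus one made-up key for the next day
# (hypothetical, added only to show a row outside the scanned range).
ROWKEYS = sorted([
    b"20140601_784353454593233274_20123282_1401632522132",
    b"20140601_784353454_20123282_1401632522132",
    b"20140601_784470763593179377_20485247_1401632520825",
    b"20140601_784470763593233227_20485222_1401632520821",
    b"20140602_784353454_20123282_1401718922132",
])


def prefix_scan_range(prefix: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (start_row, stop_row) covering all keys that begin with prefix.

    start_row is inclusive, stop_row is exclusive. stop_row is None when
    the range has no upper bound (empty prefix or a prefix of only 0xFF).
    """
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, None
    # Increment the last remaining byte to get the smallest key that
    # sorts after every key carrying the prefix.
    stop = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop


def scan(sorted_rowkeys: List[bytes], start: bytes,
         stop: Optional[bytes]) -> List[bytes]:
    """Simulate a range scan over a sorted list of rowkeys."""
    lo = bisect_left(sorted_rowkeys, start)
    hi = len(sorted_rowkeys) if stop is None else bisect_left(sorted_rowkeys, stop)
    return sorted_rowkeys[lo:hi]


start_row, stop_row = prefix_scan_range(b"20140601_")
matches = scan(ROWKEYS, start_row, stop_row)
```

With the prefix 20140601_, the range covers the four rowkeys from the report and skips the made-up 20140602 key. A filter pushdown that turned the DataDate predicate into such a start/stop pair would read only that slice instead of the whole table.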