Hi ,

I recently meet a issue that can not query the correct data from hbase with
sql by drill, can anybody help me.

I test with the drill 1.6.
My hbase scheme:
rowkey: salt+day+event+uid + ts , eg: 120160411visituidts
cf : v
qualifier: v, e0, e1

The wrong result only happened when I use group by clause.

This sql will not return correct result:
select CONVERT_FROM(a.`v`.`e0`, 'UTF8') as k, count(a.`v`.`e0`) p from
hbase.browser_action2 a where a.row_key > '0'  group by a.`v`.`e0`;
Part of explain of this sql is:

0: jdbc:drill:zk=rfdc5> explain plan for select CONVERT_FROM(a.`v`.`e0`,
'UTF8') as k, count(a.`v`.`e0`) p from hbase.browser_action2 a where
a.row_key > '0'  group by a.`v`.`e0`;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(k=[$0], p=[$1])
00-02        UnionExchange
01-01          Project(k=[CONVERT_FROMUTF8($0)], p=[$1])
01-02            HashAgg(group=[{0}], p=[$SUM0($1)])
01-03              Project($f0=[$0], p=[$1])
01-04                HashToRandomExchange(dist0=[[$0]])
02-01                  UnorderedMuxExchange
03-01                    Project($f0=[$0], p=[$1],
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)])
03-02                      HashAgg(group=[{0}], p=[COUNT($0)])
03-03                        Project($f0=[ITEM($1, 'e0')])
03-04                          Scan(groupscan=[HBaseGroupScan
[HBaseScanSpec=HBaseScanSpec [tableName=browser_action2, startRow=0\x00,
stopRow=, filter=null], columns=[`*`]]])

The data return very quickly , the result of this sql is :
+------+--------+
|  k   |   p    |
+------+--------+
| pay  | 12180  |
+------+--------

But I have millons of data in the table.

I tried to change the physical plan.  if I change the json explain *"columns"
: [ "`*`" ]*  to *"columns" : [ "`v`.`e0`" ] *, it will return the correct
result.

It seems the physical plan is not correct.
I also try to debug the sql parser to find out the reason, but its too
complicate. Can anyone help me.

Also this sql have the same issue.
select CONVERT_FROM(BYTE_SUBSTR(a.row_key, 1 , 9), 'UTF8')  as k,
count(a.row_key) p from hbase.browser_action2 a group by
BYTE_SUBSTR(a.row_key, 1 , 9);
I change the json explain *"columns" : [ "`*`" ]*  to *"columns" : [
"`row_key`" ] *, it will return the correct result.

Reply via email to