It looks like you're encountering DRILL-4771. Note that the problem is not specific to querying from HBase.
-- Zelaine On Tue, Aug 9, 2016 at 3:09 AM, qiang li <[email protected]> wrote: > I did some research and find out that the logic plan not good if contains > count(distinct ) clause. > > I have to change my sql to select count(*) from (select aa, sum() from tt > group by aa) t to avoid this issue. > > 2016-08-09 14:24 GMT+08:00 qiang li <[email protected]>: > > > Hi > > > > We use drill to query hbase. > > > > Recently I check one of our query's plan: > > 0: jdbc:drill:drillbit=rfdc2> explain plan for SELECT '2016-07-27 00:00' > > as key, sum(convert_from(action.`v`.`c`,'INT_BE')) AS pv,COUNT(DISTINCT > > BYTE_SUBSTR(action.row_key,-8,8)) AS uv, SUM(convert_from(action.`v`.` > s`,'INT_BE')) > > AS v from hbase.`action_combine` as action where action.row_key > > >'020160727pay.search' and action.row_key < '020160727pay.searci' ; > > > > the result is : > > +------+------+ > > | text | json | > > +------+------+ > > | 00-00 Screen > > 00-01 Project(key=[$0], pv=[$1], uv=[$2], v=[$3]) > > 00-02 Project(key=['2016-07-27 00:00'], pv=[$0], uv=[$2], v=[$1]) > > 00-03 NestedLoopJoin(condition=[true], joinType=[inner]) > > 00-05 StreamAgg(group=[{}], pv=[SUM($0)], v=[SUM($1)]) > > 00-07 StreamAgg(group=[{}], pv=[SUM($0)], v=[SUM($2)]) > > 00-08 Project($f0=[CONVERT_FROMINT_BE(ITEM($1, 'c'))], > > $f1=[BYTE_SUBSTR($0, -8, 8)], $f2=[CONVERT_FROMINT_BE(ITEM($1, 's'))]) > > 00-09 Scan(groupscan=[HBaseGroupScan > > [HBaseScanSpec=HBaseScanSpec [tableName=action_combine, > > startRow=020160727pay.search\x00, stopRow=020160727pay.searci, > > filter=null], columns=[`*`]]]) > > 00-04 StreamAgg(group=[{}], uv=[$SUM0($0)]) > > 00-06 UnionExchange > > 01-01 StreamAgg(group=[{}], uv=[COUNT($0)]) > > 01-02 HashAgg(group=[{0}]) > > 01-03 Project($f1=[$0]) > > 01-04 HashToRandomExchange(dist0=[[$0]]) > > 02-01 UnorderedMuxExchange > > 03-01 Project($f1=[$0], > > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) > > 03-02 HashAgg(group=[{0}]) > > 03-03 Project($f1=[BYTE_SUBSTR($0, -8, 8)]) > > 03-04 Scan(groupscan=[HBaseGroupScan > > [HBaseScanSpec=HBaseScanSpec [tableName=action_combine, > > startRow=020160727pay.search\x00, stopRow=020160727pay.searci, > > filter=null], columns=[`*`]]]) > > > > My question is as there have two scan in the plan , does this mean's the > > query will execute twice? > > > > >
