eugen yushin created DRILL-4666:
-----------------------------------

             Summary: Pushdown doesn't apply for HBase with substr(key) from UT
                 Key: DRILL-4666
                 URL: https://issues.apache.org/jira/browse/DRILL-4666
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.6.0
            Reporter: eugen yushin


Following the 
[example|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java]
 test class, running the query from {{testFilterPushDownCompositeBigIntRowKey1()}} 
results in the following execution plan:
{code}
EXPLAIN PLAN FOR
SELECT
     CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') d
    ,CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'bigint_be') id
    ,CONVERT_FROM(tableName.f.c, 'UTF8')
FROM hbase.`TestTableCompositeDate` tableName
WHERE
    CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') = cast(1409040000000 as bigint)
;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(d=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 1, 8))], id=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 9, 8))], EXPR$2=[CONVERT_FROMUTF8(ITEM($1, 'c'))])
00-02        SelectionVectorRemover
00-03          Filter(condition=[=(CONVERT_FROM(BYTE_SUBSTR($0, 1, 8), 'bigint_be'), 1409040000000)])
00-04            Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=TestTableCompositeDate, startRow=null, stopRow=null, filter=null], columns=[`*`]]])
{code}

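From the plan above, Drill performs a full table scan ({{startRow=null}}, 
{{stopRow=null}}, {{filter=null}}) and only afterwards filters rows by the key 
substring starting at position 1. The filter is not pushed down to HBase.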
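
For illustration, here is a minimal Java sketch of the row-key range that a pushed-down filter on the first 8 bytes could translate to. It assumes that {{bigint_be}} is a plain 8-byte big-endian encoding and that HBase compares row keys as unsigned bytes. The class and method names ({{RowKeyPrefixRange}}, {{startRow}}, {{stopRow}}) are hypothetical and are not part of Drill.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class RowKeyPrefixRange {

    // Encode a long as 8 big-endian bytes (ByteBuffer's default byte order),
    // which is the layout this sketch assumes for 'bigint_be'.
    static byte[] bigEndian(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Inclusive lower bound: the smallest row key that starts with the prefix.
    static byte[] startRow(long prefix) {
        return bigEndian(prefix);
    }

    // Exclusive upper bound: the next 8-byte prefix in unsigned byte order.
    // For the all-0xFF prefix (-1L) there is no next prefix, so return an
    // empty array, which HBase treats as "scan to the end of the table".
    static byte[] stopRow(long prefix) {
        return prefix == -1L ? new byte[0] : bigEndian(prefix + 1);
    }

    public static void main(String[] args) {
        long d = 1409040000000L; // the value used in the WHERE clause above
        System.out.println("startRow = " + Arrays.toString(startRow(d)));
        System.out.println("stopRow  = " + Arrays.toString(stopRow(d)));
    }
}
```

With a range like this in the scan spec, HBase would only need to read rows whose key begins with the requested value, instead of every row in the table.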

This query executes quickly against the small test dataset provided in the 
repo, but performance degrades dramatically in real use cases.

I used 
_contrib\storage-hbase\src\test\java\org\apache\drill\hbase\TestTableGenerator.java_
 to populate the test table.

Moreover, [TestHBaseFilterPushDown|TestHBaseFilterPushDown.java] relies on 
[runHBaseSQLVerifyCount|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java]
 to pass its tests. That helper checks only the result set count, not the 
execution plan, so the tests pass even when pushdown is not applied.
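
As a rough sketch of a plan-level check, the snippet below inspects the EXPLAIN output text for the unbounded scan spec shown in the plan above. It is a plain string check under the assumption that the plan is available as text. The class and method names ({{PlanPushdownCheck}}, {{isFullTableScan}}) are hypothetical and do not refer to an existing Drill test helper.

```java
final class PlanPushdownCheck {

    // Returns true when the plan text shows an HBase scan spec with no row
    // range and no filter, i.e. nothing was pushed down to the scan.
    static boolean isFullTableScan(String planText) {
        // Collapse line wraps and repeated spaces so wrapped plans still match.
        String flat = planText.replaceAll("\\s+", " ");
        return flat.contains("startRow=null, stopRow=null, filter=null");
    }

    public static void main(String[] args) {
        // Scan line copied from the plan reported in this issue, with the
        // line wraps it had in the original message.
        String reportedScan =
            "Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec \n"
          + "[tableName=TestTableCompositeDate, startRow=null, stopRow=null, filter=null], \n"
          + "columns=[`*`]]])";
        System.out.println("full table scan: " + isFullTableScan(reportedScan));
    }
}
```

A test that asserted {{isFullTableScan(...)}} is false for the pushdown queries would fail on the plan reported here, whereas a count-only check passes.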



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
