eugen yushin created DRILL-4666:
-----------------------------------

             Summary: Pushdown doesn't apply for HBase with substr(key) from UT
                 Key: DRILL-4666
                 URL: https://issues.apache.org/jira/browse/DRILL-4666
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.6.0
            Reporter: eugen yushin

Following the [example|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java], running the query from {{testFilterPushDownCompositeBigIntRowKey1()}} produces the following execution plan:

{code}
EXPLAIN PLAN FOR
SELECT
  CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') d
  ,CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'bigint_be') id
  ,CONVERT_FROM(tableName.f.c, 'UTF8')
FROM hbase.`TestTableCompositeDate` tableName
WHERE
  CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') = cast(1409040000000 as bigint)
;

+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(d=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 1, 8))], id=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 9, 8))], EXPR$2=[CONVERT_FROMUTF8(ITEM($1, 'c'))])
00-02        SelectionVectorRemover
00-03          Filter(condition=[=(CONVERT_FROM(BYTE_SUBSTR($0, 1, 8), 'bigint_be'), 1409040000000)])
00-04            Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=TestTableCompositeDate, startRow=null, stopRow=null, filter=null], columns=[`*`]]])
{code}

As the plan shows ({{startRow=null, stopRow=null, filter=null}}), Drill performs a full table scan and only then filters rows by the key substring starting at position 1. The query runs quickly on the test dataset provided in the repo, but performance degrades dramatically in real use cases.

I used _contrib\storage-hbase\src\test\java\org\apache\drill\hbase\TestTableGenerator.java_ to populate the test table.

Moreover, [TestHBaseFilterPushDown|TestHBaseFilterPushDown.java] relies on [runHBaseSQLVerifyCount|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java] to pass its tests. That helper checks only the result set count, not the execution plan, so the tests pass whether or not the filter is actually pushed down.
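
For reference, a minimal sketch of the row key range a pushed-down filter would be expected to yield for this query, instead of {{startRow=null, stopRow=null}}. It assumes the first 8 bytes of {{row_key}} are a big-endian long, as the {{'bigint_be'}} conversion implies. The class and method names are hypothetical and not part of Drill; this only illustrates the expected scan bounds, not how Drill computes them.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not Drill code: derives scan bounds for an
// equality predicate on the leading 8 bytes of a composite row key.
public class RowKeyPrefixRange {

    /** Encodes a long as 8 big-endian bytes (ByteBuffer's default byte order). */
    public static byte[] bigEndian(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Inclusive start row: every key with this 8-byte prefix sorts at or after it. */
    public static byte[] startRow(long prefix) {
        return bigEndian(prefix);
    }

    /**
     * Exclusive stop row: the next 8-byte prefix in unsigned byte order.
     * Assumes prefix != -1 (all 0xFF bytes), which has no 8-byte successor.
     */
    public static byte[] stopRow(long prefix) {
        return bigEndian(prefix + 1);
    }

    /** Renders bytes as lowercase hex for display. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long d = 1409040000000L; // literal from the WHERE clause above
        System.out.println("startRow=" + hex(startRow(d)));
        System.out.println("stopRow=" + hex(stopRow(d)));
    }
}
```

With bounds like these, HBase would only need to read the rows sharing the requested 8-byte prefix rather than the whole table.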