[
https://issues.apache.org/jira/browse/DRILL-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712244#comment-14712244
]
Hao Zhu commented on DRILL-3710:
--------------------------------
a. No optimization
{code}
explain plan for
select count(1) from h1_passwords where cast(col2 as int) in
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19);
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02 Project($f0=[1])
00-03 SelectionVectorRemover
00-04 Filter(condition=[OR(=(CAST($0):INTEGER, 1),
=(CAST($0):INTEGER, 2), =(CAST($0):INTEGER, 3), =(CAST($0):INTEGER, 4),
=(CAST($0):INTEGER, 5), =(CAST($0):INTEGER, 6), =(CAST($0):INTEGER, 7),
=(CAST($0):INTEGER, 8), =(CAST($0):INTEGER, 9), =(CAST($0):INTEGER, 10),
=(CAST($0):INTEGER, 11), =(CAST($0):INTEGER, 12), =(CAST($0):INTEGER, 13),
=(CAST($0):INTEGER, 14), =(CAST($0):INTEGER, 15), =(CAST($0):INTEGER, 16),
=(CAST($0):INTEGER, 17), =(CAST($0):INTEGER, 18), =(CAST($0):INTEGER, 19))])
00-05 Scan(groupscan=[HiveScan [table=Table(dbName:default,
tableName:h1_passwords),
inputSplits=[maprfs:///user/hive/warehouse/h1_passwords/passwd:0+1680],
columns=[`col2`], partitions= null]])
{code}
b. With optimization
{code}
explain plan for
select count(1) from h1_passwords where cast(col2 as int) in
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20);
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02 Project($f0=[1])
00-03 Project(f6=[$1], ROW_VALUE=[$0])
00-04 MergeJoin(condition=[=($1, $0)], joinType=[inner])
00-06 SelectionVectorRemover
00-08 Sort(sort0=[$0], dir0=[ASC])
00-10 HashAgg(group=[{0}])
00-12 Values
00-05 SelectionVectorRemover
00-07 Sort(sort0=[$0], dir0=[ASC])
00-09 Project(f6=[CAST($0):INTEGER])
00-11 Scan(groupscan=[HiveScan [table=Table(dbName:default,
tableName:h1_passwords),
inputSplits=[maprfs:///user/hive/warehouse/h1_passwords/passwd:0+1680],
columns=[`col2`], partitions= null]])
{code}
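The two plans above differ only in the length of the IN list: 19 values stay as an OR filter, while 20 values become a join against a deduplicated Values input. A minimal Python sketch of that decision follows, with the threshold exposed as a parameter as this issue requests. This is not Drill's planner code. The names `choose_in_list_strategy`, `count_matching`, and `DEFAULT_IN_LIST_THRESHOLD` are hypothetical, and a set lookup stands in for the Sort + MergeJoin shown in plan (b).

```python
from typing import Sequence, Tuple

# Assumption: default cutoff inferred from the plans above
# (19 values -> OR filter, 20 values -> join).
DEFAULT_IN_LIST_THRESHOLD = 20


def choose_in_list_strategy(values: Sequence[int],
                            threshold: int = DEFAULT_IN_LIST_THRESHOLD) -> str:
    """Pick how an IN list is evaluated, based on its length."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    # Duplicates count toward the length, which is why padding the list
    # with junk values can reach the threshold.
    return "join" if len(values) >= threshold else "or_filter"


def count_matching(rows: Sequence[int],
                   values: Sequence[int],
                   threshold: int = DEFAULT_IN_LIST_THRESHOLD) -> Tuple[str, int]:
    """Emulate `select count(1) ... where col in (values)` under either strategy."""
    strategy = choose_in_list_strategy(values, threshold)
    if strategy == "or_filter":
        # Plan (a): every row is compared against each literal in turn.
        matched = sum(1 for row in rows if any(row == v for v in values))
    else:
        # Plan (b): deduplicate the literals once (HashAgg over Values),
        # then probe that small table for each row.
        lookup = set(values)
        matched = sum(1 for row in rows if row in lookup)
    return strategy, matched
```

With a configurable threshold, a short list can use the join strategy without padding, for example `count_matching(rows, values, threshold=5)`.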
> Make the 20 in-list optimization configurable
> ---------------------------------------------
>
> Key: DRILL-3710
> URL: https://issues.apache.org/jira/browse/DRILL-3710
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.1.0
> Reporter: Hao Zhu
> Assignee: Jinfeng Ni
>
> If an IN list has at least 20 values, Drill can optimize the query by
> converting the list into a small in-memory hash table and joining against it
> instead.
> This can improve the performance of queries with long IN lists.
> Could we make "20" configurable, so that we do not need to pad the IN list
> with duplicate/junk values to reach the threshold?
> Sample query:
> select count(*) from table where col in
> (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);