[jira] [Commented] (DRILL-3710) Make the 20 in-list optimization configurable

ASF GitHub Bot (JIRA) Fri, 22 Jul 2016 11:28:00 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389992#comment-15389992
 ]


ASF GitHub Bot commented on DRILL-3710:
---------------------------------------

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/552#discussion_r71923747
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestPartitionFilter.java ---
    @@ -376,4 +376,14 @@ public void testPartitionFilterWithLike() throws Exception {
         testIncludeFilter(query4, 4, "Filter", 16);
       }
     
    +  @Test //DRILL-3710 Partition pruning should occur with varying IN-LIST size
    +  public void testPartitionFilterWithInSubquery() throws Exception {
    +    String query = String.format("select * from dfs_test.`%s/multilevel/parquet` where cast (dir0 as int) IN (1994, 1994, 1994, 1994, 1994, 1994)", TEST_RES_PATH);
    +    /* In list size exceeds threshold - no partition pruning since predicate converted to join */
    +    test("alter session set `planner.in_subquery_threshold` = 2");
    --- End diff --
    
    I'm not sure it is necessary to check the no-partition-pruning case.
    The goal of the test is to verify that partition pruning works with large
    IN lists.


> Make the 20 in-list optimization configurable
> ---------------------------------------------
>
>                 Key: DRILL-3710
>                 URL: https://issues.apache.org/jira/browse/DRILL-3710
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Hao Zhu
>            Assignee: Gautam Kumar Parai
>             Fix For: Future
>
>
> If an IN list has more than 20 values, Drill can optimize the query by 
> converting the IN list into a small in-memory hash table and then doing a 
> table join instead.
> This can improve the performance of queries that have large IN lists.
> Could we make "20" configurable, so that we do not need to pad the IN list 
> with duplicate/junk values to push it past 20?
> Sample query:
> select count(*) from table where col in 
> (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);
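
The rewrite described in the issue can be illustrated with a small, self-contained Java sketch. It is not Drill's planner code: the class InListSketch and its methods are hypothetical names, and the strict "greater than" comparison simply follows the issue's "more than 20" wording, so the planner's actual boundary may differ. A short list is checked value by value, like a chain of OR-ed equality tests; a list above the threshold is loaded into a hash set once and then probed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

// Illustration only: hypothetical names, not part of Apache Drill.
class InListSketch {

    // True when the IN list is long enough to switch to the hash lookup.
    // Assumption: strict ">" mirrors the issue's "more than 20" wording.
    static boolean usesHashLookup(int listSize, int threshold) {
        return listSize > threshold;
    }

    // Builds a membership test for the IN list using one of two strategies.
    static IntPredicate buildFilter(List<Integer> inList, int threshold) {
        if (usesHashLookup(inList.size(), threshold)) {
            // Long list: build the hash table once, then probe it per value.
            Set<Integer> lookup = new HashSet<>(inList);
            return value -> lookup.contains(value);
        }
        // Short list: compare against each entry, like OR-ed equality tests.
        return value -> {
            for (int candidate : inList) {
                if (candidate == value) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        List<Integer> years = Arrays.asList(1994, 1994, 1994, 1994, 1994, 1994);
        // With a threshold of 20 the six-entry list stays a linear comparison.
        System.out.println(usesHashLookup(years.size(), 20));
        // With a threshold of 2 the same list switches to the hash lookup.
        System.out.println(usesHashLookup(years.size(), 2));
        // Either strategy gives the same membership answer.
        System.out.println(buildFilter(years, 2).test(1994));
    }
}
```

Both strategies return the same answer for any value; the threshold only changes how the membership check is carried out, which is why a configurable threshold removes the need to pad the list.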



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
