[ 
https://issues.apache.org/jira/browse/DRILL-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476587#comment-16476587
 ] 

ASF GitHub Bot commented on DRILL-6374:
---------------------------------------

vdiravka opened a new pull request #1262: DRILL-6374: Transitive Closure leads 
to TPCH Queries regressions and OOM when run concurrency test
URL: https://github.com/apache/drill/pull/1262
 
 
   **The issue**: Applying DRILL_FILTER_ON_JOIN in an early planning stage makes 
it impossible to remove redundant Projects in the main LOGICAL stage. 
   
   **Solution**: The main idea is to apply the TRANSITIVE_CLOSURE rules after the 
main LOGICAL stage of rules. 
   
   * A new STRICT_EQUAL_IS_DISTINCT_FROM predicate for FilterJoinRules is added 
to pull redundant filters up from the Join condition into a filter above the 
Join (similar to DrillJoinRule). Redundant conditions in Joins can lead to 
further errors in planning.
   * LogicalOptimizerRules are moved from the LOGICAL stage to the 
PARTITION_PRUNING stage. 
   The LOGICAL stage shouldn't involve pruning:
   
https://github.com/vdiravka/drill/blob/87beac7e6f64ce5f6dedfa2b22cc5c9099caeaec/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java#L166
   The Hive logical optimizer rules involve pruning:
   
https://github.com/vdiravka/drill/blob/87beac7e6f64ce5f6dedfa2b22cc5c9099caeaec/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java#L171
   
   * DrillPushFilterPastProjectRule.java is refactored; this change isn't 
related to the root cause of the issue.
   * One unit test is enabled, since optimizations for aggregations are done 
before transitive closure (TC).
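
As a rough illustration of the first bullet, the sketch below splits a join condition into terms that stay in the Join and terms that are pulled up into a filter above it. It is not the code from this pull request: `JoinConditionSketch`, `Conjunct`, `Split` and `isStrictKey` are hypothetical names, and the rule it encodes (keep only plain equality-style comparisons between bare column references) is an assumption about what a "strict" predicate means here, not a statement of what STRICT_EQUAL_IS_DISTINCT_FROM does in Drill.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: these types are hypothetical, not Drill or Calcite classes.
final class JoinConditionSketch {
    enum Op { EQUALS, IS_NOT_DISTINCT_FROM, LESS_THAN, OTHER }

    // One AND-ed term of a join condition.
    static final class Conjunct {
        final String text;
        final Op op;
        // True when both operands are bare column references (no arithmetic, no literals).
        final boolean plainColumnRefs;

        Conjunct(String text, Op op, boolean plainColumnRefs) {
            this.text = text;
            this.op = op;
            this.plainColumnRefs = plainColumnRefs;
        }
    }

    // Result of the split: terms kept as join keys and terms moved to a filter above the join.
    static final class Split {
        final List<Conjunct> keptInJoin = new ArrayList<>();
        final List<Conjunct> pulledAbove = new ArrayList<>();
    }

    // Assumed "strict" test: an equality-style operator applied directly to column references.
    static boolean isStrictKey(Conjunct c) {
        return c.plainColumnRefs
                && (c.op == Op.EQUALS || c.op == Op.IS_NOT_DISTINCT_FROM);
    }

    static Split split(List<Conjunct> condition) {
        Split result = new Split();
        for (Conjunct c : condition) {
            (isStrictKey(c) ? result.keptInJoin : result.pulledAbove).add(c);
        }
        return result;
    }

    public static void main(String[] args) {
        Split s = split(Arrays.asList(
                new Conjunct("l.orderkey = o.orderkey", Op.EQUALS, true),
                new Conjunct("l.quantity < 10", Op.LESS_THAN, false),
                new Conjunct("l.partkey + 1 = p.partkey", Op.EQUALS, false)));
        System.out.println("kept in join: " + s.keptInJoin.size());
        System.out.println("pulled above: " + s.pulledAbove.size());
    }
}
```

In this model only the first term remains a join key; the other two would be evaluated by a filter placed above the join.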
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Transitive Closure leads to TPCH Queries regressions and OOM when run 
> concurrency test
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-6374
>                 URL: https://issues.apache.org/jira/browse/DRILL-6374
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.14.0
>         Environment: RHEL 7
>            Reporter: Dechang Gu
>            Assignee: Vitalii Diravka
>            Priority: Critical
>             Fix For: 1.14.0
>
>         Attachments: TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json, 
> TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json
>
>
> When running the TPCH regression test on Apache Drill 1.14.0 master commit 
> 6fcaf4268eddcb09010b5d9c5dfb3b3be5c3f903 (DRILL-6173), most of the queries 
> regressed.
> In particular, TPC-H Query 9 takes about 4x as long (36 sec vs 8.6 sec) 
> as it does when run against the parent commit 
> (9173308710c3decf8ff745493ad3e85ccdaf7c37).
> Further, in the concurrency test for the commit, with 48 clients each running 
> 16 TPCH queries (768 queries in total) and 
> planner.width.max_per_node=5, some queries hit OOM, causing 273 queries 
> to fail, while for the parent commit all 768 queries completed 
> successfully.
>  
> Profiles for TPCH_09 in the regression tests are uploaded:
>  * The failing commit file name: 
> [^TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json]
>  * The parent commit file name: 
> [^TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
