[ 
https://issues.apache.org/jira/browse/DRILL-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118443#comment-16118443
 ] 

ASF GitHub Bot commented on DRILL-5691:
---------------------------------------

Github user weijietong commented on the issue:

    https://github.com/apache/drill/pull/889
  
    @arina-ielchiieva your test case cannot reproduce the error. You can 
search the dev mailing list for the original error description using the keyword 
"Drill query planning error". Your query already satisfies the 
NestedLoopJoinPrule. In my case, I added another rule that rewrites 
Aggregate-->Aggregate-->Scan to Scan, because the transformed Scan relnode 
already holds the count(distinct) value. When this transformation occurs, the 
NestedLoopJoinPrule's checkPreconditions method invokes 
JoinUtils.hasScalarSubqueryInput. That check then fails, because the transformed 
relnode has no aggregate node and so does not satisfy the current scalar rule. 
    
    I think it is hard to reproduce this error without a specific rule like 
mine. The preconditions are:
    1. a nested loop join
    2. no (Aggregate-->Aggregate) count distinct relnodes in the plan
    3. one child of the nested loop join has a row count of 1.
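
A minimal sketch of why the preconditions above trip the check, using a toy plan model. Everything here is hypothetical: `Node`, `Kind`, `isScalarByShape` and `isScalarByShapeOrRowCount` are illustration names, not Drill or Calcite classes, and the real `JoinUtils.hasScalarSubqueryInput` may differ in detail. The assumption is only that the current rule recognizes a scalar input by finding an aggregate without group keys, while an enhanced rule could also accept an input known to emit at most one row.

```java
public class ScalarCheckSketch {
    // Hypothetical node kinds; a real planner has many more.
    enum Kind { SCAN, PROJECT, AGGREGATE }

    /** Toy plan node. This is NOT Drill's or Calcite's RelNode. */
    static final class Node {
        final Kind kind;
        final int groupKeyCount;   // only meaningful for AGGREGATE
        final double maxRowCount;  // assumed upper bound on emitted rows
        final Node input;          // null for SCAN

        Node(Kind kind, int groupKeyCount, double maxRowCount, Node input) {
            this.kind = kind;
            this.groupKeyCount = groupKeyCount;
            this.maxRowCount = maxRowCount;
            this.input = input;
        }
    }

    /** Shape-based check: skip projects, then require an aggregate with no group keys. */
    static boolean isScalarByShape(Node node) {
        Node current = node;
        while (current != null && current.kind == Kind.PROJECT) {
            current = current.input;
        }
        return current != null
            && current.kind == Kind.AGGREGATE
            && current.groupKeyCount == 0;
    }

    /** Assumed enhancement: also accept any input that can emit at most one row. */
    static boolean isScalarByShapeOrRowCount(Node node) {
        return isScalarByShape(node) || node.maxRowCount <= 1.0;
    }

    /** Aggregate --> Aggregate --> Scan, the shape of a count(distinct) plan. */
    static Node originalPlan() {
        Node table = new Node(Kind.SCAN, 0, Double.POSITIVE_INFINITY, null);
        Node inner = new Node(Kind.AGGREGATE, 1, Double.POSITIVE_INFINITY, table);
        return new Node(Kind.AGGREGATE, 0, 1.0, inner);
    }

    /** The same join input after a custom rule replaced it with a one-row cache scan. */
    static Node rewrittenPlan() {
        return new Node(Kind.SCAN, 0, 1.0, null);
    }

    public static void main(String[] args) {
        System.out.println("original, shape check: " + isScalarByShape(originalPlan()));
        System.out.println("rewritten, shape check: " + isScalarByShape(rewrittenPlan()));
        System.out.println("rewritten, shape or row count: "
            + isScalarByShapeOrRowCount(rewrittenPlan()));
    }
}
```

In this toy model the original plan passes the shape check, the rewritten one-row scan does not, and the row-count variant accepts both.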
    
    I think the enhanced code will be OK as long as it does not break the 
current unit tests.
    
    



> multiple count distinct query planning error at physical phase 
> ---------------------------------------------------------------
>
>                 Key: DRILL-5691
>                 URL: https://issues.apache.org/jira/browse/DRILL-5691
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: weijie.tong
>
> I materialized the count distinct query results in a cache and added a plugin 
> rule that translates (Aggregate, Aggregate, Project, Scan) or 
> (Aggregate, Aggregate, Scan) to (Project, Scan) at the PARTITION_PRUNING phase. 
> Then, whenever a user issues a count distinct query, it is translated into a 
> query against the cache to get the result.
> eg1: "select count(*), sum(a), count(distinct b) from t where dt=xx"
> eg2: "select count(*), sum(a), count(distinct b), count(distinct c) from t 
> where dt=xxx"
> eg3: "select count(distinct b), count(distinct c) from t where dt=xxx"
> eg1 works and returns the result I expected, but eg2 fails at the physical 
> phase. The error info is here: 
> https://gist.github.com/weijietong/1b8ed12db9490bf006e8b3fe0ee52269. 
> eg3 fails with a similar error.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
