[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492558#comment-16492558
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-392500411
 
 
   @vvysotskyi The problems I mentioned above do not reproduce on master. I 
will close this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> DrillReduceAggregatesRule mixed the same functions of the same inputRef which 
> have different dataTypes 
> ---
>
> Key: DRILL-5913
> URL: https://issues.apache.org/jira/browse/DRILL-5913
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.9.0, 1.11.0
> Reporter: weijie.tong
> Priority: Major
>
> sample query:
> {code:java}
> select stddev_samp(cast(employee_id as int)) as col1,
>        sum(cast(employee_id as int)) as col2
> from cp.`employee.json`
> {code}
> error info:
> {code:java}
> org.apache.drill.exec.rpc.RpcException: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError: Type mismatch:
> rel rowtype:
> RecordType(INTEGER $f0, INTEGER $f1, BIGINT NOT NULL $f2, INTEGER $f3) NOT NULL
> equivRel rowtype:
> RecordType(INTEGER $f0, INTEGER $f1, BIGINT NOT NULL $f2, BIGINT $f3) NOT NULL
> [Error Id: f5114e62-a57b-46b1-afe8-ae652f390896 on localhost:31010]
>   (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillReduceAggregatesRule, args [rel#29:LogicalAggregate.NONE.ANY([]).[](input=rel#28:Subset#3.NONE.ANY([]).[],group={},agg#0=SUM($1),agg#1=SUM($0),agg#2=COUNT($0),agg#3=$SUM0($0))]
> org.apache.drill.exec.work.foreman.Foreman.run():294
> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> java.lang.Thread.run():745
>   Caused By (java.lang.AssertionError) Internal error: Error while applying rule DrillReduceAggregatesRule, args [rel#29:LogicalAggregate.NONE.ANY([]).[](input=rel#28:Subset#3.NONE.ANY([]).[],group={},agg#0=SUM($1),agg#1=SUM($0),agg#2=COUNT($0),agg#3=$SUM0($0))]
> org.apache.calcite.util.Util.newInternal():792
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():251
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():811
> {code}
> The reason is that, on the first match of DrillReduceAggregatesRule, 
> stddev_samp(cast(employee_id as int)) is reduced to sum($0), sum($1) and 
> count($0), while sum(cast(employee_id as int)) is reduced to sum0($0). 
> On the second match, stddev_samp's sum($0) is reduced to sum0($0) as well. 
> But this sum0($0) has a different data type from the first sum0($0): 
> one is integer, the other is bigint. Calcite's addAggCall method treats 
> them as the same call because it ignores their data types. As a result, the 
> bigint sum0($0) is replaced by the integer sum0($0).
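
For readers unfamiliar with the reduction described above, here is a minimal sketch of why stddev_samp can be rewritten in terms of two sums and a count. It assumes the usual identity sqrt((sum(x*x) - sum(x)^2 / n) / (n - 1)) and assumes that $0 is the cast column and $1 its square. The class and method names are hypothetical; this is not Drill or Calcite code.

```java
// Hypothetical illustration; not taken from Drill or Calcite.
class StddevSampSketch {

  /** Computes the sample standard deviation from SUM(x*x), SUM(x) and COUNT(x). */
  static double stddevSamp(double sumOfSquares, double sum, long count) {
    if (count < 2) {
      // SQL would yield NULL for fewer than two rows; NaN stands in for it here.
      return Double.NaN;
    }
    double numerator = sumOfSquares - (sum * sum) / count;
    return Math.sqrt(numerator / (count - 1));
  }

  public static void main(String[] args) {
    int[] values = {1, 2, 3, 4};
    double sum = 0;
    double sumOfSquares = 0;
    for (int v : values) {
      sum += v;                       // plays the role of sum($0)
      sumOfSquares += (double) v * v; // plays the role of sum($1)
    }
    System.out.println(stddevSamp(sumOfSquares, sum, values.length));
  }
}
```

Because the rewritten aggregate contains its own sum over the same input as the plain sum in the query, the two calls can end up competing for the same slot, which is the situation the description refers to.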



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481727#comment-16481727
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

vvysotskyi commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-390423923
 
 
   @weijietong, yes, they should never happen.
   I think the bug is that the `sum0` aggregate function was used instead of 
`sum`, or vice versa. `sum0` has a non-nullable return type, whereas the 
return type of `sum` is nullable.
   
   Could you please review `DrillReduceAggregatesRule` and check for this 
problem?
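
The nullability difference mentioned in this comment can be sketched as follows. This is only an illustration of the semantics as described here (`sum` may return null on empty input, `sum0` returns 0 instead); the class and method names are hypothetical and are not Drill or Calcite code.

```java
import java.util.List;

// Hypothetical illustration of the semantics; not Drill or Calcite code.
class SumNullabilitySketch {

  /** SUM: returns null when there are no non-null inputs, so its type is nullable. */
  static Long sum(List<Integer> values) {
    Long total = null;
    for (Integer v : values) {
      if (v != null) {
        total = (total == null ? 0L : total) + v;
      }
    }
    return total;
  }

  /** SUM0: like SUM, but returns 0 instead of null, so its type can be NOT NULL. */
  static long sum0(List<Integer> values) {
    Long total = sum(values);
    return total == null ? 0L : total;
  }
}
```

If one of the two is substituted for the other during a rewrite, the row type of the rewritten aggregate differs only in nullability, which is the kind of difference a `Type mismatch` assertion would report.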




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468197#comment-16468197
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387589324
 
 
   @vvysotskyi are you sure that two aggregate calls with the same name and the 
same input ref but different data types can never occur here?




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468168#comment-16468168
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387584681
 
 
   I have looked at the return type inference code, and the return type is 
right. With the current implementation, if there are two agg calls with the 
same inputs but genuinely different data types, the wrong agg call will be 
chosen for reuse, which can cause errors.
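
A simplified model of the reuse problem described in this comment, assuming the cache key is built from the function name and input ref only. All classes and methods below are hypothetical stand-ins, not Drill or Calcite APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model; these classes are not Drill or Calcite APIs.
class AggCallReuseSketch {

  static final class AggCall {
    final String name;
    final int inputRef;
    final String type;

    AggCall(String name, int inputRef, String type) {
      this.name = name;
      this.inputRef = inputRef;
      this.type = type;
    }
  }

  private final Map<String, Integer> cache = new HashMap<>();
  final List<AggCall> calls = new ArrayList<>();

  /** Returns the ordinal of the call, reusing any cached call with the same name and input. */
  int addAggCall(AggCall call) {
    String key = call.name + "($" + call.inputRef + ")"; // data type is not part of the key
    Integer existing = cache.get(key);
    if (existing != null) {
      return existing; // reused even if the cached call has a different type
    }
    calls.add(call);
    int ordinal = calls.size() - 1;
    cache.put(key, ordinal);
    return ordinal;
  }

  public static void main(String[] args) {
    AggCallReuseSketch sketch = new AggCallReuseSketch();
    sketch.addAggCall(new AggCall("$SUM0", 0, "INTEGER"));
    int reused = sketch.addAggCall(new AggCall("$SUM0", 0, "BIGINT"));
    // The BIGINT request is answered with the cached INTEGER call.
    System.out.println(sketch.calls.get(reused).type);
  }
}
```

In this model the second request silently receives the first call's ordinal, so the caller ends up with a field of a type it did not ask for.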




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467536#comment-16467536
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

vvysotskyi commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387434799
 
 
   @weijietong, this error may have a different root cause. Drill has its own 
rules for determining the return type of aggregate functions, and in most 
cases they differ from Calcite's rules. I suppose that in some places the 
Calcite rules were used where the Drill rules should have been.
   
   Regarding creating new calls for the same agg call with the same inputs, 
I suppose it would be inefficient to create new calls only because the return 
types differ. The problem is that the return type was chosen incorrectly.




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467360#comment-16467360
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387387738
 
 
   @vvysotskyi I have tested the SQL from the Jira issue on the current master 
and it passed. But another new test case:
   ```
   @Test
   public void testDRILL5913_t() throws Exception {
     test("select n_nationkey, "
         + "stddev((case when (bigint_col) > 0 then 1 else 0 end)) * 1.0 as col1, "
         + "avg((case when (bigint_col) > 0 then 1 else 0 end)) * 1.0 as col2 "
         + "from (select n_name, n_nationkey, sum(n_regionkey) as bigint_col "
         + "from cp.`tpch/nation.parquet` group by n_name, n_nationkey) t "
         + "group by n_nationkey");
   }
   ```
   throws a different exception on Drill 1.13 with Calcite 1.15, 
but passes on current master. The exception message is:
   ```
   Caused by: java.lang.AssertionError: Type mismatch:
   rel rowtype:
   RecordType(ANY n_nationkey, BIGINT $f1, BIGINT $f2, BIGINT NOT NULL $f3, BIGINT $f4) NOT NULL
   equivRel rowtype:
   RecordType(ANY n_nationkey, BIGINT $f1, BIGINT $f2, BIGINT NOT NULL $f3, BIGINT NOT NULL $f4) NOT NULL
   ```
   
   The main reason in all of these cases is that 
DrillReduceAggregatesRule.reduceAgg invokes the RexBuilder.addAggCall method, 
whose aggCallMapping parameter acts as an AggCall cache. The aggCallMapping 
cache only cares about the call name, not the data type. The current master 
code of Calcite has not changed in this part since I found this bug. I don't 
think I can exhaustively cover all the test cases to prove that our current 
master implementation is right. But it seems safer to add my adjusted code 
(validating the AggCall cache against the data type) to master to prevent 
possible future issues.
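
One way to picture "validating the AggCall cache against the data type" is the sketch below. It is not the code from the PR and does not use Drill or Calcite classes; every name in it is a hypothetical stand-in, and it assumes a cached call may only be reused when its type also matches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a type-validated cache; not the PR's actual change.
class TypeCheckedAggCacheSketch {

  static final class AggCall {
    final String name;
    final int inputRef;
    final String type;

    AggCall(String name, int inputRef, String type) {
      this.name = name;
      this.inputRef = inputRef;
      this.type = type;
    }
  }

  // Maps "name($inputRef)" to the ordinals of all calls registered under that key.
  private final Map<String, List<Integer>> cache = new HashMap<>();
  final List<AggCall> calls = new ArrayList<>();

  /** Reuses a cached call only if name, input ref and data type all match. */
  int addAggCall(AggCall call) {
    String key = call.name + "($" + call.inputRef + ")";
    List<Integer> candidates = cache.computeIfAbsent(key, k -> new ArrayList<>());
    for (int candidate : candidates) {
      if (calls.get(candidate).type.equals(call.type)) {
        return candidate; // same name, same input, same type: safe to reuse
      }
    }
    // No cached call has a matching type, so register a new one.
    calls.add(call);
    int ordinal = calls.size() - 1;
    candidates.add(ordinal);
    return ordinal;
  }
}
```

With this check, a `$SUM0($0)` of type INTEGER and a `$SUM0($0)` of type BIGINT get separate ordinals, at the cost of keeping two calls where a name-only cache would keep one.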
   
   



[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466861#comment-16466861
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

vvysotskyi commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387284020
 
 
   @weijietong, could you please check whether this bug still reproduces on 
current master? I tried the query from the Jira description and it finished 
successfully. I suppose it was fixed as part of the Calcite upgrade.




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466813#comment-16466813
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387274177
 
 
   @KulykRoman it seems you are familiar with this part of the code. Could you 
also take a look at this?




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2018-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466802#comment-16466802
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

weijietong commented on issue #1016: DRILL-5913:solve the mixed processing of 
same functions with same inputRefs but di…
URL: https://github.com/apache/drill/pull/1016#issuecomment-387271505
 
 
   @vvysotskyi @amansinha100 could you take a look at this PR? I contacted 
@julianhyde about it. Since Calcite keeps the original data type of the input 
parameter for stddev and stddev_samp, no cast happens in its 
`AggregateReduceFunctionsRule` implementation, so this error does not occur 
in Calcite. This PR therefore changes Drill's own `DrillReduceAggregatesRule` 
implementation.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2017-11-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249111#comment-16249111
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/1016
  
@amansinha100 maybe you are familiar with this part of the code. Could you 
review it? Anyone else is also welcome to.




[jira] [Commented] (DRILL-5913) DrillReduceAggregatesRule mixed the same functions of the same inputRef which have different dataTypes

2017-10-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223467#comment-16223467
 ] 

ASF GitHub Bot commented on DRILL-5913:
---

GitHub user weijietong opened a pull request:

https://github.com/apache/drill/pull/1016

DRILL-5913:solve the mixed processing of same functions with same inputRefs 
but di…

`DrillReduceAggregatesRule` mixes up the processing of identical functions 
that have the same inputRefs but different dataTypes.

The error info and a reproducible sample SQL query are 
[here](https://issues.apache.org/jira/browse/DRILL-5913).

I will also try to contact the Calcite devs to find out whether they agree 
to make `RexBuilder.addAggCall` distinguish identical `AggregateCall`s that 
have the same inputRefs but different dataTypes.
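
What distinguishing such calls might look like can be sketched as follows. This is an illustration under assumptions: `TypedAggRegistry` and `TypedAggKey` are hypothetical names, and the code reflects neither the real `RexBuilder.addAggCall` signature nor the actual change in this PR. The only idea shown is including the data type in the key used to deduplicate calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical registry whose dedup key includes the data type.
public class TypedAggRegistry {

    // Key made of function name, input ref, and data type (assumption).
    public static final class TypedAggKey {
        final String function;
        final int inputRef;
        final String dataType;

        public TypedAggKey(String function, int inputRef, String dataType) {
            this.function = function;
            this.inputRef = inputRef;
            this.dataType = dataType;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof TypedAggKey)) {
                return false;
            }
            TypedAggKey other = (TypedAggKey) o;
            return inputRef == other.inputRef
                && function.equals(other.function)
                && dataType.equals(other.dataType);
        }

        @Override
        public int hashCode() {
            return Objects.hash(function, inputRef, dataType);
        }
    }

    private final Map<TypedAggKey, Integer> ordinals = new LinkedHashMap<>();

    // Reuses a slot only when function, input ref, and type all match.
    public int register(TypedAggKey key) {
        Integer existing = ordinals.get(key);
        if (existing != null) {
            return existing;
        }
        int ordinal = ordinals.size();
        ordinals.put(key, ordinal);
        return ordinal;
    }

    public static void main(String[] args) {
        TypedAggRegistry registry = new TypedAggRegistry();
        int intSum0 = registry.register(new TypedAggKey("$SUM0", 0, "INTEGER"));
        int bigintSum0 = registry.register(new TypedAggKey("$SUM0", 0, "BIGINT"));
        // The two calls now occupy separate slots.
        System.out.println(intSum0 != bigintSum0); // true
    }
}
```

With the type in the key, a repeated request for the same typed call still reuses its slot, while calls that differ only in data type are kept apart.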

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/weijietong/drill DRILL-5913

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1016.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1016


commit 6a5cee4e8f2c5955f88c20ace182b829a1ecd51e
Author: weijie.tong 
Date:   2017-10-28T12:14:55Z

solve the mixed processing of same functions of same inputRefs but 
different dataTypes



