[jira] [Comment Edited] (CALCITE-3998) Bad datatype for sum(n), it should be BIGINT but it is sometimes INTEGER

TANG Wen-hui (Jira) Wed, 13 May 2020 22:11:32 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106893#comment-17106893
 ]


TANG Wen-hui edited comment on CALCITE-3998 at 5/14/20, 5:10 AM:
-----------------------------------------------------------------

The commit 
[https://github.com/apache/calcite/commit/e081c5b4227a57defe47246d8ff3e6f7cce838e4].
 would not change the SqlTypeName,  the SqlTypeName always stay the same with 
the operand. And the changes in this commit which are related to the first plan 
are that the commit has changed AbstractSumSplitter#singleton (but it has not 
changed the SqlTypeName too) when matching AggregateRemoveRule.

I'm curious. why the return type of "sum(n1)" is Bigint when n1 is integer, How 
does it do that? Does this change occurs in Calcite or in HerdDB?


was (Author: winipanda):
The commit 
[https://github.com/apache/calcite/commit/e081c5b4227a57defe47246d8ff3e6f7cce838e4].
 would not change the SqlTypeName,  the SqlTypeName always stay the same with 
the operand. And the changes in this commit which are related to the first plan 
are that the commit has changed AbstractSumSplitter#singleton (but it has not 
changed the SqlTypeName too) when matching AggregateRemoveRule.

I'm curious. why the return type of "sum(n)" is Bigint when n is integer, How 
does it do that? Does this change occurs in Calcite or in HerdDB?

> Bad datatype for sum(n), it should be BIGINT but it is sometimes INTEGER
> ------------------------------------------------------------------------
>
>                 Key: CALCITE-3998
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3998
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.23.0
>            Reporter: Enrico Olivelli
>            Priority: Blocker
>
> I also noted that sometimes the type of sum(N) where N is an INTEGER column 
> sometimes it is now reported by Calcite as INTEGER and sometimes as a BIGINT. 
> In 1.22 every time is reported as BIGINT.
> So we have another test failing.
> SELECT sum(n1), count(*) as cc, k1
> FROM tblspace1.tsql
> GROUP by k1
> ORDER BY sum(n1)
> Here sum(n1) is reported now a INTEGER, previously it was a BIGINT. I would 
> prefer to see it as a BIGINT in order to prevent overflows
> Here are the plans:
> {noformat}
> INFO: Query: SELECT sum(n1), count(*) as cc, k1  FROM tblspace1.tsql GROUP by 
> k1 ORDER BY sum(n1) -- Logical Plan
> LogicalSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = 
> {10.525000095367432 rows, 37.0 cpu, 0.0 io}, id = 1038
>   LogicalProject(EXPR$0=[$1], CC=[$2], K1=[$0]): rowcount = 2.0, cumulative 
> cost = {8.525000095367432 rows, 13.0 cpu, 0.0 io}, id = 1037
>     LogicalAggregate(group=[{0}], EXPR$0=[SUM($1)], CC=[COUNT()]): rowcount = 
> 2.0, cumulative cost = {6.525000095367432 rows, 7.0 cpu, 0.0 io}, id = 1035
>       LogicalProject(K1=[$0], n1=[$1]): rowcount = 2.0, cumulative cost = 
> {4.0 rows, 7.0 cpu, 0.0 io}, id = 1034
>         LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, 
> cumulative cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1032
> May 12, 2020 11:07:37 AM herddb.sql.CalcitePlanner runPlanner
> INFO: Query: SELECT sum(n1), count(*) as cc, k1  FROM tblspace1.tsql GROUP by 
> k1 ORDER BY sum(n1) -- Best  Plan
> EnumerableSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = 
> {5.0 rows, 31.0 cpu, 0.0 io}, id = 1245
>   EnumerableProject(EXPR$0=[$1], CC=[1:BIGINT], K1=[$0]): rowcount = 2.0, 
> cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 1244
>     EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 
> cpu, 0.0 io}, id = 1243
>       BindableTableScan(table=[[tblspace1, tsql]], projects=[[0, 1]]): 
> rowcount = 2.0, cumulative cost = {0.016 rows, 0.024 cpu, 0.0 io}, id = 1055
> {noformat}
> Within the same test case with the same tables the result of this query is 
> not changed
> SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql
> INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM 
> tblspace1.tsql -- Logical Plan
> {noformat}
> LogicalAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): 
> rowcount = 1.0, cumulative cost = {5.387500047683716 rows, 5.0 cpu, 0.0 io}, 
> id = 1253
>   LogicalProject(n1=[$1]): rowcount = 2.0, cumulative cost = {4.0 rows, 5.0 
> cpu, 0.0 io}, id = 1252
>     LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, cumulative 
> cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1250
> May 12, 2020 11:08:48 AM herddb.sql.CalcitePlanner runPlanner
> INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM 
> tblspace1.tsql -- Best  Plan
> EnumerableAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): 
> rowcount = 1.0, cumulative cost = {2.387500047683716 rows, 1.0 cpu, 0.0 io}, 
> id = 1295
>   EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 
> cpu, 0.0 io}, id = 1294
>     BindableTableScan(table=[[tblspace1, tsql]], projects=[[1]]): rowcount = 
> 2.0, cumulative cost = {0.012 rows, 0.018000000000000002 cpu, 0.0 io}, id = 
> 1265
> {noformat}
> This is the test on HerdDB
> https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/sql/SimplerPlannerTest.java#L237



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-3998) Bad datatype for sum(n), it should be BIGINT but it is sometimes INTEGER

Reply via email to