[
https://issues.apache.org/jira/browse/CALCITE-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16538821#comment-16538821
]
Sergey Nuyanzin commented on CALCITE-2402:
------------------------------------------
Ok I see thank you
while working on it I faced with some issues/questions, could you please
comment them?
# Are there any concerns to have SqlNodes (only in private) for expanding
variance while in public there is also RexNode? Some more details related to
the question. All not biased fail (denominator {{count - 1}}) in boundary
cases for window aggregated, e.g. {code}0:
jdbc:calcite:model=target/test-classes/mod> select var_samp(age) over(partition
by age) from emps;
+------------+
| EXPR$0 |
+------------+
java.lang.ArithmeticException: / by zero
at Baz$4$1.current(Unknown Source)
{code} It happens only while window aggregation because in regular aggregate
there is CASE statement for that, i.e.
While trying to do the similar in case of window for the code {code}final
SqlNumericLiteral one = SqlLiteral.createExactNumeric("1", pos);
final SqlLiteral nullLiteral = SqlLiteral.createNull(SqlParserPos.ZERO);
denominator = new SqlCase(SqlParserPos.ZERO,
count,
SqlNodeList.of(
SqlStdOperatorTable.EQUALS.createCall(pos,
count, SqlLiteral.createExactNumeric("1",
SqlParserPos.ZERO))
),
SqlNodeList.of(
getCastedSqlNode(nullLiteral, varType, pos, null)
),
SqlStdOperatorTable.MINUS.createCall(pos, count, one));{code} it
fails like {noformat}Caused by: java.lang.UnsupportedOperationException: class
org.apache.calcite.sql.SqlLiteral: NULL
at org.apache.calcite.util.Util.needToImplement(Util.java:921)
at
org.apache.calcite.sql.validate.SqlValidatorImpl.getValidatedNodeType(SqlValidatorImpl.java:1551)
at
org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertLiteral(SqlNodeToRexConverterImpl.java:93)
at
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4659)
at
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3977)
at org.apache.calcite.sql.SqlLiteral.accept(SqlLiteral.java:532)
at
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4541)
at
org.apache.calcite.sql2rel.StandardConvertletTable.convertCase(StandardConvertletTable.java:379)
... 37 more{noformat} At the same time if I use RexBuilder and RexNode
instead of SqlNodes - everything works fine.
# For COVAR* and REGR* implementation I need a function to eliminate all rows
where any of two arguments is null. Just double check if there is an existing
function for that or no? Before I thought that {{SqlStdOperatorTable#FILTER}}
could help however it fails that it does not have implementer.
# VARIANCE and VAR_SAMP are implemented as the same (in different rdbms they
implemented a little different e.g.
[Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/VARIANCE.html#GUID-EC33717A-2509-402D-B3BB-7EECB2E4ED8B],
[PostgreSQL|https://www.postgresql.org/docs/8.2/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE]).
Double check if Calcite's goes PostegreSQL way of not?
> COVAR_POP/COVAR_SAMP/REGR_SXX/REGR_SYY do not work
> ---------------------------------------------------
>
> Key: CALCITE-2402
> URL: https://issues.apache.org/jira/browse/CALCITE-2402
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Sergey Nuyanzin
> Assignee: Julian Hyde
> Priority: Major
>
> Any query from {code:sql}
> select covar_samp(empno, deptno) from emps;
> select covar_pop(empno, deptno) from emps;
> select regr_sxx(empno, deptno) from emps;
> select regr_syy(empno, deptno) from emps;
> {code}
> fails (the trace is below)
> As I understand the reason is not fully implementation (did not find any
> convertlet for them e.g.).
> From my point of view I could fix this issue however I have a question:
> How these function should be handled? As reducible functions (like
> STDDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP) or as Rex operators in
> RexImpTable?
> Please give some advice here
> {noformat}0: jdbc:calcite:model=target/test-classes/mod> select
> covar_samp(empno, deptno) from emps;
> Error: Error while executing SQL "select covar_samp(empno, deptno) from
> emps": Node [rel#123:Subset#2.ENUMERABLE.[]] could not be implemented;
> planner state:
> Root: rel#123:Subset#2.ENUMERABLE.[]
> Original rel:
> LogicalAggregate(subset=[rel#123:Subset#2.ENUMERABLE.[]], group=[{}],
> EXPR$0=[COVAR_SAMP($0, $1)]): rowcount = 10.0, cumulative cost = {11.25 rows,
> 0.0 cpu, 0.0 io}, id = 119
> LogicalProject(subset=[rel#118:Subset#1.NONE.[]], EMPNO=[$0], DEPTNO=[$2]):
> rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 cpu, 0.0 io}, id = 117
> LogicalTableScan(subset=[rel#116:Subset#0.NONE.[]], table=[[SALES,
> EMPS]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 103
> Sets:
> Set#0, type: RecordType(INTEGER EMPNO, VARCHAR NAME, INTEGER DEPTNO, VARCHAR
> GENDER, VARCHAR CITY, INTEGER EMPID, INTEGER AGE, BOOLEAN SLACKER, BOOLEAN
> MANAGER, DATE JOINEDAT)
> rel#116:Subset#0.NONE.[], best=null, importance=0.7290000000000001
> rel#103:LogicalTableScan.NONE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={inf}
> rel#128:Subset#0.ENUMERABLE.[], best=rel#135,
> importance=0.36450000000000005
> rel#133:EnumerableTableScan.ENUMERABLE.[](table=[SALES,
> EMPS]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
>
> rel#135:EnumerableInterpreter.ENUMERABLE.[](input=rel#132:Subset#0.BINDABLE.[]),
> rowcount=100.0, cumulative cost={51.0 rows, 51.01 cpu, 0.0 io}
> rel#132:Subset#0.BINDABLE.[], best=rel#131,
> importance=0.36450000000000005
> rel#131:BindableTableScan.BINDABLE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={1.0 rows, 1.01 cpu, 0.0 io}
> Set#1, type: RecordType(INTEGER EMPNO, INTEGER DEPTNO)
> rel#118:Subset#1.NONE.[], best=null, importance=0.81
>
> rel#117:LogicalProject.NONE.[](input=rel#116:Subset#0.NONE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={inf}
> rel#125:Subset#1.ENUMERABLE.[], best=rel#134, importance=0.405
>
> rel#134:EnumerableProject.ENUMERABLE.[](input=rel#128:Subset#0.ENUMERABLE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={151.0 rows, 251.01 cpu, 0.0 io}
> Set#2, type: RecordType(INTEGER EXPR$0)
> rel#120:Subset#2.NONE.[], best=null, importance=0.9
>
> rel#119:LogicalAggregate.NONE.[](input=rel#118:Subset#1.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $1)), rowcount=10.0, cumulative cost={inf}
>
> rel#127:LogicalAggregate.NONE.[](input=rel#116:Subset#0.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $2)), rowcount=10.0, cumulative cost={inf}
> rel#123:Subset#2.ENUMERABLE.[], best=null, importance=1.0
>
> rel#124:AbstractConverter.ENUMERABLE.[](input=rel#120:Subset#2.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=10.0, cumulative cost={inf} (state=,code=0)
> java.sql.SQLException: Error while executing SQL "select covar_samp(empno,
> deptno) from emps": Node [rel#123:Subset#2.ENUMERABLE.[]] could not be
> implemented; planner state:
> Root: rel#123:Subset#2.ENUMERABLE.[]
> Original rel:
> LogicalAggregate(subset=[rel#123:Subset#2.ENUMERABLE.[]], group=[{}],
> EXPR$0=[COVAR_SAMP($0, $1)]): rowcount = 10.0, cumulative cost = {11.25 rows,
> 0.0 cpu, 0.0 io}, id = 119
> LogicalProject(subset=[rel#118:Subset#1.NONE.[]], EMPNO=[$0], DEPTNO=[$2]):
> rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 cpu, 0.0 io}, id = 117
> LogicalTableScan(subset=[rel#116:Subset#0.NONE.[]], table=[[SALES,
> EMPS]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 103
> Sets:
> Set#0, type: RecordType(INTEGER EMPNO, VARCHAR NAME, INTEGER DEPTNO, VARCHAR
> GENDER, VARCHAR CITY, INTEGER EMPID, INTEGER AGE, BOOLEAN SLACKER, BOOLEAN
> MANAGER, DATE JOINEDAT)
> rel#116:Subset#0.NONE.[], best=null, importance=0.7290000000000001
> rel#103:LogicalTableScan.NONE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={inf}
> rel#128:Subset#0.ENUMERABLE.[], best=rel#135,
> importance=0.36450000000000005
> rel#133:EnumerableTableScan.ENUMERABLE.[](table=[SALES,
> EMPS]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
>
> rel#135:EnumerableInterpreter.ENUMERABLE.[](input=rel#132:Subset#0.BINDABLE.[]),
> rowcount=100.0, cumulative cost={51.0 rows, 51.01 cpu, 0.0 io}
> rel#132:Subset#0.BINDABLE.[], best=rel#131,
> importance=0.36450000000000005
> rel#131:BindableTableScan.BINDABLE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={1.0 rows, 1.01 cpu, 0.0 io}
> Set#1, type: RecordType(INTEGER EMPNO, INTEGER DEPTNO)
> rel#118:Subset#1.NONE.[], best=null, importance=0.81
>
> rel#117:LogicalProject.NONE.[](input=rel#116:Subset#0.NONE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={inf}
> rel#125:Subset#1.ENUMERABLE.[], best=rel#134, importance=0.405
>
> rel#134:EnumerableProject.ENUMERABLE.[](input=rel#128:Subset#0.ENUMERABLE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={151.0 rows, 251.01 cpu, 0.0 io}
> Set#2, type: RecordType(INTEGER EXPR$0)
> rel#120:Subset#2.NONE.[], best=null, importance=0.9
>
> rel#119:LogicalAggregate.NONE.[](input=rel#118:Subset#1.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $1)), rowcount=10.0, cumulative cost={inf}
>
> rel#127:LogicalAggregate.NONE.[](input=rel#116:Subset#0.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $2)), rowcount=10.0, cumulative cost={inf}
> rel#123:Subset#2.ENUMERABLE.[], best=null, importance=1.0
>
> rel#124:AbstractConverter.ENUMERABLE.[](input=rel#120:Subset#2.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=10.0, cumulative cost={inf}
> at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
> at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
> at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
> at
> org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
> at sqlline.Commands.execute(Commands.java:823)
> at sqlline.Commands.sql(Commands.java:733)
> at sqlline.SqlLine.dispatch(SqlLine.java:795)
> at sqlline.SqlLine.begin(SqlLine.java:668)
> at sqlline.SqlLine.start(SqlLine.java:373)
> at sqlline.SqlLine.main(SqlLine.java:265)
> Caused by: org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> [rel#123:Subset#2.ENUMERABLE.[]] could not be implemented; planner state:
> Root: rel#123:Subset#2.ENUMERABLE.[]
> Original rel:
> LogicalAggregate(subset=[rel#123:Subset#2.ENUMERABLE.[]], group=[{}],
> EXPR$0=[COVAR_SAMP($0, $1)]): rowcount = 10.0, cumulative cost = {11.25 rows,
> 0.0 cpu, 0.0 io}, id = 119
> LogicalProject(subset=[rel#118:Subset#1.NONE.[]], EMPNO=[$0], DEPTNO=[$2]):
> rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 cpu, 0.0 io}, id = 117
> LogicalTableScan(subset=[rel#116:Subset#0.NONE.[]], table=[[SALES,
> EMPS]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 103
> Sets:
> Set#0, type: RecordType(INTEGER EMPNO, VARCHAR NAME, INTEGER DEPTNO, VARCHAR
> GENDER, VARCHAR CITY, INTEGER EMPID, INTEGER AGE, BOOLEAN SLACKER, BOOLEAN
> MANAGER, DATE JOINEDAT)
> rel#116:Subset#0.NONE.[], best=null, importance=0.7290000000000001
> rel#103:LogicalTableScan.NONE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={inf}
> rel#128:Subset#0.ENUMERABLE.[], best=rel#135,
> importance=0.36450000000000005
> rel#133:EnumerableTableScan.ENUMERABLE.[](table=[SALES,
> EMPS]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
>
> rel#135:EnumerableInterpreter.ENUMERABLE.[](input=rel#132:Subset#0.BINDABLE.[]),
> rowcount=100.0, cumulative cost={51.0 rows, 51.01 cpu, 0.0 io}
> rel#132:Subset#0.BINDABLE.[], best=rel#131,
> importance=0.36450000000000005
> rel#131:BindableTableScan.BINDABLE.[](table=[SALES, EMPS]),
> rowcount=100.0, cumulative cost={1.0 rows, 1.01 cpu, 0.0 io}
> Set#1, type: RecordType(INTEGER EMPNO, INTEGER DEPTNO)
> rel#118:Subset#1.NONE.[], best=null, importance=0.81
>
> rel#117:LogicalProject.NONE.[](input=rel#116:Subset#0.NONE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={inf}
> rel#125:Subset#1.ENUMERABLE.[], best=rel#134, importance=0.405
>
> rel#134:EnumerableProject.ENUMERABLE.[](input=rel#128:Subset#0.ENUMERABLE.[],EMPNO=$0,DEPTNO=$2),
> rowcount=100.0, cumulative cost={151.0 rows, 251.01 cpu, 0.0 io}
> Set#2, type: RecordType(INTEGER EXPR$0)
> rel#120:Subset#2.NONE.[], best=null, importance=0.9
>
> rel#119:LogicalAggregate.NONE.[](input=rel#118:Subset#1.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $1)), rowcount=10.0, cumulative cost={inf}
>
> rel#127:LogicalAggregate.NONE.[](input=rel#116:Subset#0.NONE.[],group={},EXPR$0=COVAR_SAMP($0,
> $2)), rowcount=10.0, cumulative cost={inf}
> rel#123:Subset#2.ENUMERABLE.[], best=null, importance=1.0
>
> rel#124:AbstractConverter.ENUMERABLE.[](input=rel#120:Subset#2.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=10.0, cumulative cost={inf}
> at
> org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:448)
> at
> org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:298)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:666)
> at org.apache.calcite.tools.Programs$5.run(Programs.java:326)
> at
> org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:387)
> at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:188)
> at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:319)
> at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:230)
> at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:783)
> at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:642)
> at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:612)
> at
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:232)
> at
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:609)
> at
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
> at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
> ... 7 more
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)