[
https://issues.apache.org/jira/browse/CALCITE-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482711#comment-16482711
]
Atri Sharma edited comment on CALCITE-2129 at 5/21/18 5:00 PM:
---------------------------------------------------------------
PR: [https://github.com/apache/calcite/pull/695]
Please see and let me know
was (Author: atris):
[~julianhyde] PR: [https://github.com/apache/calcite/pull/695]
Please see and let me know
> RelToSqlConverter incorrectly projects aggregate function from sub-query
> ------------------------------------------------------------------------
>
> Key: CALCITE-2129
> URL: https://issues.apache.org/jira/browse/CALCITE-2129
> Project: Calcite
> Issue Type: Bug
> Components: core, jdbc-adapter
> Affects Versions: 1.15.0
> Reporter: Paul Jackson
> Assignee: Atri Sharma
> Priority: Major
>
> Starting with this DDL:
> {code:SQL}
> create table friends (
> id int not null,
> friend_id int,
> primary key (id),
> foreign key (friend_id) references friends (id)
> );
> insert into friends values
> (210, null),
> (209, 210),
> (202, 209),
> (208, 202),
> (207, 209),
> (203, 207),
> (201, null),
> (204, null),
> (205, null),
> (206, 209);
> {code}
> This query gives the highest id of friends that share friend_id for each
> friend:
> {code:SQL}
> SELECT id, friend_id,
> (SELECT max(f2.id)
> FROM friends AS f2
> WHERE f2.friend_id = f1.friend_id) AS foaf_id
> FROM friends AS f1
> id friend foaf
> 201
> 202 209 207
> 203 207 203
> 204
> 205
> 206 209 207
> 207 209 207
> 208 202 208
> 209 210 209
> 210
> {code}
> I convert this query to a RelNode and then converted it back to SQL:
> {code:Java}
> Planner aPlanner = Frameworks.getPlanner(aConfig);
> SqlNode aQuery = aPlanner.parse(theSql);
> aQuery = aPlanner.validate(aQuery);
> RelNode aRelNode = aPlanner.rel(aQuery).project();
> RelToSqlConverter aSqlConverter = new RelToSqlConverter(aSqlDialect);
> SqlNode aSqlNode = aSqlConverter.visitChild(0, aRelNode).asStatement();
> {code}
> This gives the following plan and SQL:
> {code}
> LogicalProject(id=[$0], friend_id=[$1], foaf_id=[$2])
> LogicalProject(id=[$0], friend_id=[$1], EXPR$0=[$3])
> LogicalJoin(condition=[=($1, $2)], joinType=[left])
> JdbcTableScan(table=[[stardog, friends]])
> LogicalAggregate(group=[{0}], EXPR$0=[MAX($1)])
> LogicalProject(friend_id=[$1], id=[$0])
> LogicalProject(id=[$0], friend_id=[$1])
> LogicalFilter(condition=[IS NOT NULL($1)])
> JdbcTableScan(table=[[stardog, friends]])
> SELECT `friends`.`id`, `friends`.`friend_id`, MAX(`id`)
> FROM `stardog`.`friends`
> LEFT JOIN (SELECT `friend_id`, MAX(`id`)
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL
> GROUP BY `friend_id`) AS `t1` ON `friends`.`friend_id` = `t1`.`friend_id`
> {code}
> This is a bad conversion. The {{MAX(`id`)}} should not be repeated in the
> outer select. PostgreSQL will complain that the aggregating function requires
> a group by. MySQL returns the max id that has a non-null friend (208), that
> id's friend (202), and the max id of all rows (210):
> {code}
> id friend MAX(`id`)
> 208 202 210
> {code}
> I think the correct SQL should be:
> {code:SQL}
> SELECT `friends`.`id`, `friends`.`friend_id`, foaf_id
> FROM `stardog`.`friends`
> LEFT JOIN (SELECT `friend_id`, MAX(`id`) AS foaf_id
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL
> GROUP BY `friend_id`) AS `t1` ON `friends`.`friend_id` = `t1`.`friend_id`
> {code}
> There is code in {{SqlImplementor}} and {{RelToSqlConverter}} that uses an
> {{ordinalMap}} to track what functions are aliases as what identifiers. When
> the SQL is generated, the identifier is replaced with the function. If all
> that code is removed, this works as expected (using {{EXPR$0}} as the alias
> rather than {{foaf_id}}).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)