silundong commented on code in PR #4754:
URL: https://github.com/apache/calcite/pull/4754#discussion_r2702403859
##########
core/src/main/java/org/apache/calcite/sql2rel/TopDownGeneralDecorrelator.java:
##########
@@ -598,50 +599,48 @@ public RelNode unnestInternal(Sort sort, boolean allowEmptyOutputFromRewrite) {
RelCollation shiftCollation = sort.getCollation().apply(targetMapping);
builder.push(newInput);
- if (!sort.collation.getFieldCollations().isEmpty()
- && (sort.offset != null || sort.fetch != null)) {
- // the Sort with ORDER BY and LIMIT or OFFSET have to be changed during rewriting because
- // now the limit has to be enforced per value of the outer bindings instead of globally.
- // It can be rewritten using ROW_NUMBER() window function and filtering on it,
- // see section 4.4 in paper Improving Unnesting of Complex Queries
- List<RexNode> partitionKeys = new ArrayList<>();
- for (CorDef corDef : corDefs) {
- int partitionKeyIndex = requireNonNull(inputInfo.corDefOutputs.get(corDef));
- partitionKeys.add(builder.field(partitionKeyIndex));
- }
- RexNode rowNumber = builder.aggregateCall(SqlStdOperatorTable.ROW_NUMBER)
- .over()
- .partitionBy(partitionKeys)
- .orderBy(builder.fields(shiftCollation))
- .toRex();
- List<RexNode> projectsWithRowNumber = new ArrayList<>(builder.fields());
- projectsWithRowNumber.add(rowNumber);
- builder.project(projectsWithRowNumber);
-
- List<RexNode> conditions = new ArrayList<>();
- if (sort.offset != null) {
- RexNode greaterThenLowerBound =
- builder.call(
- SqlStdOperatorTable.GREATER_THAN,
- builder.field(projectsWithRowNumber.size() - 1),
- sort.offset);
- conditions.add(greaterThenLowerBound);
- }
- if (sort.fetch != null) {
- RexNode upperBound = sort.offset == null
- ? sort.fetch
- : builder.call(SqlStdOperatorTable.PLUS, sort.offset, sort.fetch);
- RexNode lessThenOrEqualUpperBound =
- builder.call(
- SqlStdOperatorTable.LESS_THAN_OR_EQUAL,
- builder.field(projectsWithRowNumber.size() - 1),
- upperBound);
- conditions.add(lessThenOrEqualUpperBound);
- }
- builder.filter(conditions);
- } else {
- builder.sortLimit(sort.offset, sort.fetch, builder.fields(shiftCollation));
+ // the Sort have to be changed during rewriting because now the order/limit/offset has to be
Review Comment:
The paper's original wording is:
> Subqueries with ORDER BY and LIMIT or OFFSET have to be changed....

It doesn't say what to do when only LIMIT or OFFSET is present. However, I
believe that even then the Sort should be rewritten as a window function, just
without an ORDER BY clause.
This matches the test case that @rubenada mentioned in his comment. I ran the
same test in the [umbra-db interface](https://umbra-db.com/interface/), and it
does rewrite the LIMIT as a window function, so our approach seems to align
with the paper's intent.
<img width="3792" height="1841" alt="image"
src="https://github.com/user-attachments/assets/35a21fd5-f784-464e-aedb-ce6696089d1f"
/>
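
To make the per-binding semantics concrete, here is a minimal plain-Java sketch of what the rewritten plan computes when there is no ORDER BY. It is not the Calcite implementation: the class `PerKeyLimitSketch`, the helper `row`, and the method `limitPerKey` are hypothetical names used only for illustration. Rows are numbered within each outer-binding value and kept when `offset < rn <= offset + fetch`, which is the same pair of conditions that the removed code built on top of `ROW_NUMBER()`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerKeyLimitSketch {
  /** Hypothetical helper: a row is (outer binding value, payload). */
  static Map.Entry<Integer, String> row(int outerKey, String value) {
    return new AbstractMap.SimpleImmutableEntry<>(outerKey, value);
  }

  /**
   * Emulates ROW_NUMBER() OVER (PARTITION BY outerKey) followed by a filter.
   * A null offset or fetch means the corresponding clause is absent.
   * Without ORDER BY, SQL leaves the numbering order unspecified; this sketch
   * simply uses the input order.
   */
  static List<Map.Entry<Integer, String>> limitPerKey(
      List<Map.Entry<Integer, String>> rows, Integer offset, Integer fetch) {
    Map<Integer, Integer> rowNumbers = new HashMap<>();
    List<Map.Entry<Integer, String>> result = new ArrayList<>();
    for (Map.Entry<Integer, String> r : rows) {
      // Next row number within this row's partition, starting at 1.
      int rn = rowNumbers.merge(r.getKey(), 1, Integer::sum);
      boolean aboveOffset = offset == null || rn > offset;
      int upperBound = fetch == null
          ? Integer.MAX_VALUE
          : (offset == null ? fetch : offset + fetch);
      if (aboveOffset && rn <= upperBound) {
        result.add(r);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Map.Entry<Integer, String>> rows = Arrays.asList(
        row(1, "a"), row(1, "b"), row(1, "c"), row(2, "x"), row(2, "y"));
    // LIMIT 1 with no ORDER BY: one row per outer binding, not one row overall.
    System.out.println(limitPerKey(rows, null, 1));
  }
}
```

A global `LIMIT 1` over the same input would keep a single row in total, which is why a plain `sortLimit` on the decorrelated input would change the result even when there is no ORDER BY.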