morrySnow commented on code in PR #64633:
URL: https://github.com/apache/doris/pull/64633#discussion_r3479484084
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalTopNToPhysicalTopN.java:
##########
@@ -46,13 +48,30 @@ public Rule build() {
* mergeTopN(limit, off, require gather) -> localTopN(off+limit, 0,
require any)
*/
private List<PhysicalTopN<? extends Plan>> twoPhaseSort(LogicalTopN<?
extends Plan> logicalTopN) {
- PhysicalTopN<Plan> localSort = new
PhysicalTopN<>(logicalTopN.getOrderKeys(),
- logicalTopN.getLimit() + logicalTopN.getOffset(), 0,
SortPhase.LOCAL_SORT,
- logicalTopN.getLogicalProperties(), logicalTopN.child(0));
int sortPhaseNum = 0;
if (ConnectContext.get() != null) {
sortPhaseNum =
ConnectContext.get().getSessionVariable().sortPhaseNum;
}
+ // The two-phase local sort keeps limit + offset rows. When that
overflows the long range
+ // (e.g. LIMIT and OFFSET both BIGINT_MAX), only the single-phase
gather sort is safe: it
+ // applies limit and offset separately and cannot overflow. The
two-phase MERGE_SORT cannot be
+ // used because ChildrenPropertiesRegulator forbids inserting a gather
Exchange under
+ // GATHER_SORT, so the optimizer cannot produce a valid distributed
plan with only one-phase.
+ // If sort_phase_num != 1 (i.e. the user did not explicitly request
single-phase), report the
Review Comment:
**Suggestion:** When `sortPhaseNum == 0` (the default auto mode), should we
fall back to single-phase sort instead of throwing `AnalysisException`?
When `sortPhaseNum == 0`, the system is in auto mode — the user didn't
explicitly request two-phase sort. The original code returns both two-phase and
one-phase alternatives for the optimizer to choose from. With overflow
detected, couldn't we return ONLY the one-phase alternative (as if
`sortPhaseNum == 1`), rather than forcing the user to manually set
`sort_phase_num=1`?
The comment mentions `ChildrenPropertiesRegulator` forbids inserting a
gather Exchange under GATHER_SORT. Could you elaborate on whether this is truly
insurmountable for the auto case? A query that works correctly with
`sort_phase_num=1` but fails with `sort_phase_num=0` (default) is surprising UX
for users hitting this edge case.
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SplitLimit.java:
##########
@@ -38,8 +40,13 @@ public Rule build() {
.then(limit -> {
long l = limit.getLimit();
long o = limit.getOffset();
+ if (Utils.addOverflows(l, o)) {
+ throw new AnalysisException(
+ "limit + offset overflows the long range in
two-phase limit");
Review Comment:
**Suggestion:** Consider returning `null` (skip the rule) instead of
throwing `AnalysisException` here.
`SplitLimit` is an optimization rule — splitting a limit into LOCAL + GLOBAL
phases. When overflow makes the split impossible, the unsplit (single-phase)
limit remains correct: it still applies the original `limit` and `offset`
separately without overflow risk. Returning `null` from `.then()` means the
rule simply doesn't fire, and the unoptimized plan is preserved.
This would be more consistent with the push-down rules (which silently skip
the optimization on overflow) and would avoid failing queries that could
execute correctly without the split.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]