924060929 opened a new pull request, #64633:
URL: https://github.com/apache/doris/pull/64633
## Proposed changes
When pushing a `TopN`/`Limit` down through `Union`/`Join`/`Window`, the child operator's limit is
computed as `limit + offset`. Both are non-negative `long`s, so when they are close to `BIGINT_MAX`
(e.g. `LIMIT 9223372036854775807 OFFSET 9223372036854775807`) the addition overflows the `long`
range and wraps to a negative value.
A negative limit makes the plan illegal. On the BE side it is reinterpreted as a huge unsigned
value (`uint64_t limit = _offset + _limit` in the sorter), so a trivial query that should
immediately return an empty set instead runs until it hits the query timeout.
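
The wrap-around can be illustrated with a minimal Java sketch. The class and method names here are hypothetical and are not taken from the Doris code base; the method only stands in for an unguarded `limit + offset`, and the unsigned rendering merely mimics how a negative 64-bit value looks once it is read as `uint64_t`.

```java
// Hypothetical demo class; not part of Doris.
final class LimitOverflowDemo {
    // Stand-in for an unguarded `limit + offset` computation.
    static long naiveChildLimit(long limit, long offset) {
        return limit + offset; // silently wraps on overflow
    }

    public static void main(String[] args) {
        long wrapped = naiveChildLimit(Long.MAX_VALUE, Long.MAX_VALUE);
        // Signed view: the sum has wrapped to a negative number.
        System.out.println(wrapped); // -2
        // Unsigned view of the same 64 bits, as an unsigned consumer would read it.
        System.out.println(Long.toUnsignedString(wrapped)); // 18446744073709551614
    }
}
```

With both operands at `Long.MAX_VALUE`, the signed result is `-2`, and the same bit pattern read as unsigned is close to 2^64, which is consistent with the "runs until timeout" symptom described above.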
### Minimal reproducer (no table required)
```sql
select count(*) as c from (
select id from (
select 1 as id union all select 2 as id union all select 3 as id
) t
order by id limit 9223372036854775807 offset 9223372036854775807
) s;
```
- Original planner, or Nereids with `PUSH_DOWN_TOP_N_THROUGH_UNION` disabled: returns `0`
  immediately (correct, since the offset is far beyond the 3 input rows).
- Nereids with the rule enabled: times out.
### Fix
Add `Utils.saturatedAdd(long, long)`, which clamps to `Long.MAX_VALUE` on positive overflow
instead of wrapping, and use it everywhere a child limit is derived from `limit + offset`:
- `PushDownTopNThroughUnion` / `PushDownTopNDistinctThroughUnion`
- `PushDownTopNThroughJoin` / `PushDownTopNDistinctThroughJoin`
- `PushDownTopNThroughWindow`
- `SplitLimit`

`Long.MAX_VALUE` ("all rows") is the semantically correct upper bound: no relation can hold more
than `Long.MAX_VALUE` rows, so the pushed-down limit never drops rows the parent may need, and the
parent operator still applies the real `limit`/`offset`. For non-overflowing inputs the behavior
is unchanged.
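
A saturating add along these lines could look like the sketch below. It is not the actual `Utils.saturatedAdd` from this PR: the class name is hypothetical, the use of `Math.addExact` is one possible implementation choice, and clamping to `Long.MIN_VALUE` on negative overflow is an assumption, since the description only specifies the positive-overflow behavior.

```java
// Hypothetical sketch; the real Utils.saturatedAdd may be implemented differently.
final class SaturatedAddSketch {
    static long saturatedAdd(long a, long b) {
        try {
            // Math.addExact throws ArithmeticException when the sum overflows a long.
            return Math.addExact(a, b);
        } catch (ArithmeticException overflow) {
            // Overflow requires both operands to share a sign, so the sign of `a`
            // tells us the direction. Clamping to MIN_VALUE is an assumption here.
            return a < 0 ? Long.MIN_VALUE : Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        System.out.println(saturatedAdd(10L, 5L)); // 15
        System.out.println(saturatedAdd(Long.MAX_VALUE, Long.MAX_VALUE) == Long.MAX_VALUE); // true
        System.out.println(saturatedAdd(Long.MIN_VALUE, -1L) == Long.MIN_VALUE); // true
    }
}
```

For the reproducer's inputs the child limit becomes `Long.MAX_VALUE` rather than a negative number, while ordinary sums such as `10 + 5` pass through untouched.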
### Tests
- `UtilsTest#testSaturatedAdd` covers normal, positive-overflow and negative-overflow cases.
- A regression case in `push_down_top_n_through_union` asserts the reproducer returns an empty
  result (count `0`) without timing out.