github-actions[bot] commented on code in PR #62043:
URL: https://github.com/apache/doris/pull/62043#discussion_r3032110597
##########
be/src/exprs/aggregate/aggregate_function_window_funnel_v2.h:
##########
@@ -423,8 +423,14 @@ struct WindowFunnelStateV2 {
int event_idx = get_event_idx(evt.event_idx) - 1;
if (event_idx == 0) {
- // Duplicate of event 0: terminate current chain first
+ // Duplicate of event 0: terminate current chain first.
+ // However, if this E0 is from the same row as an event already
+ // in the chain, it's a multi-match row — not a true duplicate.
+ // V1 doesn't break chains on same-row events, so skip it.
if (events_timestamp[0].has_value()) {
+ if (_is_same_row_as_chain(i, curr_level,
events_timestamp)) {
Review Comment:
This `continue` still loses a valid restart candidate in deduplication mode.
Concrete case with window = 15 and conditions `E0, E1, E2`:
- `r0`: `E0 @ 00`
- `r1`: `E0 + E1 @ 10`
- `r2`: `E1 @ 11`
- `r3`: `E2 @ 12`
`window_funnel`/V1 can return `3` by starting a new chain from `r1` (`E0@r1
-> E1@r2 -> E2@r3`). Here V2 processes `E1@r1` first, then hits this branch for
`E0@r1` and skips it because the row is already part of the current chain. That
throws away the only viable restart start, so `E1@r2` is later treated as a
duplicate and the result stays `2`.
So the same-row check is too broad for `event_idx == 0`: it avoids the
reported breakage, but it also suppresses legitimate chain restarts.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]