github-actions[bot] commented on code in PR #62043:
URL: https://github.com/apache/doris/pull/62043#discussion_r3031478392


##########
be/src/exprs/aggregate/aggregate_function_window_funnel_v2.h:
##########
@@ -423,8 +423,14 @@ struct WindowFunnelStateV2 {
             int event_idx = get_event_idx(evt.event_idx) - 1;
 
             if (event_idx == 0) {
-                // Duplicate of event 0: terminate current chain first
+                // Duplicate of event 0: terminate current chain first.
+                // However, if this E0 is from the same row as an event already
+                // in the chain, it's a multi-match row — not a true duplicate.
+                // V1 doesn't break chains on same-row events, so skip it.
                 if (events_timestamp[0].has_value()) {
+                    if (_is_same_row_as_chain(i, curr_level, events_timestamp)) {

Review Comment:
   This skip is broader than the bug description and still leaves a correctness 
hole in deduplication mode. If a row contributes `E1` to the current chain and 
also matches `E0`, skipping that `E0` throws away a valid restart candidate.
   
   Concrete case with window = 15:
   - `r0`: `E0 @ 00`
   - `r1`: `E0 + E1 @ 10`
   - `r2`: `E1 @ 11`
   - `r3`: `E2 @ 12`
   
   V1 can return `3` by starting a new chain from `r1` (`E0@r1 -> E1@r2 -> E2@r3`). With this code, `E0@r1` is discarded because it is on the same row as the already-used `E1`, then `E1@r2` is treated as a duplicate and the result stays `2`. So the patch still does not preserve the V1 deduplication semantics it is trying to recover here.
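   The concrete case above can be checked with a small brute-force sketch. This is a hypothetical standalone model, not the Doris implementation: it ignores deduplication restart rules entirely and simply searches for the highest funnel level that any valid chain can reach, which is the level a correct matcher should not fall below.

   ```python
   # Brute-force model of the reviewer's concrete case (window = 15).
   # Each row is (timestamp, set of event indexes it matches). The row
   # names r0..r3 and this whole model are illustrative only.
   rows = [
       (0,  {0}),     # r0: E0 @ 00
       (10, {0, 1}),  # r1: E0 + E1 @ 10
       (11, {1}),     # r2: E1 @ 11
       (12, {2}),     # r3: E2 @ 12
   ]

   def max_funnel_level(rows, window, num_events):
       """Longest chain E0 -> E1 -> ... using one row per step, rows in
       increasing order, all within `window` of the chain's start row.
       Finds what level is *achievable*, ignoring dedup chain breaks."""
       best = 0

       def extend(level, last_idx, start_ts):
           nonlocal best
           best = max(best, level)
           if level == num_events:
               return
           for j in range(last_idx + 1, len(rows)):
               ts, events = rows[j]
               # next event needed at this point in the chain is E<level>
               if level in events and ts - start_ts <= window:
                   extend(level + 1, j, start_ts)

       for j, (ts, events) in enumerate(rows):
           if 0 in events:  # any row matching E0 may start a chain
               extend(1, j, ts)
       return best

   print(max_funnel_level(rows, window=15, num_events=3))  # -> 3
   ```

   The model reaches level 3 (e.g. via `E0@r1 -> E1@r2 -> E2@r3`), so any patched matcher that returns `2` on this input has lost a reachable chain.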



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
