Hi Henson,

In my understanding RANGE/GROUPS are not allowed when RPR is defined.
(See ISO/IEC 19075-5 section 6.10.2 "ROWS BETWEEN CURRENT ROW AND").
So proper fix would be to error out at the parse/analyze phase if
RANGE/GROUPS are used when RPR is defined.

Vik, Jaconb,
Thoughts?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

> Hi Ishii-san,
> 
> During code review of nodeWindowAgg.c, I discovered that
> update_reduced_frame()
> can receive the same pos value repeatedly for RANGE/GROUPS frames.
> 
> 
> ----------------------------------------------------------------------
> PATTERN DISCOVERED
> ----------------------------------------------------------------------
> 
> For RANGE/GROUPS frames with peer rows (rows having the same ORDER BY
> value):
> 
>   Data: id=1,2,3 all have val=10 (peer rows)
>         id=4,5 both have val=20 (peer rows)
> 
>   RANGE frame processing:
>   - id=1 processing: pos=0 (frameheadpos points to first peer)
>   - id=2 processing: pos=0 (same frameheadpos)
>   - id=3 processing: pos=0 (same frameheadpos)
>   - id=4 processing: pos=3
>   - id=5 processing: pos=3
> 
> ROWS frames always have sequential pos values (0,1,2,3,4...), but
> RANGE/GROUPS
> can have repeated pos values (0,0,0,3,3...).
> 
> 
> ----------------------------------------------------------------------
> CURRENT IMPLEMENTATION ISSUE
> ----------------------------------------------------------------------
> 
> The current implementation appears to assume sequential processing only,
> which
> is valid for ROWS frames. Two specific issues found:
> 
> 1. In update_reduced_frame():
>    - The pos <= nfaLastProcessedRow check assumes "already processed pos
> doesn't
>      need reprocessing", but RANGE/GROUPS require processing the same pos
>      multiple times.
> 
> 2. In nfa_process_row():
>    - Frame boundary calculation: ctxFrameEnd = matchStartRow + frameOffset
> + 1
>    - This adds frameOffset directly to row position, which works for ROWS
> but
>      may be incorrect for RANGE (value-based) and GROUPS (group-based)
> frames.
> 
> 
> ----------------------------------------------------------------------
> TEST COVERAGE LIMITATION
> ----------------------------------------------------------------------
> 
> The existing RANGE/GROUPS test cases in rpr_base.sql only cover simple
> scenarios
> that don't reach the frame end, which is why this issue wasn't detected.
> 
> Most tests target ROWS frames, with only a small number testing RANGE/GROUPS
> frames.
> 
> 
> ----------------------------------------------------------------------
> PROPOSAL
> ----------------------------------------------------------------------
> 
> Rather than directly modifying these functions, we should consider an
> alternative approach for non-UNBOUNDED frames:
> - Strictly respect the frame head position
> - Minimize lower-level optimizations (absorption/pruning)
> - Avoid creating subsequent contexts for limited reuse benefit
> - For bounded frames, the benefit of context reuse across rows may be
> limited
>   compared to UNBOUNDED frames where contexts can be reused extensively
> - This would simplify RANGE/GROUPS handling at the cost of some optimization
> 
> To implement this properly, we need to broaden our understanding of the
> execution structure through code review of Window clauses and Aggregate
> functions.
> 
> Therefore, for this round of nodeWindowAgg.c review:
> 1. Perform code simplification/refactoring for ROWS frames only
> 2. Add FIXME comments to RANGE/GROUPS test cases
> 3. Defer RANGE/GROUPS handling to a future phase
> 
> There may be other issues beyond RANGE/GROUPS.
> After this code review round is complete, I would like to regroup and
> discuss
> the direction forward.
> 
> Best regards,
> Henson


Reply via email to