Hi Henson, Thank you for the careful review. This is very helpful.
I updated the first three patches to expand the regression coverage, and also updated patch 3 to make the current memory limitation of the grow-only hash path explicit in both the commit message and code comments. On Thu, Jun 25, 2026 at 6:03 PM Henson Choi <[email protected]> wrote: > > Hi Haibo, > > This is an initial, high-level pass over v2 (patches 1-3), not a detailed > code review -- I'll follow up with more as time permits. For this round I > wanted to confirm the shape of the feature and the cases it rejects, flag > where its behavior diverges from non-window DISTINCT, note test coverage, > and check one interaction. A short list of suggestions is at the end. > > Overall the approach looks reasonable, and for the shapes I tried the > per-row values matched equivalent non-window formulations (correlated / > GROUP BY rewrites, and Oracle for the whole-partition forms). The points > below are mostly about matching our own non-window DISTINCT semantics, not > about the patch being incorrect. > > 1. Added surface > ---------------- > The series makes agg(DISTINCT x) OVER (...) accepted for plain aggregate > window functions (count/sum/avg/...), which PostgreSQL currently rejects. > DISTINCT is carried through parse and deparse so it survives view > definitions, and execution is enabled for a defined subset of frames. > > 2. Behavior by frame shape > -------------------------- > Execution uses one of two strategies depending on the frame: > > - Whole-partition frames (the frame is effectively the entire > partition): the distinct result is the same for every row, so it is > computed once and reused. > > - Running / non-shrinking (grow-only) frames (start at UNBOUNDED > PRECEDING, no EXCLUDE): the distinct set only grows as rows enter, so it > is maintained incrementally per row. > > 3. Cases that are (intentionally) rejected > ------------------------------------------ > - DISTINCT on an aggregate that takes more than one argument (roadmap 7) > - frames that don't start at UNBOUNDED PRECEDING, or that use EXCLUDE > (roadmap 4-6) > - in the running case, an argument type that is not hashable > > 4. Behavior-level observations > ------------------------------ > Compared with GROUP BY agg(DISTINCT x), three things differ between the two > frame strategies: > > - Type support: a type that is sortable but not hashable is accepted in > the whole-partition case yet rejected in the running case, whereas > GROUP BY DISTINCT accepts it (e.g. count(DISTINCT x::money)). > That makes sense as a longer-term direction. For this series, though, I would prefer to keep the current split. My goal in patches 2 and 3 is to add the basic executor support in small, reviewable steps: first the whole-partition sort case, then the simpler grow-only incremental case, before moving on to true sliding frames. For the current patch 3 implementation, the grow-only path assumes hashable input types and errors out otherwise. That is an executor-level decision for this initial step. A planner-driven choice between sort and hash paths remains an independent direction that can be explored later. > - Value order: for order-sensitive aggregates (e.g. array_agg(DISTINCT > ...)), the order in which distinct values are aggregated differs > between the two frame strategies, so the same expression can produce > differently-ordered results. > I agree this is worth calling out more explicitly. For order-insensitive aggregates such as count/sum/min/max, this is not observable. For order-sensitive aggregates such as array_agg or string_agg, the result can differ between strategies if no aggregate- local ORDER BY is present. I do not consider this a correctness issue, because the SQL standard does not prescribe an ordering for that form. However, it is a visible plan-dependent behavior difference and should be documented as such. I also agree that regression coverage should not rely on unspecified ordering. In the updated tests I made the order-sensitive grow-only case deterministic by driving the frame with a unique ORDER BY key, so the expected result does not depend on flaky input ordering. > - Memory: the running (hash) path is the only one of these that holds > its distinct set in an unbounded in-memory structure; everything else, > including the patch's own whole-partition path, is work_mem-bounded and > spills to disk: > > whole-partition (this patch) sort work_mem-bounded, spills > running (this patch) hash unbounded, no spill > GROUP BY agg(DISTINCT x) sort work_mem-bounded, spills > SELECT DISTINCT, HashAgg hash work_mem-bounded, spills > SELECT DISTINCT, Sort+Unique sort work_mem-bounded, spills > > So a large, high-cardinality partition in the running case can grow the > hash table without bound -- whereas every other path here, including the > hash-based HashAgg, is work_mem-bounded and spills. > I agree this is a real limitation of the current patch 3 approach, not the intended end state. I updated patch 3 to make this more explicit both in the commit message and in the code comments. In particular, I added a comment in initialize_windowaggregate() just above the hash-table setup. The current grow-only hash path is not bounded by work_mem and retains every distinct value seen in the partition. I agree that bounded-memory behavior for this path would need further design, rather than being treated as an acceptable final trade-off. > 5. Test coverage (measured) > --------------------------- > I ran line coverage on the changed executor code. The patch's added and > changed lines are about 77% covered (219/284); the file as a whole is > around 91%. The uncovered ~23% of the new lines is not spread out -- it > concentrates in a few behaviors the functional tests never exercise, > because those tests center on count/sum/avg over integer arguments: > > - strict aggregates (max/min DISTINCT), i.e. the strict/NULL-seed path > - pass-by-reference argument and transition types (text/numeric) > - (plus a couple of effectively unreachable guards) > > A few more gaps don't show up as uncovered lines: RANGE frames with peer > rows (tied ORDER BY keys), GROUPS mode, and order-sensitive aggregates > (per the value-order point above). > > 6. Interaction with row pattern recognition (RPR) > ------------------------------------------------- > I applied this series on top of the RPR patch set and built the two > together. They combine cleanly -- no textual (rebase) conflict, and the > regression tests pass without any change to code or tests. > > That they coexist is structural, not luck: RPR requires the frame to start > at CURRENT ROW, while this feature requires UNBOUNDED PRECEDING, so a single > window can satisfy at most one of them -- a combined query is always > rejected by one side or the other. As long as these two frame requirements > stay mutually exclusive, I would expect the two to merge without trouble. > > One asymmetry worth noting: the RPR side's CURRENT-ROW requirement follows > from the standard's definition of row pattern recognition, whereas this > feature's UNBOUNDED-PRECEDING requirement is one the roadmap plans to relax > (the sliding-frame patches). So the open question is whether that > relaxation will ever reach frames that start at CURRENT ROW -- the exact > shape RPR requires, and the point at which the two features would meet. Is > CURRENT-ROW-start support part of the plan? Thanks for checking this interaction. I am glad to hear the current series composes cleanly with the RPR patches. I have not yet studied whether the roadmap would eventually reach CURRENT ROW-start frames. If it does, the RPR interaction will need careful re-evaluation. > > 7. Suggestions > -------------- > - Type support: the planner should decide sort vs hash from the argument > type, and the executor should just carry that out -- a non-hashable type > would then route to sort instead of being rejected at startup. On the type-support and memory points above, I agree those are the main longer-term issues for this approach, and I have tried to clarify the present behavior and limitations in the updated patches. > - Memory: the running hash seen-set grows without bound. It should spill > at work_mem the way HashAgg already does for grouping; hash can stay the > default for the common hashable case. > - Tests: fill the coverage gaps in section 5 -- strict aggregates, > pass-by-reference types, RANGE peers, GROUPS mode. I can provide cases > with cross-checked expected values. Agreed, and I have already expanded the first three patches in that direction in the updated version. Patch 1 itself is unchanged here, since its existing parse/deparse and executor-side rejection coverage was already adequate. In patches 2 and 3, I added pass-by-reference text cases, a strict-aggregate case for the whole-partition path, and a more representative grow-only case using array_agg(DISTINCT …) over a deterministic running frame. The remaining frame-shape coverage, such as RANGE peers and GROUPS mode, will come naturally with the later patches that introduce those frame types, rather than being bolted onto these first three patches. > - Documentation: there are no SGML changes for this new form. The > window-function-call synopsis doesn't show the DISTINCT form, and no > prose mentions it. Agreed. I have not added the SGML changes yet because I wanted to get review on the execution model first, and this initial sub-series is still focused on that part. But this definitely needs to be covered before the full series is complete. > - Roadmap ordering (just a thought): a grow-only frame is the add-only > special case of the sliding machinery in patches 4-6, and the hash path > is where the section-4 divergences live -- so patch 3 reads more like an > optimization than a foundation. I understand the point. From an implementation perspective, though, I still think it is useful to keep the grow-only case separate. Although it can be viewed as an add-only special case of the later sliding machinery, it also lets the incremental DISTINCT state be reviewed on its own before introducing row exit, refcounted removal, and the other moving-frame complications in the same patch. So I see it as both a reviewable intermediate step and a natural strategy for grow-only frames. > > Regards, > Henson Thanks again for the review. Please see the updated patches. Regards, Haibo
v3-0001-Parse-and-deparse-DISTINCT-in-window-aggregate-ca.patch
Description: Binary data
v3-0002-Support-DISTINCT-in-whole-partition-window-aggreg.patch
Description: Binary data
v3-0003-Extend-DISTINCT-window-aggregates-to-grow-only-fr.patch
Description: Binary data
