This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 783ee19fef Publish built docs triggered by 
b477816c8636e171ae000522a2cf0951997b06fb
783ee19fef is described below

commit 783ee19fef2f2fa6dd659a60336a68257156a4e1
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Nov 14 02:47:41 2025 +0000

    Publish built docs triggered by b477816c8636e171ae000522a2cf0951997b06fb
---
 _sources/library-user-guide/upgrading.md.txt       | 21 +++++++++++++++
 _sources/user-guide/sql/aggregate_functions.md.txt | 30 ++++++++++++++++++++++
 library-user-guide/upgrading.html                  | 20 +++++++++++++++
 searchindex.js                                     |  2 +-
 user-guide/sql/aggregate_functions.html            | 26 +++++++++++++++++++
 user-guide/sql/index.html                          |  1 +
 6 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/_sources/library-user-guide/upgrading.md.txt 
b/_sources/library-user-guide/upgrading.md.txt
index 7cc84f424b..8288923008 100644
--- a/_sources/library-user-guide/upgrading.md.txt
+++ b/_sources/library-user-guide/upgrading.md.txt
@@ -25,6 +25,27 @@
 
 You can see the current [status of the `52.0.0` release 
here](https://github.com/apache/datafusion/issues/18566)
 
+### Planner now requires explicit opt-in for WITHIN GROUP syntax
+
+The SQL planner now enforces the aggregate UDF contract more strictly: the
+`WITHIN GROUP (ORDER BY ...)` syntax is accepted only if the aggregate UDAF
+explicitly advertises support by returning `true` from
+`AggregateUDFImpl::supports_within_group_clause()`.
+
+Previously the planner forwarded a `WITHIN GROUP` clause to order-sensitive
+aggregates even when they did not implement ordered-set semantics, which could
+cause queries such as `SUM(x) WITHIN GROUP (ORDER BY x)` to plan successfully.
+This behavior was too permissive and has been changed to match PostgreSQL and
+the documented semantics.
+
+Migration: If your UDAF intentionally implements ordered-set semantics and
+wants to accept the `WITHIN GROUP` SQL syntax, update your implementation to
+return `true` from `supports_within_group_clause()` and handle the ordering
+semantics in your accumulator implementation. If your UDAF is merely
+order-sensitive (but not an ordered-set aggregate), do not advertise
+`supports_within_group_clause()` and clients should use alternative function
+signatures (for example, explicit ordering as a function argument) instead.
+
 ### `AggregateUDFImpl::supports_null_handling_clause` now defaults to `false`
 
 This method specifies whether an aggregate function allows `IGNORE 
NULLS`/`RESPECT NULLS`
diff --git a/_sources/user-guide/sql/aggregate_functions.md.txt 
b/_sources/user-guide/sql/aggregate_functions.md.txt
index f17e09f2ce..ba9c6ae124 100644
--- a/_sources/user-guide/sql/aggregate_functions.md.txt
+++ b/_sources/user-guide/sql/aggregate_functions.md.txt
@@ -48,6 +48,36 @@ FROM employees;
 
 Note: When no rows pass the filter, `COUNT` returns `0` while 
`SUM`/`AVG`/`MIN`/`MAX` return `NULL`.
 
+## WITHIN GROUP / Ordered-set aggregates
+
+Some aggregate functions accept the SQL `WITHIN GROUP (ORDER BY ...)` clause 
to specify the ordering the
+aggregate relies on. In DataFusion this is opt-in: only aggregate functions 
whose implementation returns
+`true` from `AggregateUDFImpl::supports_within_group_clause()` accept the 
`WITHIN GROUP` clause. Attempting to
+use `WITHIN GROUP` with a regular aggregate (for example, `SELECT SUM(x) 
WITHIN GROUP (ORDER BY x)`) will fail
+during planning with an error: "WITHIN GROUP is only supported for ordered-set 
aggregate functions".
+
+Currently, the built-in aggregate functions that support `WITHIN GROUP` are:
+
+- `percentile_cont` — exact percentile aggregate (also available as 
`percentile_cont(column, percentile)`)
+- `approx_percentile_cont` — approximate percentile using the t-digest 
algorithm
+- `approx_percentile_cont_with_weight` — approximate weighted percentile using 
the t-digest algorithm
+
+Note: rank-like functions such as `rank()`, `dense_rank()`, and 
`percent_rank()` are window functions and
+use the `OVER (...)` clause; they are not ordered-set aggregates that accept 
`WITHIN GROUP` in DataFusion.
+
+Example (ordered-set aggregate):
+
+```sql
+percentile_cont(0.5) WITHIN GROUP (ORDER BY value)
+```
+
+Example (invalid usage — planner will error):
+
+```sql
+-- This will fail: SUM is not an ordered-set aggregate
+SELECT SUM(x) WITHIN GROUP (ORDER BY x) FROM t;
+```
+
 ## General Functions
 
 - [array_agg](#array_agg)
diff --git a/library-user-guide/upgrading.html 
b/library-user-guide/upgrading.html
index 67b66e3dd3..496fe152c0 100644
--- a/library-user-guide/upgrading.html
+++ b/library-user-guide/upgrading.html
@@ -407,6 +407,25 @@
 <h2>DataFusion <code class="docutils literal notranslate"><span 
class="pre">52.0.0</span></code><a class="headerlink" href="#datafusion-52-0-0" 
title="Link to this heading">#</a></h2>
 <p><strong>Note:</strong> DataFusion <code class="docutils literal 
notranslate"><span class="pre">52.0.0</span></code> has not been released yet. 
The information provided in this section pertains to features and changes that 
have already been merged to the main branch and are awaiting release in this 
version.</p>
 <p>You can see the current <a class="reference external" 
href="https://github.com/apache/datafusion/issues/18566">status of the <code 
class="docutils literal notranslate"><span class="pre">52.0.0</span></code> 
release here</a></p>
+<section id="planner-now-requires-explicit-opt-in-for-within-group-syntax">
+<h3>Planner now requires explicit opt-in for WITHIN GROUP syntax<a 
class="headerlink" 
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax" 
title="Link to this heading">#</a></h3>
+<p>The SQL planner now enforces the aggregate UDF contract more strictly: the
+<code class="docutils literal notranslate"><span class="pre">WITHIN</span> 
<span class="pre">GROUP</span> <span class="pre">(ORDER</span> <span 
class="pre">BY</span> <span class="pre">...)</span></code> syntax is accepted 
only if the aggregate UDAF
+explicitly advertises support by returning <code class="docutils literal 
notranslate"><span class="pre">true</span></code> from
+<code class="docutils literal notranslate"><span 
class="pre">AggregateUDFImpl::supports_within_group_clause()</span></code>.</p>
+<p>Previously the planner forwarded a <code class="docutils literal 
notranslate"><span class="pre">WITHIN</span> <span 
class="pre">GROUP</span></code> clause to order-sensitive
+aggregates even when they did not implement ordered-set semantics, which could
+cause queries such as <code class="docutils literal notranslate"><span 
class="pre">SUM(x)</span> <span class="pre">WITHIN</span> <span 
class="pre">GROUP</span> <span class="pre">(ORDER</span> <span 
class="pre">BY</span> <span class="pre">x)</span></code> to plan successfully.
+This behavior was too permissive and has been changed to match PostgreSQL and
+the documented semantics.</p>
+<p>Migration: If your UDAF intentionally implements ordered-set semantics and
+wants to accept the <code class="docutils literal notranslate"><span 
class="pre">WITHIN</span> <span class="pre">GROUP</span></code> SQL syntax, 
update your implementation to
+return <code class="docutils literal notranslate"><span 
class="pre">true</span></code> from <code class="docutils literal 
notranslate"><span class="pre">supports_within_group_clause()</span></code> and 
handle the ordering
+semantics in your accumulator implementation. If your UDAF is merely
+order-sensitive (but not an ordered-set aggregate), do not advertise
+<code class="docutils literal notranslate"><span 
class="pre">supports_within_group_clause()</span></code> and clients should use 
alternative function
+signatures (for example, explicit ordering as a function argument) instead.</p>
+</section>
 <section 
id="aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false">
 <h3><code class="docutils literal notranslate"><span 
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now 
defaults to <code class="docutils literal notranslate"><span 
class="pre">false</span></code><a class="headerlink" 
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false" 
title="Link to this heading">#</a></h3>
 <p>This method specifies whether an aggregate function allows <code 
class="docutils literal notranslate"><span class="pre">IGNORE</span> <span 
class="pre">NULLS</span></code>/<code class="docutils literal 
notranslate"><span class="pre">RESPECT</span> <span 
class="pre">NULLS</span></code>
@@ -1661,6 +1680,7 @@ take care of constructing the <code class="docutils 
literal notranslate"><span c
   <nav class="bd-toc-nav page-toc" 
aria-labelledby="pst-page-navigation-heading-2">
     <ul class="visible nav section-nav flex-column">
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" 
href="#datafusion-52-0-0">DataFusion <code class="docutils literal 
notranslate"><span class="pre">52.0.0</span></code></a><ul class="nav 
section-nav flex-column">
+<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax">Planner 
now requires explicit opt-in for WITHIN GROUP syntax</a></li>
 <li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false"><code
 class="docutils literal notranslate"><span 
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now 
defaults to <code class="docutils literal notranslate"><span 
class="pre">false</span></code></a></li>
 </ul>
 </li>
diff --git a/searchindex.js b/searchindex.js
index e4d7e16961..cff0b88f54 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
 [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
 [...]
\ No newline at end of file
diff --git a/user-guide/sql/aggregate_functions.html 
b/user-guide/sql/aggregate_functions.html
index af3114c8b9..edf8b6b664 100644
--- a/user-guide/sql/aggregate_functions.html
+++ b/user-guide/sql/aggregate_functions.html
@@ -429,6 +429,31 @@ dev/update_function_docs.sh file for updating surrounding 
text.
 </div>
 <p>Note: When no rows pass the filter, <code class="docutils literal 
notranslate"><span class="pre">COUNT</span></code> returns <code 
class="docutils literal notranslate"><span class="pre">0</span></code> while 
<code class="docutils literal notranslate"><span 
class="pre">SUM</span></code>/<code class="docutils literal notranslate"><span 
class="pre">AVG</span></code>/<code class="docutils literal notranslate"><span 
class="pre">MIN</span></code>/<code class="docutils literal notranslate">< [...]
 </section>
+<section id="within-group-ordered-set-aggregates">
+<h2>WITHIN GROUP / Ordered-set aggregates<a class="headerlink" 
href="#within-group-ordered-set-aggregates" title="Link to this 
heading">#</a></h2>
+<p>Some aggregate functions accept the SQL <code class="docutils literal 
notranslate"><span class="pre">WITHIN</span> <span class="pre">GROUP</span> 
<span class="pre">(ORDER</span> <span class="pre">BY</span> <span 
class="pre">...)</span></code> clause to specify the ordering the
+aggregate relies on. In DataFusion this is opt-in: only aggregate functions 
whose implementation returns
+<code class="docutils literal notranslate"><span 
class="pre">true</span></code> from <code class="docutils literal 
notranslate"><span 
class="pre">AggregateUDFImpl::supports_within_group_clause()</span></code> 
accept the <code class="docutils literal notranslate"><span 
class="pre">WITHIN</span> <span class="pre">GROUP</span></code> clause. 
Attempting to
+use <code class="docutils literal notranslate"><span class="pre">WITHIN</span> 
<span class="pre">GROUP</span></code> with a regular aggregate (for example, 
<code class="docutils literal notranslate"><span class="pre">SELECT</span> 
<span class="pre">SUM(x)</span> <span class="pre">WITHIN</span> <span 
class="pre">GROUP</span> <span class="pre">(ORDER</span> <span 
class="pre">BY</span> <span class="pre">x)</span></code>) will fail
+during planning with an error: “WITHIN GROUP is only supported for ordered-set 
aggregate functions”.</p>
+<p>Currently, the built-in aggregate functions that support <code 
class="docutils literal notranslate"><span class="pre">WITHIN</span> <span 
class="pre">GROUP</span></code> are:</p>
+<ul class="simple">
+<li><p><code class="docutils literal notranslate"><span 
class="pre">percentile_cont</span></code> — exact percentile aggregate (also 
available as <code class="docutils literal notranslate"><span 
class="pre">percentile_cont(column,</span> <span 
class="pre">percentile)</span></code>)</p></li>
+<li><p><code class="docutils literal notranslate"><span 
class="pre">approx_percentile_cont</span></code> — approximate percentile using 
the t-digest algorithm</p></li>
+<li><p><code class="docutils literal notranslate"><span 
class="pre">approx_percentile_cont_with_weight</span></code> — approximate 
weighted percentile using the t-digest algorithm</p></li>
+</ul>
+<p>Note: rank-like functions such as <code class="docutils literal 
notranslate"><span class="pre">rank()</span></code>, <code class="docutils 
literal notranslate"><span class="pre">dense_rank()</span></code>, and <code 
class="docutils literal notranslate"><span 
class="pre">percent_rank()</span></code> are window functions and
+use the <code class="docutils literal notranslate"><span 
class="pre">OVER</span> <span class="pre">(...)</span></code> clause; they are 
not ordered-set aggregates that accept <code class="docutils literal 
notranslate"><span class="pre">WITHIN</span> <span 
class="pre">GROUP</span></code> in DataFusion.</p>
+<p>Example (ordered-set aggregate):</p>
+<div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="n">percentile_cont</span><span 
class="p">(</span><span class="mi">0</span><span class="p">.</span><span 
class="mi">5</span><span class="p">)</span><span class="w"> </span><span 
class="n">WITHIN</span><span class="w"> </span><span 
class="k">GROUP</span><span class="w"> </span><span class="p">(</span><span 
class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span 
class="w"> </spa [...]
+</pre></div>
+</div>
+<p>Example (invalid usage — planner will error):</p>
+<div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="c1">-- This will fail: SUM is 
not an ordered-set aggregate</span>
+<span class="k">SELECT</span><span class="w"> </span><span 
class="k">SUM</span><span class="p">(</span><span class="n">x</span><span 
class="p">)</span><span class="w"> </span><span class="n">WITHIN</span><span 
class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span 
class="p">(</span><span class="k">ORDER</span><span class="w"> </span><span 
class="k">BY</span><span class="w"> </span><span class="n">x</span><span 
class="p">)</span><span class="w"> </span><span class="k" [...]
+</pre></div>
+</div>
+</section>
 <section id="general-functions">
 <h2>General Functions<a class="headerlink" href="#general-functions" 
title="Link to this heading">#</a></h2>
 <ul class="simple">
@@ -1671,6 +1696,7 @@ This aggregation function can only mix DISTINCT and ORDER 
BY if the ordering exp
   <nav class="bd-toc-nav page-toc" 
aria-labelledby="pst-page-navigation-heading-2">
     <ul class="visible nav section-nav flex-column">
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" 
href="#filter-clause">Filter clause</a></li>
+<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" 
href="#within-group-ordered-set-aggregates">WITHIN GROUP / Ordered-set 
aggregates</a></li>
 <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" 
href="#general-functions">General Functions</a><ul class="nav section-nav 
flex-column">
 <li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" 
href="#array-agg"><code class="docutils literal notranslate"><span 
class="pre">array_agg</span></code></a><ul class="nav section-nav flex-column">
 <li class="toc-h4 nav-item toc-entry"><a class="reference internal nav-link" 
href="#arguments">Arguments</a></li>
diff --git a/user-guide/sql/index.html b/user-guide/sql/index.html
index 52761f2413..bc99e7c381 100644
--- a/user-guide/sql/index.html
+++ b/user-guide/sql/index.html
@@ -462,6 +462,7 @@
 </li>
 <li class="toctree-l1"><a class="reference internal" 
href="aggregate_functions.html">Aggregate Functions</a><ul>
 <li class="toctree-l2"><a class="reference internal" 
href="aggregate_functions.html#filter-clause">Filter clause</a></li>
+<li class="toctree-l2"><a class="reference internal" 
href="aggregate_functions.html#within-group-ordered-set-aggregates">WITHIN 
GROUP / Ordered-set aggregates</a></li>
 <li class="toctree-l2"><a class="reference internal" 
href="aggregate_functions.html#general-functions">General Functions</a></li>
 <li class="toctree-l2"><a class="reference internal" 
href="aggregate_functions.html#statistical-functions">Statistical 
Functions</a></li>
 <li class="toctree-l2"><a class="reference internal" 
href="aggregate_functions.html#approximate-functions">Approximate 
Functions</a></li>

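The opt-in rule described in the `upgrading.md` hunk above can be sketched in isolation. This is a minimal illustration under stated assumptions, not DataFusion's actual planner code: the `AggregateLike` trait, the `Sum` and `PercentileCont` structs, and the `check_within_group` function are hypothetical stand-ins. Only the method name `supports_within_group_clause` and the error text are taken from the docs in this commit.

```rust
// Hypothetical stand-in for the opt-in flag on an aggregate UDF.
// This is NOT DataFusion's real `AggregateUDFImpl` trait.
trait AggregateLike {
    fn name(&self) -> &str;

    // Assumed default: a function does not accept WITHIN GROUP unless it opts in.
    fn supports_within_group_clause(&self) -> bool {
        false
    }
}

// A regular aggregate: keeps the default and so rejects WITHIN GROUP.
struct Sum;
impl AggregateLike for Sum {
    fn name(&self) -> &str {
        "sum"
    }
}

// An ordered-set aggregate: explicitly advertises WITHIN GROUP support.
struct PercentileCont;
impl AggregateLike for PercentileCont {
    fn name(&self) -> &str {
        "percentile_cont"
    }
    fn supports_within_group_clause(&self) -> bool {
        true
    }
}

// Hypothetical planner-side check: reject WITHIN GROUP unless the function opted in.
fn check_within_group(func: &dyn AggregateLike, has_within_group: bool) -> Result<(), String> {
    if has_within_group && !func.supports_within_group_clause() {
        return Err(
            "WITHIN GROUP is only supported for ordered-set aggregate functions".to_string(),
        );
    }
    Ok(())
}

fn main() {
    let sum = Sum;
    let pct = PercentileCont;
    // SUM(x) WITHIN GROUP (ORDER BY x) is rejected at planning time.
    println!("{}: {:?}", sum.name(), check_within_group(&sum, true));
    // percentile_cont(0.5) WITHIN GROUP (ORDER BY value) is accepted.
    println!("{}: {:?}", pct.name(), check_within_group(&pct, true));
}
```

In this sketch the check depends only on the flag, so an order-sensitive aggregate that leaves the default in place is rejected in the same way as `SUM`, which matches the migration guidance in the hunk.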
