This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 783ee19fef Publish built docs triggered by b477816c8636e171ae000522a2cf0951997b06fb
783ee19fef is described below
commit 783ee19fef2f2fa6dd659a60336a68257156a4e1
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Nov 14 02:47:41 2025 +0000
Publish built docs triggered by b477816c8636e171ae000522a2cf0951997b06fb
---
_sources/library-user-guide/upgrading.md.txt | 21 +++++++++++++++
_sources/user-guide/sql/aggregate_functions.md.txt | 30 ++++++++++++++++++++++
library-user-guide/upgrading.html | 20 +++++++++++++++
searchindex.js | 2 +-
user-guide/sql/aggregate_functions.html | 26 +++++++++++++++++++
user-guide/sql/index.html | 1 +
6 files changed, 99 insertions(+), 1 deletion(-)
diff --git a/_sources/library-user-guide/upgrading.md.txt b/_sources/library-user-guide/upgrading.md.txt
index 7cc84f424b..8288923008 100644
--- a/_sources/library-user-guide/upgrading.md.txt
+++ b/_sources/library-user-guide/upgrading.md.txt
@@ -25,6 +25,27 @@
You can see the current [status of the `52.0.0` release
here](https://github.com/apache/datafusion/issues/18566)
+### Planner now requires explicit opt-in for WITHIN GROUP syntax
+
+The SQL planner now enforces the aggregate UDF contract more strictly: the
+`WITHIN GROUP (ORDER BY ...)` syntax is accepted only if the aggregate UDAF
+explicitly advertises support by returning `true` from
+`AggregateUDFImpl::supports_within_group_clause()`.
+
+Previously the planner forwarded a `WITHIN GROUP` clause to order-sensitive
+aggregates even when they did not implement ordered-set semantics, which could
+cause queries such as `SUM(x) WITHIN GROUP (ORDER BY x)` to plan successfully.
+This behavior was too permissive and has been changed to match PostgreSQL and
+the documented semantics.
+
+Migration: If your UDAF intentionally implements ordered-set semantics and
+wants to accept the `WITHIN GROUP` SQL syntax, update your implementation to
+return `true` from `supports_within_group_clause()` and handle the ordering
+semantics in your accumulator implementation. If your UDAF is merely
+order-sensitive (but not an ordered-set aggregate), leave
+`supports_within_group_clause()` returning `false`; callers should instead use
+an alternative signature (for example, explicit ordering as a function argument).
+
### `AggregateUDFImpl::supports_null_handling_clause` now defaults to `false`
This method specifies whether an aggregate function allows `IGNORE NULLS`/`RESPECT NULLS`
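
Review note: the opt-in contract described in the `upgrading.md.txt` hunk above can be sketched in plain Rust. This is a minimal, self-contained model and does not use the DataFusion crate. The `AggregateContract` trait, the `Sum` and `PercentileCont` structs, and the `check_within_group` function are hypothetical stand-ins; only the method name `supports_within_group_clause` and the error text come from the documentation added in this commit. The `false` default is an assumption consistent with the "opt-in" wording.

```rust
/// Hypothetical stand-in for the relevant slice of an aggregate UDF trait.
/// Only the method name `supports_within_group_clause` is taken from the docs.
trait AggregateContract {
    fn name(&self) -> &str;

    /// Assumed default: aggregates do not accept `WITHIN GROUP` unless they opt in.
    fn supports_within_group_clause(&self) -> bool {
        false
    }
}

/// A regular aggregate: keeps the default and therefore rejects `WITHIN GROUP`.
struct Sum;

impl AggregateContract for Sum {
    fn name(&self) -> &str {
        "sum"
    }
}

/// An ordered-set aggregate: opts in explicitly.
struct PercentileCont;

impl AggregateContract for PercentileCont {
    fn name(&self) -> &str {
        "percentile_cont"
    }

    fn supports_within_group_clause(&self) -> bool {
        true
    }
}

/// Hypothetical planner-side gate: reject `WITHIN GROUP` unless the
/// aggregate advertises support for it.
fn check_within_group(
    udaf: &dyn AggregateContract,
    has_within_group: bool,
) -> Result<(), String> {
    if has_within_group && !udaf.supports_within_group_clause() {
        return Err(format!(
            "WITHIN GROUP is only supported for ordered-set aggregate functions (got `{}`)",
            udaf.name()
        ));
    }
    Ok(())
}
```

Under these assumptions, `check_within_group(&Sum, true)` returns an `Err`, while `check_within_group(&PercentileCont, true)` and any call with `has_within_group == false` return `Ok(())`. The real planner check may be structured differently; the sketch only illustrates where the opt-in flag is consulted.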
diff --git a/_sources/user-guide/sql/aggregate_functions.md.txt b/_sources/user-guide/sql/aggregate_functions.md.txt
index f17e09f2ce..ba9c6ae124 100644
--- a/_sources/user-guide/sql/aggregate_functions.md.txt
+++ b/_sources/user-guide/sql/aggregate_functions.md.txt
@@ -48,6 +48,36 @@ FROM employees;
Note: When no rows pass the filter, `COUNT` returns `0` while
`SUM`/`AVG`/`MIN`/`MAX` return `NULL`.
+## WITHIN GROUP / Ordered-set aggregates
+
+Some aggregate functions accept the SQL `WITHIN GROUP (ORDER BY ...)` clause to specify the ordering the
+aggregate relies on. In DataFusion this is opt-in: only aggregate functions whose implementation returns
+`true` from `AggregateUDFImpl::supports_within_group_clause()` accept the `WITHIN GROUP` clause. Attempting to
+use `WITHIN GROUP` with a regular aggregate (for example, `SELECT SUM(x) WITHIN GROUP (ORDER BY x)`) will fail
+during planning with an error: "WITHIN GROUP is only supported for ordered-set aggregate functions".
+
+Currently, the built-in aggregate functions that support `WITHIN GROUP` are:
+
+- `percentile_cont`: exact percentile aggregate (also available as `percentile_cont(column, percentile)`)
+- `approx_percentile_cont`: approximate percentile using the t-digest algorithm
+- `approx_percentile_cont_with_weight`: approximate weighted percentile using the t-digest algorithm
+
+Note: rank-like functions such as `rank()`, `dense_rank()`, and `percent_rank()` are window functions and
+use the `OVER (...)` clause; they are not ordered-set aggregates that accept `WITHIN GROUP` in DataFusion.
+
+Example (ordered-set aggregate):
+
+```sql
+SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY value) FROM t;
+```
+
+Example (invalid usage — planner will error):
+
+```sql
+-- This will fail: SUM is not an ordered-set aggregate
+SELECT SUM(x) WITHIN GROUP (ORDER BY x) FROM t;
+```
+
## General Functions
- [array_agg](#array_agg)
diff --git a/library-user-guide/upgrading.html b/library-user-guide/upgrading.html
index 67b66e3dd3..496fe152c0 100644
--- a/library-user-guide/upgrading.html
+++ b/library-user-guide/upgrading.html
@@ -407,6 +407,25 @@
<h2>DataFusion <code class="docutils literal notranslate"><span
class="pre">52.0.0</span></code><a class="headerlink" href="#datafusion-52-0-0"
title="Link to this heading">#</a></h2>
<p><strong>Note:</strong> DataFusion <code class="docutils literal
notranslate"><span class="pre">52.0.0</span></code> has not been released yet.
The information provided in this section pertains to features and changes that
have already been merged to the main branch and are awaiting release in this
version.</p>
<p>You can see the current <a class="reference external"
href="https://github.com/apache/datafusion/issues/18566">status of the <code
class="docutils literal notranslate"><span class="pre">52.0.0</span></code>
release here</a></p>
+<section id="planner-now-requires-explicit-opt-in-for-within-group-syntax">
+<h3>Planner now requires explicit opt-in for WITHIN GROUP syntax<a
class="headerlink"
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax"
title="Link to this heading">#</a></h3>
+<p>The SQL planner now enforces the aggregate UDF contract more strictly: the
+<code class="docutils literal notranslate"><span class="pre">WITHIN</span>
<span class="pre">GROUP</span> <span class="pre">(ORDER</span> <span
class="pre">BY</span> <span class="pre">...)</span></code> syntax is accepted
only if the aggregate UDAF
+explicitly advertises support by returning <code class="docutils literal
notranslate"><span class="pre">true</span></code> from
+<code class="docutils literal notranslate"><span
class="pre">AggregateUDFImpl::supports_within_group_clause()</span></code>.</p>
+<p>Previously the planner forwarded a <code class="docutils literal
notranslate"><span class="pre">WITHIN</span> <span
class="pre">GROUP</span></code> clause to order-sensitive
+aggregates even when they did not implement ordered-set semantics, which could
+cause queries such as <code class="docutils literal notranslate"><span
class="pre">SUM(x)</span> <span class="pre">WITHIN</span> <span
class="pre">GROUP</span> <span class="pre">(ORDER</span> <span
class="pre">BY</span> <span class="pre">x)</span></code> to plan successfully.
+This behavior was too permissive and has been changed to match PostgreSQL and
+the documented semantics.</p>
+<p>Migration: If your UDAF intentionally implements ordered-set semantics and
+wants to accept the <code class="docutils literal notranslate"><span
class="pre">WITHIN</span> <span class="pre">GROUP</span></code> SQL syntax,
update your implementation to
+return <code class="docutils literal notranslate"><span
class="pre">true</span></code> from <code class="docutils literal
notranslate"><span class="pre">supports_within_group_clause()</span></code> and
handle the ordering
+semantics in your accumulator implementation. If your UDAF is merely
+order-sensitive (but not an ordered-set aggregate), leave
+<code class="docutils literal notranslate"><span class="pre">supports_within_group_clause()</span></code> returning <code class="docutils literal notranslate"><span class="pre">false</span></code>; callers should instead use
+an alternative signature (for example, explicit ordering as a function argument).</p>
+</section>
<section
id="aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false">
<h3><code class="docutils literal notranslate"><span
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now
defaults to <code class="docutils literal notranslate"><span
class="pre">false</span></code><a class="headerlink"
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false"
title="Link to this heading">#</a></h3>
<p>This method specifies whether an aggregate function allows <code
class="docutils literal notranslate"><span class="pre">IGNORE</span> <span
class="pre">NULLS</span></code>/<code class="docutils literal
notranslate"><span class="pre">RESPECT</span> <span
class="pre">NULLS</span></code>
@@ -1661,6 +1680,7 @@ take care of constructing the <code class="docutils
literal notranslate"><span c
<nav class="bd-toc-nav page-toc"
aria-labelledby="pst-page-navigation-heading-2">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link"
href="#datafusion-52-0-0">DataFusion <code class="docutils literal
notranslate"><span class="pre">52.0.0</span></code></a><ul class="nav
section-nav flex-column">
+<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#planner-now-requires-explicit-opt-in-for-within-group-syntax">Planner
now requires explicit opt-in for WITHIN GROUP syntax</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#aggregateudfimpl-supports-null-handling-clause-now-defaults-to-false"><code
class="docutils literal notranslate"><span
class="pre">AggregateUDFImpl::supports_null_handling_clause</span></code> now
defaults to <code class="docutils literal notranslate"><span
class="pre">false</span></code></a></li>
</ul>
</li>
diff --git a/searchindex.js b/searchindex.js
index e4d7e16961..cff0b88f54 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
[...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[60,"op-neq"]],"!~":[[60,"op-re-not-match"]],"!~*":[[60,"op-re-not-match-i"]],"!~~":[[60,"id19"]],"!~~*":[[60,"id20"]],"#":[[60,"op-bit-xor"]],"%":[[60,"op-modulo"]],"&":[[60,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[60,"op-multiply"]],"+":[[60,"op-plus"]],"-":[[60,"op-minus"]],"/":[[60,"op-divide"]],"<":[[60,"op-lt"]],"<
[...]
\ No newline at end of file
diff --git a/user-guide/sql/aggregate_functions.html b/user-guide/sql/aggregate_functions.html
index af3114c8b9..edf8b6b664 100644
--- a/user-guide/sql/aggregate_functions.html
+++ b/user-guide/sql/aggregate_functions.html
@@ -429,6 +429,31 @@ dev/update_function_docs.sh file for updating surrounding
text.
</div>
<p>Note: When no rows pass the filter, <code class="docutils literal
notranslate"><span class="pre">COUNT</span></code> returns <code
class="docutils literal notranslate"><span class="pre">0</span></code> while
<code class="docutils literal notranslate"><span
class="pre">SUM</span></code>/<code class="docutils literal notranslate"><span
class="pre">AVG</span></code>/<code class="docutils literal notranslate"><span
class="pre">MIN</span></code>/<code class="docutils literal notranslate">< [...]
</section>
+<section id="within-group-ordered-set-aggregates">
+<h2>WITHIN GROUP / Ordered-set aggregates<a class="headerlink"
href="#within-group-ordered-set-aggregates" title="Link to this
heading">#</a></h2>
+<p>Some aggregate functions accept the SQL <code class="docutils literal
notranslate"><span class="pre">WITHIN</span> <span class="pre">GROUP</span>
<span class="pre">(ORDER</span> <span class="pre">BY</span> <span
class="pre">...)</span></code> clause to specify the ordering the
+aggregate relies on. In DataFusion this is opt-in: only aggregate functions
whose implementation returns
+<code class="docutils literal notranslate"><span
class="pre">true</span></code> from <code class="docutils literal
notranslate"><span
class="pre">AggregateUDFImpl::supports_within_group_clause()</span></code>
accept the <code class="docutils literal notranslate"><span
class="pre">WITHIN</span> <span class="pre">GROUP</span></code> clause.
Attempting to
+use <code class="docutils literal notranslate"><span class="pre">WITHIN</span>
<span class="pre">GROUP</span></code> with a regular aggregate (for example,
<code class="docutils literal notranslate"><span class="pre">SELECT</span>
<span class="pre">SUM(x)</span> <span class="pre">WITHIN</span> <span
class="pre">GROUP</span> <span class="pre">(ORDER</span> <span
class="pre">BY</span> <span class="pre">x)</span></code>) will fail
+during planning with an error: “WITHIN GROUP is only supported for ordered-set
aggregate functions”.</p>
+<p>Currently, the built-in aggregate functions that support <code
class="docutils literal notranslate"><span class="pre">WITHIN</span> <span
class="pre">GROUP</span></code> are:</p>
+<ul class="simple">
+<li><p><code class="docutils literal notranslate"><span
class="pre">percentile_cont</span></code> — exact percentile aggregate (also
available as <code class="docutils literal notranslate"><span
class="pre">percentile_cont(column,</span> <span
class="pre">percentile)</span></code>)</p></li>
+<li><p><code class="docutils literal notranslate"><span
class="pre">approx_percentile_cont</span></code> — approximate percentile using
the t-digest algorithm</p></li>
+<li><p><code class="docutils literal notranslate"><span
class="pre">approx_percentile_cont_with_weight</span></code> — approximate
weighted percentile using the t-digest algorithm</p></li>
+</ul>
+<p>Note: rank-like functions such as <code class="docutils literal
notranslate"><span class="pre">rank()</span></code>, <code class="docutils
literal notranslate"><span class="pre">dense_rank()</span></code>, and <code
class="docutils literal notranslate"><span
class="pre">percent_rank()</span></code> are window functions and
+use the <code class="docutils literal notranslate"><span
class="pre">OVER</span> <span class="pre">(...)</span></code> clause; they are
not ordered-set aggregates that accept <code class="docutils literal
notranslate"><span class="pre">WITHIN</span> <span
class="pre">GROUP</span></code> in DataFusion.</p>
+<p>Example (ordered-set aggregate):</p>
+<div class="highlight-sql notranslate"><div
class="highlight"><pre><span></span><span class="n">percentile_cont</span><span
class="p">(</span><span class="mi">0</span><span class="p">.</span><span
class="mi">5</span><span class="p">)</span><span class="w"> </span><span
class="n">WITHIN</span><span class="w"> </span><span
class="k">GROUP</span><span class="w"> </span><span class="p">(</span><span
class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span
class="w"> </spa [...]
+</pre></div>
+</div>
+<p>Example (invalid usage — planner will error):</p>
+<div class="highlight-sql notranslate"><div
class="highlight"><pre><span></span><span class="c1">-- This will fail: SUM is
not an ordered-set aggregate</span>
+<span class="k">SELECT</span><span class="w"> </span><span
class="k">SUM</span><span class="p">(</span><span class="n">x</span><span
class="p">)</span><span class="w"> </span><span class="n">WITHIN</span><span
class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span
class="p">(</span><span class="k">ORDER</span><span class="w"> </span><span
class="k">BY</span><span class="w"> </span><span class="n">x</span><span
class="p">)</span><span class="w"> </span><span class="k" [...]
+</pre></div>
+</div>
+</section>
<section id="general-functions">
<h2>General Functions<a class="headerlink" href="#general-functions"
title="Link to this heading">#</a></h2>
<ul class="simple">
@@ -1671,6 +1696,7 @@ This aggregation function can only mix DISTINCT and ORDER
BY if the ordering exp
<nav class="bd-toc-nav page-toc"
aria-labelledby="pst-page-navigation-heading-2">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link"
href="#filter-clause">Filter clause</a></li>
+<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link"
href="#within-group-ordered-set-aggregates">WITHIN GROUP / Ordered-set
aggregates</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link"
href="#general-functions">General Functions</a><ul class="nav section-nav
flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link"
href="#array-agg"><code class="docutils literal notranslate"><span
class="pre">array_agg</span></code></a><ul class="nav section-nav flex-column">
<li class="toc-h4 nav-item toc-entry"><a class="reference internal nav-link"
href="#arguments">Arguments</a></li>
diff --git a/user-guide/sql/index.html b/user-guide/sql/index.html
index 52761f2413..bc99e7c381 100644
--- a/user-guide/sql/index.html
+++ b/user-guide/sql/index.html
@@ -462,6 +462,7 @@
</li>
<li class="toctree-l1"><a class="reference internal"
href="aggregate_functions.html">Aggregate Functions</a><ul>
<li class="toctree-l2"><a class="reference internal"
href="aggregate_functions.html#filter-clause">Filter clause</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="aggregate_functions.html#within-group-ordered-set-aggregates">WITHIN
GROUP / Ordered-set aggregates</a></li>
<li class="toctree-l2"><a class="reference internal"
href="aggregate_functions.html#general-functions">General Functions</a></li>
<li class="toctree-l2"><a class="reference internal"
href="aggregate_functions.html#statistical-functions">Statistical
Functions</a></li>
<li class="toctree-l2"><a class="reference internal"
href="aggregate_functions.html#approximate-functions">Approximate
Functions</a></li>