crm26 opened a new pull request, #23168:
URL: https://github.com/apache/datafusion/pull/23168

   ## Summary
   
   Adds `array_avg`, the last function from the split-PR pipeline tracked in 
#21536.
   
   Computes the arithmetic mean (sum divided by count) of the elements of a 
numeric array, returned as `Float64`. Templated from the merged `array_sum` 
(#22542) with the same SQL aggregate NULL conventions; sibling of `array_sum`, 
`array_product`, `array_subtract`, `array_add`, `array_scale`, and 
`array_normalize`.
   
   ## Semantics
   
   **NULL semantics — SQL aggregate convention (deliberate divergence from 
binary-op siblings):**
   - NULL row → NULL row out
   - NULL elements are **excluded** from both the sum and the count, matching 
PostgreSQL `AVG`, DuckDB `list_avg`, and Spark `aggregate`. So `array_avg([1, NULL, 
3]) = (1+3) / 2 = 2`, not `(1+3) / 3`.
   - All-NULL row → NULL out (matches `AVG(...)` over an all-NULL column)
   - **Empty array → NULL** (matches sibling `array_sum` #22542 and 
`array_product` #22703, PostgreSQL, DuckDB `list_avg`, SQL Standard 
AVG-of-empty-set)
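
   The four NULL rules above can be restated as a small reference model. This is a sketch for illustration only, not the code in this PR: `array_avg_model` is a hypothetical name, and plain `Option` values stand in for Arrow validity (the outer `Option` is the row, each inner `Option` is an element).

```rust
/// Reference model of the NULL rules (illustrative, not the PR's implementation).
/// Outer `None` = NULL row; inner `None` = NULL element.
fn array_avg_model(row: Option<&[Option<f64>]>) -> Option<f64> {
    // NULL row -> NULL out.
    let elements = row?;

    let mut sum = 0.0_f64;
    let mut count = 0_u64;

    // `flatten` drops the NULL elements, so they reach neither sum nor count.
    for value in elements.iter().flatten() {
        sum += *value;
        count += 1;
    }

    // Empty array and all-NULL array both leave count at 0 -> NULL out.
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

   Under this model, a row holding `[1, NULL, 3]` yields `2`, while an empty row, an all-NULL row, and a NULL row all yield NULL.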
   
   **Type coercion:**
   - Inner numeric types (`Float32`, `Int*`, `UInt*`) are coerced to `Float64`. 
Integer literal arguments are coerced too.
   - Return type is always `Float64` (since avg of integers can be non-integer).
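
   To show why the result is `Float64` even for integer input, here is a minimal sketch that widens each element to `f64` before averaging. The helper names are hypothetical, and casting element by element is an assumption made for illustration; the PR text only states that inner types are coerced to `Float64`.

```rust
/// Illustrative only: widen Int64 elements to f64, then average.
/// Returns None for an empty slice, mirroring the empty-array rule.
fn avg_i64_as_f64(values: &[i64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    // `as f64` may round very large magnitudes; acceptable for a sketch.
    let sum: f64 = values.iter().map(|&v| v as f64).sum();
    Some(sum / values.len() as f64)
}

/// Illustrative only: widen Float32 elements to f64 (lossless), then average.
fn avg_f32_as_f64(values: &[f32]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let sum: f64 = values.iter().map(|&v| f64::from(v)).sum();
    Some(sum / values.len() as f64)
}
```

   With integer division the mean of `[1, 2]` would truncate to `1`; widening first gives `1.5`, which is the case the SLT file checks.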
   
   **List shapes supported:** `List`, `LargeList`, `FixedSizeList`.
   
   **Alias:** `list_avg` (matches the `list_sum` / `list_product` pattern).
   
   ## Test coverage
   
   SLT (`array_avg.slt`) covers:
   - Happy paths (basic, single element, negative values, cancelling 
positive/negative, non-integer mean `[1,2] → 1.5`)
   - Every NULL shape: bare `NULL` row, NULL elements skipped, single non-NULL 
among NULLs, all-NULL array
   - Empty array → NULL
   - All 3 list shapes (`List`, `LargeList`, `FixedSizeList`)
   - `Float32` and `Int64` inner types, integer literals, integer input with a 
non-integer mean
   - Multi-row mix
   - Error paths (non-list input, zero args, two args)
   - Return type assertion (`Float64`)
   - `list_avg` alias
   
   ## Closes
   
   The last open slot in #21536. This completes the 8-function split-PR pipeline 
split out of the originally oversized PRs #21371 / #21376.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

