[ 
https://issues.apache.org/jira/browse/SPARK-57023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-57023:
-----------------------------
    Description: 
h3. Problem

When the input of a decimal aggregate is wrapped in a widening, 
scale-preserving {{Cast}} (e.g. {{Cast(d as Decimal(p+k, s))}} with {{k >= 0}} 
and the scale {{s}} unchanged), {{MIN}}/{{MAX}} evaluate in the wider type, even 
though the cast is order-preserving and the min/max value is bit-identical in 
the narrower type. This bloats the generated code and the shuffle payload, and 
costs one {{Decimal.changePrecision}} call per row.
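
The order-preservation claim can be checked on a toy model. The sketch below is not Spark code: {{Dec}}, {{widen}}, {{dec_min}} and {{dec_max}} are hypothetical names, and a decimal is modelled as an unscaled integer plus a (precision, scale) pair. Under that assumption, a scale-preserving widening leaves the unscaled digits untouched, so taking MIN/MAX before or after the cast selects the same value.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Dec:
    """Toy fixed-point decimal: value = unscaled / 10**scale (hypothetical model)."""
    unscaled: int
    precision: int
    scale: int


def widen(d: Dec, k: int) -> Dec:
    """Scale-preserving widening cast: Decimal(p, s) -> Decimal(p + k, s)."""
    if k < 0:
        raise ValueError("k must be non-negative for a widening cast")
    # The scale is unchanged, so the unscaled digits are carried over as-is.
    return Dec(d.unscaled, d.precision + k, d.scale)


def dec_max(values: List[Dec]) -> Dec:
    # All inputs share one scale, so comparing unscaled ints orders the values.
    return max(values, key=lambda d: d.unscaled)


def dec_min(values: List[Dec]) -> Dec:
    return min(values, key=lambda d: d.unscaled)


# A Decimal(6, 2) column holding 12.50, -3.00 and 999.99.
column = [Dec(1250, 6, 2), Dec(-300, 6, 2), Dec(99999, 6, 2)]

# Cast every row, then aggregate (the unoptimized shape).
cast_then_max = dec_max([widen(d, 4) for d in column])
cast_then_min = dec_min([widen(d, 4) for d in column])

# Aggregate in the narrow type, then cast the single result (the peeled shape).
max_then_cast = widen(dec_max(column), 4)
min_then_cast = widen(dec_min(column), 4)

print(cast_then_max == max_then_cast, cast_then_min == min_then_cast)
```

In the peeled shape the widening is applied once to the result rather than once per row. Whether the real rule re-applies the cast on top of the aggregate is an assumption of this sketch.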

{{SUM}} and {{AVG}} already peel this pattern in {{DecimalAggregates}} (see 
SPARK-3933 for the original arm and SPARK-56983 for the {{evalMode}}-aware 
follow-up). {{MIN}}/{{MAX}} were never extended.

h3. Proposal

Extend {{DecimalAggregates}} with a {{MIN}}/{{MAX}} arm that peels the widening 
{{Cast}} when:

* the child's scale equals the target scale and the child's precision does not 
exceed the target precision, and
* {{evalMode}} of the surrounding aggregate is preserved (mirrors SPARK-56983 
semantics).

Refactor the existing SUM/AVG arms to share a {{WidenedDecimalChild}} extractor 
(private object) so the new MIN/MAX arm reuses the predicate.
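
One way to picture the shared predicate and the new arm is the toy expression model below. It is a sketch under stated assumptions, not the Catalyst implementation: the classes {{DecimalType}}, {{Column}}, {{Cast}} and {{Agg}} are simplified stand-ins, {{widened_decimal_child}} only borrows its name from the proposed {{WidenedDecimalChild}} extractor, and "preserving {{evalMode}}" is interpreted here as carrying the cast's mode over unchanged to the rewritten expression.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class DecimalType:
    precision: int
    scale: int


@dataclass(frozen=True)
class Column:
    name: str
    dtype: DecimalType


@dataclass(frozen=True)
class Cast:
    child: "Expr"
    dtype: DecimalType
    eval_mode: str = "LEGACY"  # stand-in for the cast's evalMode


@dataclass(frozen=True)
class Agg:
    func: str  # "MIN" or "MAX" in this sketch
    child: "Expr"


Expr = Union[Column, Cast, Agg]


def dtype_of(expr: Expr) -> DecimalType:
    # MIN/MAX return their input type; columns and casts carry their own type.
    return dtype_of(expr.child) if isinstance(expr, Agg) else expr.dtype


def widened_decimal_child(expr: Expr) -> Optional[Expr]:
    """Return the cast's child if expr is a widening, scale-preserving cast."""
    if not isinstance(expr, Cast):
        return None
    src, dst = dtype_of(expr.child), expr.dtype
    if src.scale == dst.scale and src.precision <= dst.precision:
        return expr.child
    return None


def peel_min_max(expr: Expr) -> Expr:
    """Rewrite MIN/MAX(Cast(child)) to Cast(MIN/MAX(child)) when the peel is safe."""
    if isinstance(expr, Agg) and expr.func in ("MIN", "MAX"):
        inner = widened_decimal_child(expr.child)
        if inner is not None:
            cast = expr.child
            # Aggregate in the narrow type; keep the target type and eval mode.
            return Cast(Agg(expr.func, inner), cast.dtype, cast.eval_mode)
    return expr


narrow = Column("d", DecimalType(10, 2))
peeled = peel_min_max(Agg("MAX", Cast(narrow, DecimalType(14, 2), "ANSI")))
rescaled = peel_min_max(Agg("MAX", Cast(narrow, DecimalType(14, 4))))
print(peeled)
print(rescaled)
```

A cast that changes the scale or narrows the precision does not match the predicate, so the expression is returned unchanged. That mirrors the restriction that the peel is only bit-identical when scale is preserved and the child precision fits.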

h3. Scope (non-goals)

* No changes to ANSI overflow semantics: the peel is bit-identical only when 
scale is preserved and child precision fits, and it is restricted to that case.
* No new SQL surface, no new SQLConf.

h3. References

* SPARK-3933 — original DecimalAggregates SUM/AVG peel
* SPARK-56983 — evalMode-preserving variant (sibling implementation)

         Labels: decimal optimizer  (was: )
       Priority: Minor  (was: Major)

> Peel scale-preserving widening decimal Cast in front of MIN/MAX
> ---------------------------------------------------------------
>
>                 Key: SPARK-57023
>                 URL: https://issues.apache.org/jira/browse/SPARK-57023
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 5.0.0
>            Reporter: Kent Yao
>            Priority: Minor
>              Labels: decimal, optimizer


