[ 
https://issues.apache.org/jira/browse/SPARK-57826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-57826:
-----------------------------
    Shepherd: Max Gekk

> Support nanosecond-precision timestamps in 
> approx_percentile/percentile_approx and histogram_numeric
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-57826
>                 URL: https://issues.apache.org/jira/browse/SPARK-57826
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Priority: Major
>
> This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond 
> precision).
> h2. Problem
> {{ApproximatePercentile}} (aggregate/ApproximatePercentile.scala ~L104-209) 
> lists {{TimestampType}}, {{TimestampNTZType}}, and {{AnyTimeType}} in 
> {{inputTypes}} but omits {{AnyTimestampNanoType}}. Its value path converts 
> inputs with {{value.toDouble}} and results with {{.toLong}}, which does not 
> work for {{TimestampNanosVal}} because it is not a {{Number}}. 
> {{HistogramNumeric}} (aggregate/HistogramNumeric.scala ~L80-168) follows the 
> same pattern ({{asInstanceOf[Number].doubleValue()}}). As a result, 
> microsecond timestamps work, while nanosecond types are rejected at analysis 
> and would fail at runtime even if the type check passed. 
> The fix mirrors the TIME extension added by SPARK-57557.
> h2. Goal
> Accept {{AnyTimestampNanoType}} in both aggregates and convert 
> {{TimestampNanosVal}} to and from the internal double representation without 
> losing sub-microsecond precision (e.g. via epoch seconds + fractional nanos). 
> The result should be a nanosecond timestamp with the same precision and 
> family as the input.
> h2. Scope
> Extend {{inputTypes}}; add nanosecond value<->double conversion in 
> update/merge/eval; preserve precision on the result type. Follow the 
> SPARK-57557 pattern.
> h2. Acceptance criteria
> * {{approx_percentile}} / {{percentile_approx}} / {{histogram_numeric}} over 
> NTZ/LTZ nanosecond timestamps return correctly-typed nanosecond results; 
> accuracy comparable to the microsecond path.
> h2. Testing
> {{ApproximatePercentileQuerySuite}}, {{HistogramNumericSuite}}; nanos golden 
> files.
> h2. Dependencies
> None - independent (mirrors the TIME work SPARK-57557).
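
The precision concern in the Goal section can be illustrated with a small sketch. It is not Spark code: the helper names below are hypothetical, and Python is used only because its {{float}} is an IEEE 754 double, like Scala's {{Double}}. The sketch shows that a present-day epoch value in microseconds survives a round trip through a double, an epoch value in nanoseconds does not, and splitting the value into whole seconds plus nanos-of-second (one possible reading of "epoch seconds + fractional nanos") keeps both parts exact as integers.

```python
from typing import Tuple

NANOS_PER_SECOND = 1_000_000_000


def roundtrips_through_double(value: int) -> bool:
    """Return True if an integer survives int -> double -> int unchanged."""
    return int(float(value)) == value


def split_epoch_nanos(epoch_nanos: int) -> Tuple[int, int]:
    """Split epoch nanoseconds into (epoch seconds, nanos within the second).

    divmod floors, so the nanos part stays in [0, 1e9) for pre-1970 values too.
    """
    return divmod(epoch_nanos, NANOS_PER_SECOND)


def join_epoch_nanos(seconds: int, nanos: int) -> int:
    """Inverse of split_epoch_nanos."""
    return seconds * NANOS_PER_SECOND + nanos


# An arbitrary instant, expressed at both precisions (assumed sample values).
epoch_micros = 1_700_000_000_123_457
epoch_nanos = 1_700_000_000_123_456_789

print(roundtrips_through_double(epoch_micros))  # below 2**53, so exact
print(roundtrips_through_double(epoch_nanos))   # above 2**53, low bits are lost
print(split_epoch_nanos(epoch_nanos))
```

Values of this magnitude lie between 2^60 and 2^61, where adjacent doubles are 256 apart, so a single double holding epoch nanoseconds can only resolve steps of roughly a quarter of a microsecond. How the aggregates should carry the two components through update/merge/eval is left to the implementation; the sketch only shows why a direct {{toDouble}} on epoch nanoseconds is not enough.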



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
