Max Gekk created SPARK-57842:
--------------------------------
Summary: Support RANGE window frames with interval bounds over
nanosecond-precision timestamps
Key: SPARK-57842
URL: https://issues.apache.org/jira/browse/SPARK-57842
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond
precision).
h2. Problem
Interval-bounded RANGE frames are currently accepted only for
microsecond-precision timestamps. Three sites enforce this:
* {{windowExpressions.scala}}, {{isValidFrameType}} (~L110-117): allows interval
bounds only for {{TimestampType | TimestampNTZType}}.
* {{WindowFrameTypeCoercion.createBoundaryCast}}
(analysis/TypeCoercionHelper.scala ~L716-724): special-cases {{TimestampType}}.
* {{WindowEvaluatorFactoryBase}} (~L110-122): builds the bound only via the
microsecond {{TimestampAddInterval}} or the year-month path.
As a result, a nanosecond ORDER BY column with {{RANGE BETWEEN INTERVAL ...}}
fails with {{RANGE_FRAME_INVALID_TYPE}}.
h2. Goal
Allow a nanosecond-precision ordering column with interval RANGE bounds. Boundary
values are computed with the nanosecond-aware arithmetic (day-time intervals were
done in SPARK-57501; year-month intervals come via SPARK-57825), so that the
sub-microsecond remainder is preserved.
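The intended effect of "preserving the remainder" can be illustrated with a small sketch. This is not Spark code: {{NanoTimestamp}}, {{add_day_time_interval}} and {{lower_bound_preceding}} are hypothetical names, and the model (microseconds since the epoch plus a 0..999 nanosecond remainder) is an assumption made only for illustration. It relies on day-time intervals having microsecond resolution.

```python
from typing import NamedTuple

MICROS_PER_DAY = 24 * 60 * 60 * 1_000_000


class NanoTimestamp(NamedTuple):
    """Hypothetical model: epoch microseconds plus a 0..999 ns remainder."""
    epoch_micros: int
    nanos_within_micro: int


def add_day_time_interval(ts: NanoTimestamp, interval_micros: int) -> NanoTimestamp:
    # A day-time interval has microsecond resolution, so only the microsecond
    # part moves; the sub-microsecond remainder is carried over unchanged.
    return NanoTimestamp(ts.epoch_micros + interval_micros, ts.nanos_within_micro)


def lower_bound_preceding(ts: NanoTimestamp, interval_micros: int) -> NanoTimestamp:
    # "INTERVAL ... PRECEDING" subtracts the interval from the current row's value.
    return add_day_time_interval(ts, -interval_micros)


current = NanoTimestamp(epoch_micros=1_700_000_000_000_000, nanos_within_micro=789)
lower = lower_bound_preceding(current, MICROS_PER_DAY)
```

Here {{lower}} is exactly one day earlier and still carries the 789 ns remainder. A bound built with microsecond-only arithmetic would have to drop that remainder.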
h2. Scope
Extend the three sites above to accept {{AnyTimestampNanoType}}; route boundary
arithmetic through the nanosecond-aware add expression.
h2. Acceptance criteria
* {{OVER (ORDER BY ts_nanos RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND
CURRENT ROW)}} analyzes and evaluates correctly. ROWS frames already work.
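To make "evaluates correctly" concrete, the following sketch models the expected frame membership for the criterion above. It is a plain-Python reference model, not Spark's evaluator: {{range_frame_counts}} is a hypothetical helper, and timestamps are assumed to be integers holding nanoseconds since the epoch. A row's frame holds every row whose ordering value lies in {{[ts - 1 day, ts]}}, including peers of the current row.

```python
from bisect import bisect_left, bisect_right

NANOS_PER_DAY = 86_400 * 1_000_000_000


def range_frame_counts(sorted_ts_nanos, preceding_nanos=NANOS_PER_DAY):
    """Count the rows in each row's RANGE frame [ts - preceding, ts].

    Assumes the input is sorted ascending, as an ORDER BY would produce.
    """
    counts = []
    for ts in sorted_ts_nanos:
        # First row whose value is >= the lower bound.
        lo = bisect_left(sorted_ts_nanos, ts - preceding_nanos)
        # One past the last row whose value is <= ts (peers are included).
        hi = bisect_right(sorted_ts_nanos, ts)
        counts.append(hi - lo)
    return counts


base = 1_700_000_000 * 1_000_000_000
rows = [base, base + 1, base + NANOS_PER_DAY, base + NANOS_PER_DAY + 2]
counts = range_frame_counts(rows)
```

For these four rows {{counts}} is {{[1, 2, 3, 2]}}. The last row excludes the first two because they fall 2 ns and 1 ns before its lower bound, a distinction that would be lost if the bound were truncated to microseconds.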
h2. Testing
{{DataFrameWindowFunctionsSuite}} / window SQL golden files.
h2. Dependencies
Blocked by SPARK-57825 (year-month interval bound arithmetic). The day-time bound
uses SPARK-57501, which is already resolved.