Max Gekk created SPARK-57829:
--------------------------------
Summary: Support window, session_window and window_time over
nanosecond-precision timestamps
Key: SPARK-57829
URL: https://issues.apache.org/jira/browse/SPARK-57829
Project: Spark
Issue Type: Sub-task
Components: Structured Streaming
Affects Versions: 4.3.0
Reporter: Max Gekk
This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond
precision).
h2. Problem
{{TimeWindow}} (expressions/TimeWindow.scala ~L100-111) and {{SessionWindow}}
(expressions/SessionWindow.scala ~L70-81) accept only {{AnyTimestampType}}
(microsecond) time columns, and window resolution ({{TimeWindowResolution}})
uses {{PreciseTimestampConversion}}, an identity conversion on the microsecond
value. {{window_time}} inherits the element type of the window struct. As a
result, nanosecond time columns are rejected at analysis, and all bucketing is
microsecond-based. This applies to both batch and streaming queries.
h2. Goal
Support nanosecond time columns in tumbling/sliding {{window}},
{{session_window}}, and {{window_time}}, with bucket boundaries computed at the
source precision.
h2. Scope
* Accept {{AnyTimestampNanoType}} in the window input-type checks.
* Extend the resolution/rewrite to compute buckets from {{TimestampNanosVal}}.
* Ensure the produced window struct {{start}} / {{end}} types are consistent
(nanosecond, or a documented microsecond rounding).
h2. Acceptance criteria
* {{GROUP BY window(ts_nanos, '1 second')}} and {{session_window(ts_nanos,
...)}} analyze and produce correct buckets; {{window_time}} returns a
consistent type.
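
To make "correct buckets" concrete for {{session_window}}, the sketch below groups events into sessions at nanosecond precision and compares the result with the same events truncated to microseconds. It rests on labeled assumptions: timestamps are integer nanoseconds, sessions are half-open intervals that an event extends only when it arrives strictly before the current session end, and microsecond conversion truncates toward negative infinity. None of these is confirmed by this issue, and the function names are hypothetical.

```python
def session_windows(timestamps_ns, gap_ns):
    """Group timestamps into sessions and return a list of (start, end) pairs.

    Assumption of this sketch: a session is the half-open interval
    [first_event, last_event + gap_ns), and an event joins the current
    session only if it is strictly earlier than the session end."""
    sessions = []
    for ts in sorted(timestamps_ns):
        if sessions and ts < sessions[-1][1]:
            start, _ = sessions[-1]
            sessions[-1] = (start, ts + gap_ns)  # extend the open session
        else:
            sessions.append((ts, ts + gap_ns))  # open a new session
    return sessions


def truncate_to_micros(ts_ns):
    """Drop the sub-microsecond digits (floor), keeping the unit as ns."""
    return ts_ns // 1000 * 1000


GAP_NS = 1_000_000_000  # 1 second
events_ns = [900, 1_000_000_100]

# Source precision: the events are 999_999_200 ns apart, inside the gap.
at_nanos = session_windows(events_ns, GAP_NS)

# Microsecond precision: the events become 0 and 1_000_000_000, exactly
# one gap apart, so under the strict rule they no longer share a session.
at_micros = session_windows([truncate_to_micros(t) for t in events_ns], GAP_NS)
```

Under these assumptions {{at_nanos}} holds one session and {{at_micros}} holds two, which is the kind of boundary case a nanosecond-aware test would need to pin down. With a different boundary rule the discrepancy moves to other inputs but does not disappear.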
h2. Testing
Cover the change in {{DataFrameTimeWindowingSuite}} and
{{DataFrameSessionWindowingSuite}}, plus the streaming window tests.
h2. Dependencies
No hard dependencies. Day-time bucketing relies on SPARK-57501, which is
already resolved, and the year-month case reuses the year-month interval
sub-task. This sub-task is a prerequisite for the streaming stateful-operators
sub-task.