[
https://issues.apache.org/jira/browse/SPARK-57830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Max Gekk updated SPARK-57830:
-----------------------------
Shepherd: Max Gekk
> Support event-time watermark on nanosecond-precision timestamp columns
> ----------------------------------------------------------------------
>
> Key: SPARK-57830
> URL: https://issues.apache.org/jira/browse/SPARK-57830
> Project: Spark
> Issue Type: Sub-task
> Components: Structured Streaming
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Major
>
> This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond
> precision).
> h2. Problem
> {{CheckAnalysis}} (analysis/CheckAnalysis.scala ~L651-661) accepts an
> event-time column only if it is {{TimestampType}} (or a window struct whose
> {{end}} is {{TimestampType}}); a nanosecond event-time column fails with
> {{EVENT_TIME_IS_NOT_ON_TIMESTAMP_TYPE}}. Downstream, the watermark predicate
> in {{statefulOperators.scala}} (~L672-680) builds {{Literal(watermarkMs *
> 1000)}} and compares it as a microsecond {{Long}}, and dedup-within-watermark
> reads the event time via {{getLong}} - incompatible with the 16-byte
> {{TimestampNanosVal}}.
> h2. Goal
> Allow a nanosecond timestamp column as the event-time / watermark column,
> with the watermark threshold compared correctly against the nanosecond value.
> h2. Scope
> Extend the {{CheckAnalysis}} event-time type check to accept
> {{AnyTimestampNanoType}}; make the watermark predicate and eviction
> read/compare the nanosecond value (epoch micros + remainder) rather than
> assuming a microsecond {{Long}}.
> h2. Acceptance criteria
> * {{withWatermark}} on a nanosecond column analyzes; late-record dropping and
> watermark advancement are correct to nanosecond resolution.
> h2. Testing
> {{EventTimeWatermarkSuite}} and streaming dedup tests with nanosecond event
> time.
> h2. Dependencies
> None hard. PREREQ for the streaming stateful-operators sub-task.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]