[jira] [Created] (SPARK-57162) Add nanosecond-aware TimestampFormatter for parsing and formatting TimestampNanosVal with precision

Max Gekk (Jira) Sat, 30 May 2026 01:11:14 -0700

Max Gekk created SPARK-57162:
--------------------------------

             Summary: Add nanosecond-aware TimestampFormatter for parsing and formatting TimestampNanosVal with precision
                 Key: SPARK-57162
                 URL: https://issues.apache.org/jira/browse/SPARK-57162
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk



h2. What

Extend the {{TimestampFormatter}} family so it can parse a string into
{{org.apache.spark.unsafe.types.TimestampNanosVal}} ({{epochMicros: Long}} +
{{nanosWithinMicro: Short}} in [0, 999]) and format a {{TimestampNanosVal}} back
to a string with a target fractional precision {{p}} in [7, 9].
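
To make the two-field representation concrete, here is a minimal Java sketch of splitting a {{java.time.Instant}} into epoch microseconds plus a 0..999 nanosecond remainder, and joining them back. {{NanosValSketch}} and its methods are hypothetical stand-ins invented for illustration; they are not the real {{TimestampNanosVal}} API, whose constructors and accessors may differ.

```java
import java.time.Instant;

/** Hypothetical stand-in for TimestampNanosVal: epoch micros plus a 0..999 nanosecond remainder. */
final class NanosValSketch {
    final long epochMicros;
    final short nanosWithinMicro;

    NanosValSketch(long epochMicros, short nanosWithinMicro) {
        if (nanosWithinMicro < 0 || nanosWithinMicro > 999) {
            throw new IllegalArgumentException("nanosWithinMicro out of [0, 999]: " + nanosWithinMicro);
        }
        this.epochMicros = epochMicros;
        this.nanosWithinMicro = nanosWithinMicro;
    }

    /** Splits an Instant into whole microseconds since the epoch and the leftover nanoseconds. */
    static NanosValSketch fromInstant(Instant instant) {
        // Instant.getNano() is always in [0, 999_999_999], including for pre-epoch instants,
        // so plain division and remainder are safe here.
        long micros = Math.addExact(
            Math.multiplyExact(instant.getEpochSecond(), 1_000_000L),
            instant.getNano() / 1_000);
        short remainder = (short) (instant.getNano() % 1_000);
        return new NanosValSketch(micros, remainder);
    }

    /** Rebuilds the Instant; floorDiv/floorMod keep pre-epoch values correct. */
    Instant toInstant() {
        long seconds = Math.floorDiv(epochMicros, 1_000_000L);
        long microOfSecond = Math.floorMod(epochMicros, 1_000_000L);
        return Instant.ofEpochSecond(seconds, microOfSecond * 1_000 + nanosWithinMicro);
    }
}
```

Under these assumptions, {{1969-12-31T23:59:59.999999999Z}} maps to {{epochMicros = -1}} and {{nanosWithinMicro = 999}}, so the remainder stays non-negative on the pre-epoch side.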

Parent: SPARK-56822. Builds on SPARK-57032 (raw string parsing for nanosecond
fractional precision), which covers only
{{SparkDateTimeUtils.parseTimestampString}}, not the pattern-based / format
(write) side used by datasources.

h2. Why

Today {{TimestampFormatter}} is microsecond-only: every {{parse}} /
{{parseWithoutTimeZone}} returns a {{Long}} of epoch microseconds, and every
{{format}} overload consumes microseconds.
{{Iso8601TimestampFormatter.extractMicros}} reads
{{ChronoField.MICRO_OF_SECOND}}, discarding the 7th-9th fractional digits, and
the legacy {{FAST_DATE_FORMAT}} path caps at millisecond/microsecond
resolution. There is no API that yields or consumes {{TimestampNanosVal}}.
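
The digit loss can be reproduced with plain {{java.time}}, independent of Spark. The sketch below reads both {{ChronoField.MICRO_OF_SECOND}} and {{ChronoField.NANO_OF_SECOND}} from the same parsed value; the class and method names are hypothetical and exist only for this illustration.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoField;

/** Illustrative only: shows what each ChronoField exposes for the same parsed value. */
final class MicroVsNanoField {
    /** Fraction of second in microseconds; digits 7-9 are not representable here. */
    static int microOfSecond(String isoLocalDateTime) {
        return LocalDateTime.parse(isoLocalDateTime).get(ChronoField.MICRO_OF_SECOND);
    }

    /** Fraction of second in nanoseconds; keeps all nine digits. */
    static int nanoOfSecond(String isoLocalDateTime) {
        return LocalDateTime.parse(isoLocalDateTime).get(ChronoField.NANO_OF_SECOND);
    }

    /** The part a microsecond-only reader drops: nanoseconds past the last whole microsecond. */
    static int nanosWithinMicro(String isoLocalDateTime) {
        return nanoOfSecond(isoLocalDateTime) % 1_000;
    }
}
```

For {{2026-05-30T01:11:14.123456789}}, {{microOfSecond}} yields 123456, {{nanoOfSecond}} yields 123456789, and {{nanosWithinMicro}} yields 789, which is the remainder a nanos-aware formatter needs to carry.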

The JSON and CSV datasources (and other text-based paths) drive all timestamp
parsing and formatting through {{TimestampFormatter}} with user-supplied
{{timestampFormat}} patterns, so they cannot round-trip 7-9 digit fractions
until the formatter is nanos-aware. This ticket is the foundational unblocker
for nanosecond support in those datasources.

h2. Scope

{{sql/api/.../util/TimestampFormatter.scala}}
* Add nanos-aware parse methods returning {{TimestampNanosVal}} (LTZ and NTZ /
without-time-zone variants), and {{Optional}} counterparts mirroring
{{parseOptional}} / {{parseWithoutTimeZoneOptional}}.
* Add format methods accepting {{TimestampNanosVal}} plus the target precision
{{p}}, with defined truncation/rounding of sub-precision digits.
* Cover the implementations: {{Iso8601TimestampFormatter}} (extend
{{extractMicros}} to also capture {{NANO_OF_SECOND}} remainder),
{{DefaultTimestampFormatter}} (delegate to the SPARK-57032 nanos parse), and
the legacy {{LegacyFastTimestampFormatter}} (define behavior or explicitly
reject nanos in LEGACY mode).
* Support fraction patterns up to 9 digits ({{[.SSSSSSS]}} .. {{[.SSSSSSSSS]}})
in both parse and format ({{DateTimeFormatterHelper}} already appends
{{NANO_OF_SECOND}} 0..9).
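
As a rough illustration of the fraction patterns in the last bullet, the sketch below uses the stock {{java.time.format.DateTimeFormatter}} rather than Spark's {{DateTimeFormatterHelper}}, which builds its formatters differently and may not behave identically. The class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative only: 7-9 digit fraction patterns with the stock java.time formatter. */
final class FractionPatternSketch {
    /** Parses text with the given pattern; an optional section such as [.SSSSSSSSS] may be absent. */
    static LocalDateTime parse(String pattern, String text) {
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    /** Formats with the given pattern; a run of n 'S' letters emits exactly n fraction digits. */
    static String format(String pattern, LocalDateTime value) {
        return DateTimeFormatter.ofPattern(pattern).format(value);
    }
}
```

With the stock formatter, parsing {{2026-05-30 01:11:14.123456789}} against {{yyyy-MM-dd HH:mm:ss[.SSSSSSSSS]}} keeps all nine digits, and formatting the result with {{yyyy-MM-dd HH:mm:ss.SSSSSSS}} drops the last two digits without rounding. Whether the Spark formatter should truncate or round is left open by this ticket.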

h2. Out of scope

* JSON/CSV converter and schema-inference wiring (separate sub-tasks; they
depend on this).
* Raw string parsing already handled by SPARK-57032.
* Datasource option additions.

h2. Design notes

* Precision {{p}} controls how many fractional digits are emitted on format and
how sub-precision input is handled on parse (truncate vs round) - document and
test the chosen rule.
* Reuse the existing {{TimestampNanosVal}} normalization invariant
(nanosWithinMicro in [0, 999]); carry overflow into {{epochMicros}}.
* Keep all existing microsecond methods unchanged (additive API).
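
The first two notes can be sketched as follows. This is one possible reading, not the chosen design: it assumes truncation (not rounding) of sub-precision digits, and {{PrecisionSketch}}, {{normalize}} and {{fraction}} are hypothetical names that do not exist in Spark.

```java
import java.util.Locale;

/** Illustrative only: normalization with carry, and fraction rendering at precision p. */
final class PrecisionSketch {
    /**
     * Restores the invariant nanosWithinMicro in [0, 999] by carrying whole
     * microseconds into epochMicros. Returns {epochMicros, nanosWithinMicro}.
     */
    static long[] normalize(long epochMicros, long nanos) {
        long carry = Math.floorDiv(nanos, 1_000L);      // negative nanos borrow from micros
        long remainder = Math.floorMod(nanos, 1_000L);  // always in [0, 999]
        return new long[] { Math.addExact(epochMicros, carry), remainder };
    }

    /**
     * Renders the fraction-of-second digits at precision p in [7, 9].
     * ASSUMPTION: digits beyond p are truncated; the ticket leaves truncate vs round open.
     */
    static String fraction(long epochMicros, int nanosWithinMicro, int p) {
        if (p < 7 || p > 9) {
            throw new IllegalArgumentException("p must be in [7, 9]: " + p);
        }
        // floorMod keeps the micro-of-second non-negative for pre-epoch values.
        long microOfSecond = Math.floorMod(epochMicros, 1_000_000L);
        String nineDigits = String.format(Locale.ROOT, "%06d%03d", microOfSecond, nanosWithinMicro);
        return nineDigits.substring(0, p);
    }
}
```

Under the truncation assumption, {{epochMicros = 1123456}} with {{nanosWithinMicro = 789}} renders as {{1234567}} at p = 7 and {{123456789}} at p = 9. A rounding rule would instead have to handle a carry that can ripple into the seconds field.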

h2. How was this patch tested

* {{TimestampFormatterSuite}} (or new cases): parse/format round-trip for p in
[7, 9] across ISO default and custom patterns; boundary values
(nanosWithinMicro 0 and 999, pre-epoch instants, Long micro boundaries);
LEGACY-mode behavior; truncation/rounding rule.

h2. Does this PR introduce any user-facing change

No. Additive formatter API gated for use behind
{{spark.sql.timestampNanosTypes.enabled}} by its callers.



