Max Gekk created SPARK-57551:
--------------------------------
Summary: Extend the TIME data type precision to nanoseconds (up to 9)
Key: SPARK-57551
URL: https://issues.apache.org/jira/browse/SPARK-57551
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
h2. What
Extend the fractional-seconds precision of the {{TIME}} data type from the
current maximum of 6 (microseconds) to 9 (nanoseconds). After this change,
{{TIME(p)}} accepts {{0 <= p <= 9}}.
h2. Why
* Internal storage is *already* nanoseconds-since-midnight ({{Long}}),
introduced by SPARK-52460. {{TimeType.NANOS_PRECISION = 9}} is already defined;
only the cap {{TimeType.MAX_PRECISION = 6}} prevents using it.
* ANSI SQL (ISO/IEC 9075-2, 6.1 <data type>) makes the maximum
{{<time precision>}} implementation-defined with the sole constraint that it is
*not less than 6*, and Syntax Rule 36 requires the maximum of
{{<time precision>}} and {{<timestamp precision>}} to be *the same*
implementation-defined value.
* This worktree already supports nanosecond timestamps via
{{TimestampNTZNanosType}} / {{TimestampLTZNanos}} (precision 7..9). To stay
ANSI-consistent, {{TIME}} must reach precision 9 in lockstep.
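
The storage argument above can be checked with a little arithmetic. The following is a minimal Python sketch, not Spark code: the helper {{to_nanos_of_day}} and the constants are hypothetical names introduced only for illustration. It encodes a wall-clock time as nanoseconds since midnight and confirms that the largest possible value fits comfortably in a signed 64-bit integer.

```python
# Illustrative only: names below are assumptions, not Spark identifiers.
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_DAY = 24 * 60 * 60 * NANOS_PER_SECOND
INT64_MAX = 2**63 - 1  # upper bound of a signed 64-bit Long


def to_nanos_of_day(hour, minute, second, nanos=0):
    """Encode a wall-clock time as nanoseconds since midnight."""
    if not (0 <= hour < 24 and 0 <= minute < 60 and 0 <= second < 60):
        raise ValueError("time field out of range")
    if not 0 <= nanos < NANOS_PER_SECOND:
        raise ValueError("nanosecond fraction out of range")
    return ((hour * 60 + minute) * 60 + second) * NANOS_PER_SECOND + nanos


# 23:59:59.999999999 is the largest value a TIME(9) column would hold.
largest = to_nanos_of_day(23, 59, 59, 999_999_999)
print(largest, largest < INT64_MAX)
```

Because the stored value already has nanosecond resolution, raising the precision cap would not require a new physical representation under this model.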
h2. Scope
* Lift {{TimeType.MAX_PRECISION}} from 6 to 9 and update precision validation in
{{TimeType}} and {{DataTypeAstBuilder}}.
* Update {{SparkDateTimeUtils.truncateTimeToPrecision}} (and its
{{<= MAX_PRECISION}} assertion) to support p in 7..9.
* Time formatters/parsers ({{TimeFormatter}}, {{FractionTimeFormatter}}) must
format and parse 7..9 fractional digits.
* Parquet I/O: the writer currently emits the {{TIME(MICROS)}} logical type;
emit {{TIME(NANOS)}} for p in 7..9 and read it back ({{TimeTypeParquetOps}},
{{ParquetSchemaConverter}}, {{ParquetWriteSupport}}, {{ParquetRowConverter}},
vectorized reader).
* Verify casts already implemented for TIME (TIME(p1)->TIME(p2), TIME->DECIMAL,
TIME->integral, STRING<->TIME) behave correctly for p in 7..9.
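
To make the truncation and Parquet items above concrete, here is a minimal Python sketch under stated assumptions. It is not the Spark implementation: the real signature of {{SparkDateTimeUtils.truncateTimeToPrecision}} is not shown in this issue, and the function names, the {{MAX_PRECISION}} constant, and the string unit labels below are hypothetical. It assumes values are non-negative nanoseconds since midnight and that digits beyond precision p are dropped rather than rounded.

```python
# Illustrative only: a Python model of the proposed behavior, not Spark code.
MAX_PRECISION = 9  # proposed cap; the issue states the current cap is 6


def truncate_time_to_precision(nanos, p):
    """Zero out fractional-second digits beyond precision p (no rounding)."""
    if not 0 <= p <= MAX_PRECISION:
        raise ValueError(f"precision must be in 0..{MAX_PRECISION}, got {p}")
    scale = 10 ** (9 - p)  # size of one unit at precision p, in nanoseconds
    return nanos - nanos % scale


def parquet_time_unit(p):
    """Pick a Parquet TIME unit: MICROS up to p = 6, NANOS for p in 7..9."""
    if not 0 <= p <= MAX_PRECISION:
        raise ValueError(f"precision must be in 0..{MAX_PRECISION}, got {p}")
    return "NANOS" if p > 6 else "MICROS"


value = 45_296_123_456_789  # 12:34:56.123456789 as nanoseconds since midnight
for p in (6, 7, 9):
    print(p, truncate_time_to_precision(value, p), parquet_time_unit(p))
```

Keeping {{MICROS}} for p <= 6 in this sketch mirrors the issue's statement that only p in 7..9 switches to {{TIME(NANOS)}}, so files written at existing precisions would keep their current logical type.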
h2. Out of scope
* Casts to/from TIMESTAMP types (tracked separately).
* TIME WITH TIME ZONE (non-goal per SPARK-51162).
h2. Acceptance criteria
* {{TIME(7)}}, {{TIME(8)}}, {{TIME(9)}} can be declared, parsed, and used as
literals.
* Round-trip through Parquet preserves nanosecond values.
* Existing TIME tests pass; new tests cover the 7..9 range.
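
The string round-trip implied by these criteria can be sketched as follows. This is a hedged Python illustration, not the behavior of {{TimeFormatter}} or {{FractionTimeFormatter}}: {{parse_time}} and {{format_time}} are hypothetical helpers, the accepted text format ({{HH:MM:SS}} with up to 9 fractional digits) is an assumption, and the formatter here always prints exactly p digits, which may differ from how Spark handles trailing zeros.

```python
import re

# Assumed text format: HH:MM:SS with an optional fraction of 1..9 digits.
_TIME_PATTERN = re.compile(r"(\d{2}):(\d{2}):(\d{2})(?:\.(\d{1,9}))?")


def parse_time(text):
    """Parse 'HH:MM:SS[.fraction]' into nanoseconds since midnight."""
    match = _TIME_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse time: {text!r}")
    hour, minute, second = (int(match.group(i)) for i in (1, 2, 3))
    if hour > 23 or minute > 59 or second > 59:
        raise ValueError(f"time field out of range: {text!r}")
    # Right-pad the fraction to 9 digits so '.1234567' means 123456700 ns.
    fraction = (match.group(4) or "").ljust(9, "0")
    return ((hour * 60 + minute) * 60 + second) * 10**9 + int(fraction)


def format_time(nanos, p):
    """Format nanoseconds since midnight with exactly p fractional digits."""
    seconds, fraction = divmod(nanos, 10**9)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    base = f"{hour:02d}:{minute:02d}:{second:02d}"
    if p == 0:
        return base
    return f"{base}.{fraction:09d}"[: len(base) + 1 + p]


nanos = parse_time("12:34:56.123456789")
print(format_time(nanos, 7), format_time(nanos, 9))
```

A test for the 7..9 range could assert that parsing the formatted text at precision 9 returns the original value, and that formatting at precision 7 or 8 drops only the trailing digits.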