Max Gekk created SPARK-57163:
--------------------------------
Summary: Map TIMESTAMP_LTZ(6) and TIMESTAMP_NTZ(6) to
TimestampType and TimestampNTZType
Key: SPARK-57163
URL: https://issues.apache.org/jira/browse/SPARK-57163
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
h2. What
Map the microsecond fractional precision {{6}} of the parameterized timestamp
spellings to the existing GA timestamp types:
* {{TIMESTAMP_NTZ(6)}} -> {{TimestampNTZType}}
* {{TIMESTAMP_LTZ(6)}} -> {{TimestampType}}
* {{TIMESTAMP(6) WITHOUT TIME ZONE}} -> {{TimestampNTZType}}
* {{TIMESTAMP(6) WITH LOCAL TIME ZONE}} -> {{TimestampType}}
* {{TIMESTAMP(6)}} -> the session default type ({{spark.sql.timestampType}})
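The mapping above, together with the existing [7, 9] range, can be sketched as a small decision function. This is only an illustrative model and not Spark code: {{TsType}}, {{MicrosNTZ}}, {{MicrosLTZ}}, {{NanosNTZ}}, {{NanosLTZ}} and {{PrecisionMapping.resolve}} are hypothetical stand-ins, and it assumes the nanosecond types carry their precision as a parameter.

{code:scala}
// Stand-in model of the intended resolution; these are NOT Spark's classes.
sealed trait TsType
case object MicrosNTZ extends TsType              // stands in for TimestampNTZType
case object MicrosLTZ extends TsType              // stands in for TimestampType
final case class NanosNTZ(p: Int) extends TsType  // stands in for TimestampNTZNanosType (assumed parameterized)
final case class NanosLTZ(p: Int) extends TsType  // stands in for TimestampLTZNanosType (assumed parameterized)

object PrecisionMapping {
  // Right = resolved type; Left = message standing in for INVALID_TIMESTAMP_PRECISION.
  def resolve(localTimeZone: Boolean, p: Int): Either[String, TsType] = p match {
    // p = 6 is microseconds, which the existing GA types already model.
    case 6 => Right(if (localTimeZone) MicrosLTZ else MicrosNTZ)
    // p in [7, 9] keeps resolving to the nanosecond-capable types.
    case n if n >= 7 && n <= 9 => Right(if (localTimeZone) NanosLTZ(n) else NanosNTZ(n))
    // Everything else, including p in [0, 5], stays rejected.
    case n => Left(s"INVALID_TIMESTAMP_PRECISION: $n")
  }
}
{code}

The bare {{TIMESTAMP(6)}} spelling is not modeled here; it would first pick LTZ or NTZ from {{spark.sql.timestampType}} and then follow the same branch.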
h2. Why
This is a sub-task of SPARK-56822 (SPIP: Timestamps with nanosecond precision).
The SPIP introduces nanosecond-capable types {{TimestampNTZNanosType}} and
{{TimestampLTZNanosType}} for fractional precision {{p}} in [7, 9]. Microsecond
precision ({{p}} = 6) is exactly what the existing {{TimestampType}} and
{{TimestampNTZType}} already model. Today, however, {{TIMESTAMP_NTZ(6)}} /
{{TIMESTAMP_LTZ(6)}} are rejected with {{INVALID_TIMESTAMP_PRECISION}}, which is
surprising: an explicit {{(6)}} should be accepted and resolve to the equivalent
microsecond type, giving users a consistent precision model where {{p}} = 6
means microseconds.
h2. Current behavior
Parsing {{TIMESTAMP_NTZ(6)}} or {{TIMESTAMP_LTZ(6)}} throws:
{code}
[INVALID_TIMESTAMP_PRECISION] ... precision 6 ...
{code}
because {{TimestampNTZNanosType}} / {{TimestampLTZNanosType}} only allow
precision in [7, 9].
h2. Scope / where to change
Two parsing surfaces route the precision and both must map 6 to the
microsecond types:
# {{sql/api/.../catalyst/parser/DataTypeAstBuilder.scala}} - the SQL DDL parser.
The methods to change are {{parseTimestampLtzNanosPrecision}} and
{{parseTimestampNtzNanosPrecision}}, which also back the bare and zoned
{{TIMESTAMP(p)}} cases.
# {{sql/api/.../sql/types/DataType.scala}} - {{nameToType}}, the
{{typeName}}/JSON-string parser ({{TIMESTAMP_LTZ_NANOS_TYPE}} and
{{TIMESTAMP_NTZ_NANOS_TYPE}} branches), so that string round-trips such as
{{timestamp_ntz(6)}} resolve consistently.
h2. Out of scope
* Precision {{p}} in [0, 5]. These imply rounding/truncation semantics that are
not modeled yet and are left for a separate follow-up. Keep this task to
{{p}} = 6 only.
* Any change to the nanosecond-capable types ({{p}} in [7, 9]).
h2. Open design decisions (please confirm with reviewers)
* *Preview-flag gating*: nanos parsing is gated behind
{{spark.sql.timestampNanosTypes.enabled}}. Since {{p}} = 6 resolves to a GA
type, it should arguably be accepted regardless of the flag. Decide and
document whether {{TIMESTAMP_*(6)}} requires the preview flag (preference:
do NOT require it).
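To make the stated preference concrete, here is a minimal sketch of the "do NOT require the flag" option for the NTZ spelling. It is a model only: {{FlagGatingSketch.resolveNtz}} is a hypothetical helper, the {{nanosEnabled}} parameter stands in for {{spark.sql.timestampNanosTypes.enabled}}, type names are returned as plain strings, and the error text for a disabled flag is a placeholder rather than a real Spark error class.

{code:scala}
object FlagGatingSketch {
  // Hypothetical helper: Right = name of the resolved type, Left = placeholder error text.
  def resolveNtz(p: Int, nanosEnabled: Boolean): Either[String, String] =
    if (p == 6) {
      // Resolves to a GA type, so the preview flag is never consulted.
      Right("TimestampNTZType")
    } else if (p >= 7 && p <= 9) {
      // Nanosecond-capable types remain gated behind the preview flag.
      if (nanosEnabled) Right(s"TimestampNTZNanosType($p)")
      else Left("nanosecond timestamp types are disabled") // placeholder message
    } else {
      Left(s"INVALID_TIMESTAMP_PRECISION: $p")
    }
}
{code}

Under the alternative decision (flag required), the {{p == 6}} branch would also check {{nanosEnabled}}. Whichever option reviewers pick should be applied to both parsing surfaces.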
h2. Acceptance criteria
* {{CatalystSqlParser.parseDataType("TIMESTAMP_NTZ(6)")}} returns
{{TimestampNTZType}}; {{"TIMESTAMP_LTZ(6)"}} returns {{TimestampType}}.
* The zoned spellings {{TIMESTAMP(6) WITHOUT TIME ZONE}} /
{{TIMESTAMP(6) WITH LOCAL TIME ZONE}} and bare {{TIMESTAMP(6)}} resolve as
listed under "What".
* {{DataType.fromDDL}} / {{typeName}} round-trip is consistent for these
spellings.
* Existing assertions in {{DataTypeParserSuite}} that expect
{{INVALID_TIMESTAMP_PRECISION}} for precision 6 are updated to expect the
microsecond types; {{p}} = 10, and every other value outside [7, 9] except 6,
still throws {{INVALID_TIMESTAMP_PRECISION}}.
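The criteria above could be expressed as plain assertions, as in the sketch below. It describes the intended behavior after the change and is expected to fail on a build without it. It needs Spark's {{sql/api}} and {{sql/catalyst}} classes on the classpath, and the object name {{Precision6AcceptanceSketch}} is made up for illustration. The bare {{TIMESTAMP(6)}} case is left out because its result depends on {{spark.sql.timestampType}}, and the failure check uses {{Try}} so that no particular exception class is assumed.

{code:scala}
import scala.util.Try

import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types.{DataType, TimestampNTZType, TimestampType}

object Precision6AcceptanceSketch {
  def main(args: Array[String]): Unit = {
    // Underscore spellings resolve to the existing microsecond types.
    assert(CatalystSqlParser.parseDataType("TIMESTAMP_NTZ(6)") == TimestampNTZType)
    assert(CatalystSqlParser.parseDataType("TIMESTAMP_LTZ(6)") == TimestampType)

    // Zoned spellings follow the same mapping.
    assert(CatalystSqlParser.parseDataType("TIMESTAMP(6) WITHOUT TIME ZONE") == TimestampNTZType)
    assert(CatalystSqlParser.parseDataType("TIMESTAMP(6) WITH LOCAL TIME ZONE") == TimestampType)

    // DDL-string entry point resolves consistently with the parser.
    assert(DataType.fromDDL("TIMESTAMP_NTZ(6)") == TimestampNTZType)

    // Precisions outside [7, 9] other than 6 are still rejected.
    assert(Try(CatalystSqlParser.parseDataType("TIMESTAMP_NTZ(10)")).isFailure)
  }
}
{code}

In the actual patch these checks would live in {{DataTypeParserSuite}} and use the suite's existing helpers instead of a {{main}} method.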
h2. Tests
* {{sql/catalyst/.../parser/DataTypeParserSuite.scala}} - update the
"TIMESTAMP(6) WITHOUT TIME ZONE" / "timestamp(6)" cases (currently asserting
the error) and add positive cases for {{TIMESTAMP_NTZ(6)}} /
{{TIMESTAMP_LTZ(6)}}.
* Add a JSON/typeName round-trip case (e.g. {{timestamp_ntz(6)}}) in the
relevant {{DataTypeSuite}}.
h2. Notes for first-time contributors
This is a good-first-issue. Build/test a single module with SBT:
{code}
build/sbt 'catalyst/testOnly *DataTypeParserSuite'
{code}