[ https://issues.apache.org/jira/browse/SPARK-57207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk resolved SPARK-57207.
------------------------------
Fix Version/s: 4.3.0
Resolution: Fixed
Issue resolved by pull request 56266
[https://github.com/apache/spark/pull/56266]
> Register nanosecond timestamp types in the Types Framework via TypeOps overrides
> --------------------------------------------------------------------------------
>
> Key: SPARK-57207
> URL: https://issues.apache.org/jira/browse/SPARK-57207
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Assignee: Max Gekk
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.3.0
>
>
> ### Summary
> Register TimestampNTZNanosType(p) and TimestampLTZNanosType(p) (p in [7, 9])
> in the Spark SQL Types Framework (SPARK-53504) by adding TypeOps
> (server-side, catalyst) and TypeApiOps (client-side, sql/api)
> implementations. The logical types and the physical row layer already exist
> (SPARK-56876, SPARK-56981); this issue centralizes the wiring behind TypeOps
> when spark.sql.types.framework.enabled is true.
> This is split out of SPARK-57101 / PR #56199 so that the timestamp-nano type
> registration is reviewed independently of the abstract Types Framework
> method additions. It deliberately only *overrides existing* TypeOps /
> TypeApiOps methods; no new framework methods are introduced.
> ### What changes
> Add TypeOps implementations (sql/catalyst):
> - TimestampNanosTypeOps shared trait with TimestampNTZNanosTypeOps /
>   TimestampLTZNanosTypeOps, following the TimeTypeOps pattern.
> - Register both in TypeOps.apply() alongside TimeType.
> - Overridden methods (all already on TypeOps): getPhysicalType, getJavaClass,
>   getRowWriter, getDefaultLiteral, getJavaLiteral, getMutableValue,
>   toCatalystImpl, toScala, toScalaImpl.
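
The shape described in the list above (one shared trait, two thin implementations, registration in an apply factory) can be sketched roughly as follows. This is a self-contained illustration, not the code from pull request 56266: the DataType hierarchy, the two TypeOps members, and the returned strings (including the name "PhysicalTimestampNanos") are simplified, hypothetical stand-ins chosen only to show the pattern.

```scala
object TypeOpsRegistrationSketch {
  // Hypothetical, simplified stand-ins for Spark's logical types.
  sealed trait DataType
  case object IntegerType extends DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType {
    require(precision >= 7 && precision <= 9, s"precision must be in [7, 9]: $precision")
  }
  final case class TimestampLTZNanosType(precision: Int) extends DataType {
    require(precision >= 7 && precision <= 9, s"precision must be in [7, 9]: $precision")
  }

  // Hypothetical slice of the TypeOps interface; the real one has more methods.
  trait TypeOps {
    def dataType: DataType
    def physicalTypeName: String
    def javaClassName: String
  }

  // Shared behaviour lives in one trait; NTZ and LTZ differ only in dataType.
  trait TimestampNanosTypeOps extends TypeOps {
    override def physicalTypeName: String = "PhysicalTimestampNanos" // assumed name
    override def javaClassName: String = "TimestampNanosVal"
  }
  final case class TimestampNTZNanosTypeOps(dataType: TimestampNTZNanosType)
    extends TimestampNanosTypeOps
  final case class TimestampLTZNanosTypeOps(dataType: TimestampLTZNanosType)
    extends TimestampNanosTypeOps

  // Registration: the factory returns Some(ops) only for types it knows about.
  object TypeOps {
    def apply(dt: DataType): Option[TypeOps] = dt match {
      case t: TimestampNTZNanosType => Some(TimestampNTZNanosTypeOps(t))
      case t: TimestampLTZNanosType => Some(TimestampLTZNanosTypeOps(t))
      case _ => None
    }
  }
}
```

In this sketch an unregistered type such as IntegerType yields None, which is what lets callers fall through to their existing code path.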
> Add TypeApiOps stubs (sql/api):
> - TimestampNanosTypeApiOps base with TimestampNTZNanosTypeApiOps /
>   TimestampLTZNanosTypeApiOps, registered in TypeApiOps.apply().
> - format / toSQLValue: interim implementation (TimestampNanosVal.toString
>   with NTZ/LTZ prefix) until dedicated fractional-second formatters land.
> - getEncoder: reports the type as unsupported
>   (UNSUPPORTED_DATA_TYPE_FOR_ENCODER), matching the legacy RowEncoder
>   fallback; encoders are out of scope (SPARK-57033).
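
A minimal sketch of what such client-side stubs might look like is given below. Everything in it is an assumption made for illustration: NanosVal is a hypothetical stand-in for TimestampNanosVal, the exception class is invented, and the literal prefixes TIMESTAMP_NTZ / TIMESTAMP_LTZ and the type-name strings are guesses, since the issue only says "NTZ/LTZ prefix".

```scala
object TypeApiOpsSketch {
  // Hypothetical stand-in for TimestampNanosVal: it only carries pre-rendered text.
  final case class NanosVal(text: String) {
    override def toString: String = text
  }

  // Hypothetical exception that carries the error condition named in the issue.
  final class UnsupportedEncoderException(typeName: String)
    extends RuntimeException(s"[UNSUPPORTED_DATA_TYPE_FOR_ENCODER] $typeName")

  // Shared base: formatting defers to toString until dedicated formatters exist.
  sealed abstract class TimestampNanosTypeApiOps(prefix: String, typeName: String) {
    def format(v: NanosVal): String = v.toString
    def toSQLValue(v: NanosVal): String = s"$prefix '${format(v)}'"
    // Encoders are out of scope, so the stub always reports the type as unsupported.
    def getEncoder: Nothing = throw new UnsupportedEncoderException(typeName)
  }

  // Assumed literal prefixes and type names (not confirmed by the issue text).
  final class TimestampNTZNanosTypeApiOps(p: Int)
    extends TimestampNanosTypeApiOps("TIMESTAMP_NTZ", s"TIMESTAMP_NTZ($p)")
  final class TimestampLTZNanosTypeApiOps(p: Int)
    extends TimestampNanosTypeApiOps("TIMESTAMP_LTZ", s"TIMESTAMP_LTZ($p)")
}
```

Keeping format and toSQLValue in the shared base means a later switch to real fractional-second formatters would touch one place rather than both subclasses.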
> Mutable holder:
> - Add MutableTimestampNanos to SpecificInternalRow to avoid the MutableAny
>   fallback.
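
To illustrate the difference between a dedicated holder and the generic fallback, here is a rough, self-contained sketch. It is not the code from the pull request: MutableValue is reduced to a few members, the NanosVal layout (microseconds plus a sub-microsecond remainder) is a hypothetical stand-in for TimestampNanosVal, and holderFor is an invented helper.

```scala
object MutableHolderSketch {
  // Hypothetical value layout; the real TimestampNanosVal may differ.
  final case class NanosVal(epochMicros: Long, nanosWithinMicro: Short)

  // Simplified holder contract for one mutable column slot.
  abstract class MutableValue {
    var isNull: Boolean = true
    def boxed: Any
    def update(v: Any): Unit
    def copy(): MutableValue
  }

  // Generic fallback: stores the value untyped, so readers must cast.
  final class MutableAny extends MutableValue {
    var value: Any = null
    override def boxed: Any = if (isNull) null else value
    override def update(v: Any): Unit = { isNull = false; value = v }
    override def copy(): MutableAny = {
      val c = new MutableAny
      c.isNull = isNull
      c.value = value
      c
    }
  }

  // Dedicated holder: the slot is typed, so reads need no cast.
  final class MutableTimestampNanos extends MutableValue {
    var value: NanosVal = null
    override def boxed: Any = if (isNull) null else value
    override def update(v: Any): Unit = { isNull = false; value = v.asInstanceOf[NanosVal] }
    override def copy(): MutableTimestampNanos = {
      val c = new MutableTimestampNanos
      c.isNull = isNull
      c.value = value
      c
    }
  }

  // Invented helper: pick the dedicated holder when one exists, else fall back.
  def holderFor(isTimestampNanos: Boolean): MutableValue =
    if (isTimestampNanos) new MutableTimestampNanos else new MutableAny
}
```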
> Feature flag:
> - All registration is gated by spark.sql.types.framework.enabled (same as
>   TimeType).
> - When the flag is false, behavior remains identical to the current legacy
>   paths.
> ### Integration points (automatic when TypeOps returns Some)
> These call sites already delegate to TypeOps(dt).map(...).getOrElse(legacy);
> no per-call-site edits are required beyond registration:
> PhysicalDataType.apply, Literal.default, InternalRow.getWriter / getAccessor,
> CodeGenerator Java class for codegen, and SpecificInternalRow mutable column
> values.
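
The delegation idiom quoted above, combined with the feature flag, can be sketched as follows. This is only an illustration under stated assumptions: the types are simplified stand-ins, the flag is modelled as a plain var rather than a real SQL configuration lookup, and javaClassName / legacyJavaClassName are hypothetical names for one such call site and its legacy path.

```scala
object DelegationSketch {
  // Hypothetical, simplified stand-ins for Spark's logical types.
  sealed trait DataType
  case object IntegerType extends DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType

  trait TypeOps {
    def javaClassName: String
  }

  // Stand-in for spark.sql.types.framework.enabled; a plain var for the sketch.
  var frameworkEnabled: Boolean = true

  object TypeOps {
    // Returns Some only when the flag is on and the type is registered.
    def apply(dt: DataType): Option[TypeOps] = dt match {
      case _: TimestampNTZNanosType if frameworkEnabled =>
        Some(new TypeOps { def javaClassName: String = "TimestampNanosVal" })
      case _ => None
    }
  }

  // Hypothetical legacy path, used whenever TypeOps returns None.
  private def legacyJavaClassName(dt: DataType): String = dt match {
    case IntegerType => "int"
    case _: TimestampNTZNanosType => "TimestampNanosVal"
  }

  // A call site written once: framework first, legacy as the fallback.
  def javaClassName(dt: DataType): String =
    TypeOps(dt).map(_.javaClassName).getOrElse(legacyJavaClassName(dt))
}
```

Because the call site already has the getOrElse fallback, registering a new type changes which branch runs without changing the call site itself, and turning the flag off routes everything back through the legacy branch.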
> ### Tests
> New TimestampNanosTypeOpsSuite, for p in {7, 8, 9} over NTZ and LTZ:
> - TypeOps / TypeApiOps are registered when the framework is enabled.
> - PhysicalDataType, Literal.default value, and codegen Java class are correct.
> - InternalRow and SpecificInternalRow set/read roundtrips.
> - SpecificInternalRow uses the dedicated MutableTimestampNanos holder.
> - getEncoder reports UNSUPPORTED_DATA_TYPE_FOR_ENCODER.
> - toSQLValue uses the NTZ/LTZ literal prefix.
> - Framework-disabled fallback produces identical results.
> ### Out of scope
> - New abstract Types Framework methods for codegen (kept in SPARK-57101 /
>   PR #56199).
> - CatalystTypeConverters / java.time roundtrip (SPARK-57033), encoders,
>   Connect proto, Arrow, PySpark conversion, cast/Parquet/ColumnVector, and
>   physical ordering/compare/hash.
> ### Depends on
> - SPARK-56981 (physical row layer and TimestampNanosVal)
> ### References
> - SPARK-56822 - parent SPIP
> - SPARK-53504 - Types Framework
> - Precedent: org.apache.spark.sql.catalyst.types.ops.TimeTypeOps