[PR] [SPARK-57101][SQL] Register nanosecond timestamp types in the Types Framework (server-side) [spark]

via GitHub Fri, 29 May 2026 02:36:19 -0700


MaxGekk opened a new pull request, #56199:
URL: https://github.com/apache/spark/pull/56199


   ### What changes were proposed in this pull request?
   
   This PR registers `TimestampNTZNanosType(p)` and `TimestampLTZNanosType(p)` 
(p in [7, 9]) in the Spark SQL Types Framework (SPARK-53504) for server-side 
(catalyst) operations, following the `TimeTypeOps` / `TimeTypeApiOps` reference 
implementation.
   
   Concretely:
   - Adds `TimestampNanosTypeApiOps` (sql/api) with concrete 
`TimestampNTZNanosTypeApiOps` / `TimestampLTZNanosTypeApiOps`. Implements 
interim `format` / `toSQLValue` over `TimestampNanosVal`; `getEncoder` throws 
`UNSUPPORTED_DATA_TYPE_FOR_ENCODER` to preserve current `RowEncoder` behavior 
(encoders are out of scope, see SPARK-57033).
   - Adds `TimestampNanosTypeOps` (sql/catalyst) with concrete 
`TimestampNTZNanosTypeOps` / `TimestampLTZNanosTypeOps`. Provides the physical 
type (`PhysicalTimestampNTZNanosType` / `PhysicalTimestampLTZNanosType`), 
`getJavaClass = classOf[TimestampNanosVal]`, `getDefaultLiteral = 
Literal.create(TimestampNanosVal.ZERO, t)`, the row writer 
(`setTimestampNTZNanos` / `setTimestampLTZNanos`), `getMutableValue`, and 
identity external conversions (java.time conversion is out of scope, 
SPARK-57033).
   - Adds a dedicated `MutableTimestampNanos` holder in `SpecificInternalRow` 
so nanos columns avoid the `MutableAny` fallback.
   - Registers both types at the single registration points: `TypeOps.apply()` 
(catalyst) and `TypeApiOps.apply()` (sql/api). All integration points 
(`PhysicalDataType.apply`, `Literal.default`, `InternalRow.getWriter`, 
`CodeGenerator.javaClass`, `EncoderUtils.dataTypeJavaClass`, 
`SpecificInternalRow`, `CatalystTypeConverters`) already delegate to these 
factories, so no per-call-site edits are required.
   
   Class hierarchy (mirrors `TimeTypeOps extends TimeTypeApiOps with TypeOps`):
   
   ```
   TimestampNTZNanosTypeOps extends TimestampNTZNanosTypeApiOps with 
TimestampNanosTypeOps (-> TypeOps)
   TimestampLTZNanosTypeOps extends TimestampLTZNanosTypeApiOps with 
TimestampNanosTypeOps (-> TypeOps)
   ```
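
The mixin shape above can be sketched in isolation. This is a minimal sketch with stand-in definitions, not the real Spark classes: the member names `label` and `javaClassName`, the placeholder label strings, and the wrapper object `HierarchySketch` are all assumptions made for illustration. Only the class and trait names come from this PR.

```scala
object HierarchySketch {
  // Hypothetical stand-ins for the logical types; NOT the real Spark classes.
  sealed trait DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType
  final case class TimestampLTZNanosType(precision: Int) extends DataType

  // sql/api side (assumed shape): naming and formatting operations.
  trait TypeApiOps {
    def dataType: DataType
    def label: String // placeholder string, not Spark's actual SQL-literal prefix
  }

  // sql/catalyst side (assumed shape): server-side operations.
  trait TypeOps {
    def javaClassName: String
  }

  // Behavior shared by the NTZ and LTZ variants on the catalyst side.
  trait TimestampNanosTypeOps extends TypeOps {
    override def javaClassName: String = "TimestampNanosVal"
  }

  class TimestampNTZNanosTypeApiOps(t: TimestampNTZNanosType) extends TypeApiOps {
    override def dataType: DataType = t
    override def label: String = s"ntz-nanos(${t.precision})"
  }

  class TimestampLTZNanosTypeApiOps(t: TimestampLTZNanosType) extends TypeApiOps {
    override def dataType: DataType = t
    override def label: String = s"ltz-nanos(${t.precision})"
  }

  // Concrete ops: the API ops class is the superclass, the shared catalyst trait is mixed in.
  class TimestampNTZNanosTypeOps(t: TimestampNTZNanosType)
    extends TimestampNTZNanosTypeApiOps(t) with TimestampNanosTypeOps

  class TimestampLTZNanosTypeOps(t: TimestampLTZNanosType)
    extends TimestampLTZNanosTypeApiOps(t) with TimestampNanosTypeOps

  def main(args: Array[String]): Unit = {
    val ops = new TimestampNTZNanosTypeOps(TimestampNTZNanosType(9))
    println(ops.label)
    println(ops.javaClassName)
  }
}
```

In this sketch a single instance satisfies both the api-side and the catalyst-side interface, so one object could be handed to either factory.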
   
   All registration is gated by `spark.sql.types.framework.enabled`. When the 
flag is false, the existing legacy paths are used and behavior is unchanged.
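
One way to picture the gating is a factory that returns nothing when the flag is off, so callers fall through to the legacy match. This is a rough sketch under assumptions: the `Option`-returning `apply`, the explicit `frameworkEnabled` parameter, `NanosOps`, and `legacyJavaClassName` are hypothetical and do not reflect the actual signatures in Spark.

```scala
object RegistrationSketch {
  // Hypothetical stand-ins for the logical types; NOT the real Spark classes.
  sealed trait DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType
  final case class TimestampLTZNanosType(precision: Int) extends DataType
  case object IntegerType extends DataType

  trait TypeOps {
    def javaClassName: String
  }

  // Hypothetical ops implementation shared by both nanos types in this sketch.
  final class NanosOps extends TypeOps {
    override def javaClassName: String = "TimestampNanosVal"
  }

  object TypeOps {
    // Single registration point (assumed shape): returns None when the flag is off
    // or when the type is not registered with the framework.
    def apply(dt: DataType, frameworkEnabled: Boolean): Option[TypeOps] =
      if (!frameworkEnabled) None
      else dt match {
        case _: TimestampNTZNanosType | _: TimestampLTZNanosType => Some(new NanosOps)
        case _ => None
      }
  }

  // Stand-in for the legacy per-call-site pattern matching.
  def legacyJavaClassName(dt: DataType): String = dt match {
    case _: TimestampNTZNanosType | _: TimestampLTZNanosType => "TimestampNanosVal"
    case IntegerType => "int"
  }

  // An integration point: try the framework first, otherwise use the legacy path.
  def javaClassName(dt: DataType, frameworkEnabled: Boolean): String =
    TypeOps(dt, frameworkEnabled).map(_.javaClassName).getOrElse(legacyJavaClassName(dt))
}
```

With this shape, both flag settings produce the same answer for a nanos type, which is the kind of equivalence the framework-off test in this PR checks.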
   
   Out of scope (follow-ups): java.time conversion and `CatalystTypeConverters` 
roundtrip (SPARK-57033), Dataset encoders, Connect proto, Arrow mapping, 
PySpark, cast matrix, Parquet, and physical ordering/compare/hash.
   
   ### Why are the changes needed?
   
   Part of SPARK-56822 (Timestamps with nanosecond precision). The logical 
types and physical row layer already exist (SPARK-56876, SPARK-56981), but the 
nanos types were wired only through legacy dispatch in `PhysicalDataType`, 
`Literal`, `InternalRow`, and codegen. This change centralizes that wiring 
behind the Types Framework, consistent with how `TimeType` is handled, reducing 
scattered pattern matching.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The types are internal/unstable and the framework path is gated by an 
internal feature flag; with the flag off the behavior is unchanged.
   
   ### How was this patch tested?
   
   - Added `TimestampNanosTypeOpsSuite` covering, for p in {7, 8, 9} and both 
NTZ and LTZ: framework registration, physical type, default literal, codegen 
Java class, `GenericInternalRow` / `SpecificInternalRow` roundtrips, the 
dedicated `MutableTimestampNanos` holder, `getEncoder` parity, SQL-literal 
prefixes, and framework-off equivalence.
   - Ran related catalyst suites (all passing): `TimestampNanosTypeOpsSuite`, 
`TimestampNanosRowSuite`, `TimestampNanosRowValuesSuite`, 
`LiteralExpressionSuite`, `CatalystTypeConvertersSuite`, 
`GenerateUnsafeProjectionSuite`, `DataTypeSuite`, `TypeUtilsSuite`, 
`DataTypeParserSuite`, `RowEncoderSuite`, `ExpressionEncoderSuite`, 
`RowJsonSuite`, `ToPrettyStringSuite`.
   - `dev/scalastyle` passes.
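
The test matrix described above (p in {7, 8, 9}, NTZ and LTZ) can be enumerated as in the sketch below. It uses stand-in definitions rather than the real suite: `Zone`, `NanosType`, `allCases`, and the `require` range check are assumptions for illustration, and the actual `TimestampNanosTypeOpsSuite` may be organized differently.

```scala
object TestMatrixSketch {
  // Hypothetical stand-ins; NOT the real Spark types or the real test suite.
  sealed trait Zone
  case object NTZ extends Zone
  case object LTZ extends Zone

  final case class NanosType(zone: Zone, precision: Int) {
    // Assumed validation: the PR only states that p ranges over [7, 9].
    require(precision >= 7 && precision <= 9, s"precision must be in [7, 9], got $precision")
  }

  // Every (zone, precision) combination a suite would iterate over.
  val allCases: Seq[NanosType] =
    for {
      zone <- Seq[Zone](NTZ, LTZ)
      p <- 7 to 9
    } yield NanosType(zone, p)

  def main(args: Array[String]): Unit =
    allCases.foreach(t => println(s"would run framework checks for $t"))
}
```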
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Cursor (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


