MaxGekk opened a new pull request, #56266: URL: https://github.com/apache/spark/pull/56266
### What changes were proposed in this pull request?

This PR registers `TimestampNTZNanosType(p)` and `TimestampLTZNanosType(p)` (p in [7, 9]) in the Spark SQL Types Framework (SPARK-53504), gated by `spark.sql.types.framework.enabled`.

It is split out of PR #56199 (SPARK-57101) per [review feedback](https://github.com/apache/spark/pull/56199#discussion_r3333031096), so that the timestamp-nano type registration is reviewed independently of the abstract Types Framework method additions. This PR deliberately only **overrides existing** `TypeOps` / `TypeApiOps` methods; it introduces **no new framework methods**. Concretely:

- Add `TimestampNanosTypeOps` (catalyst) with `TimestampNTZNanosTypeOps` / `TimestampLTZNanosTypeOps`, registered in `TypeOps.apply()` next to `TimeType`. Overrides: `getPhysicalType`, `getJavaClass`, `getRowWriter`, `getDefaultLiteral`, `getJavaLiteral`, `getMutableValue`, `toCatalystImpl`, `toScala`, `toScalaImpl`.
- Add `TimestampNanosTypeApiOps` (sql/api) with NTZ/LTZ subclasses, registered in `TypeApiOps.apply()`. `format` / `toSQLValue` are interim (based on `TimestampNanosVal.toString` with a `TIMESTAMP_NTZ` / `TIMESTAMP_LTZ` prefix); `getEncoder` reports the type as unsupported, matching the legacy `RowEncoder` fallback.
- Add `MutableTimestampNanos` to `SpecificInternalRow` to avoid the `MutableAny` fallback.

The existing call sites (`PhysicalDataType.apply`, `Literal.default`, `InternalRow.getWriter`/`getAccessor`, codegen Java class selection, `SpecificInternalRow` mutable columns) already delegate to `TypeOps(dt).map(...).getOrElse(legacy)`, so no per-call-site edits are needed beyond registration.

Out of scope (follow-ups): encoders and `java.time` roundtrip (SPARK-57033), Connect proto, Arrow, PySpark conversion, cast/Parquet/ColumnVector, and physical ordering/compare/hash.

### Why are the changes needed?

The logical nanosecond timestamp types (SPARK-56876) and the physical row layer (SPARK-56981) already exist, but these types are currently wired only through scattered legacy dispatch. Registering them in the Types Framework centralizes the type-specific operations behind `TypeOps`, consistent with `TimeType`, and is a prerequisite for the remaining nanosecond timestamp work.

### Does this PR introduce _any_ user-facing change?

No. All registration is gated by the internal flag `spark.sql.types.framework.enabled`. When the flag is `false`, behavior is identical to the existing legacy paths.

### How was this patch tested?

Added `TimestampNanosTypeOpsSuite`, covering NTZ and LTZ for p in {7, 8, 9}:

- `TypeOps` / `TypeApiOps` registration when the framework is enabled.
- `PhysicalDataType`, `Literal.default` value, and codegen Java class.
- `InternalRow` and `SpecificInternalRow` set/read roundtrips, including the dedicated `MutableTimestampNanos` holder.
- `getEncoder` reports `UNSUPPORTED_DATA_TYPE_FOR_ENCODER`.
- `toSQLValue` uses the NTZ/LTZ literal prefix.
- Framework-disabled fallback produces identical results.

```
build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.types.ops.TimestampNanosTypeOpsSuite'
```

All 7 tests pass. `catalyst`/`sql-api` scalastyle are clean.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor
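
### Illustrative sketch of the dispatch pattern

For readers unfamiliar with the `TypeOps(dt).map(...).getOrElse(legacy)` delegation mentioned under "What changes were proposed", the following is a minimal, self-contained sketch of that shape. It is not the code in this PR: every definition below is a simplified, hypothetical stand-in written for illustration, the `frameworkEnabled` parameter stands in for reading `spark.sql.types.framework.enabled`, and the string `"TimestampNanosVal"` is only a placeholder for whatever Java class the real implementation selects.

```scala
// Hypothetical, simplified model. None of these definitions are Spark's real classes.
sealed trait DataType
case object LongType extends DataType

final case class TimestampNTZNanosType(precision: Int) extends DataType {
  require(precision >= 7 && precision <= 9, s"precision must be in [7, 9], got $precision")
}

final case class TimestampLTZNanosType(precision: Int) extends DataType {
  require(precision >= 7 && precision <= 9, s"precision must be in [7, 9], got $precision")
}

// Framework-side operations for one type (a single method, for brevity).
trait TypeOps {
  def javaClassName: String
}

object TypeOps {
  // Returns Some(ops) only when the framework is enabled AND the type is registered.
  // `frameworkEnabled` is an assumption standing in for the internal config flag.
  def apply(dt: DataType, frameworkEnabled: Boolean): Option[TypeOps] =
    if (!frameworkEnabled) None
    else dt match {
      case _: TimestampNTZNanosType | _: TimestampLTZNanosType =>
        Some(new TypeOps {
          override def javaClassName: String = "TimestampNanosVal" // placeholder value
        })
      case _ => None
    }
}

object Dispatch {
  // Legacy path: used when the flag is off or the type is not registered.
  private def legacyJavaClassName(dt: DataType): String = dt match {
    case LongType => "long"
    case _: TimestampNTZNanosType | _: TimestampLTZNanosType => "TimestampNanosVal"
  }

  // The call site never changes: it asks the framework first, then falls back.
  def javaClassName(dt: DataType, frameworkEnabled: Boolean): String =
    TypeOps(dt, frameworkEnabled).map(_.javaClassName).getOrElse(legacyJavaClassName(dt))
}
```

In this sketch, registering a type only touches `TypeOps.apply`; `Dispatch.javaClassName` stays as it is, and the enabled and disabled paths return the same value for the nanosecond timestamp types.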
