Max Gekk created SPARK-57207:
--------------------------------

             Summary: Register nanosecond timestamp types in the Types Framework via TypeOps overrides
                 Key: SPARK-57207
                 URL: https://issues.apache.org/jira/browse/SPARK-57207
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk
            Assignee: Max Gekk


### Summary

Register TimestampNTZNanosType(p) and TimestampLTZNanosType(p) (p in [7, 9]) in the
Spark SQL Types Framework (SPARK-53504) by adding TypeOps (server-side, catalyst) and
TypeApiOps (client-side, sql/api) implementations. The logical types and the physical row
layer already exist (SPARK-56876, SPARK-56981); this issue centralizes the wiring behind
TypeOps when spark.sql.types.framework.enabled is true.

This is split out of SPARK-57101 / PR #56199 so that the timestamp-nano type registration is
reviewed independently of the abstract Types Framework method additions. It deliberately only
*overrides existing* TypeOps / TypeApiOps methods; no new framework methods are introduced.

### What changes

Add TypeOps implementations (sql/catalyst):
- TimestampNanosTypeOps shared trait with TimestampNTZNanosTypeOps / TimestampLTZNanosTypeOps,
  following the TimeTypeOps pattern.
- Register both in TypeOps.apply() alongside TimeType.
- Overridden methods (all already on TypeOps): getPhysicalType, getJavaClass, getRowWriter,
  getDefaultLiteral, getJavaLiteral, getMutableValue, toCatalystImpl, toScala, toScalaImpl.
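
The shared-trait layout and the registration in TypeOps.apply() can be sketched roughly as
follows. This is a minimal, self-contained illustration, not the actual Spark code: the
DataType hierarchy, the reduced TypeOps trait, the String return types, the
"PhysicalTimestampNanos" name, and the frameworkEnabled parameter are simplified stand-ins
assumed for the example. Only the class and method names come from the issue text.

```scala
// Simplified stand-ins for illustration only; the real Spark classes and
// signatures differ.
object TimestampNanosOpsSketch {
  sealed trait DataType
  case object IntegerType extends DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType
  final case class TimestampLTZNanosType(precision: Int) extends DataType

  // A reduced ops interface: two of the overridden methods named in the issue,
  // with String results assumed here to keep the sketch dependency-free.
  trait TypeOps {
    def dataType: DataType
    def getPhysicalType: String
    def getJavaClass: String
  }

  // Behaviour shared by the NTZ and LTZ variants lives in one trait.
  trait TimestampNanosTypeOps extends TypeOps {
    override def getPhysicalType: String = "PhysicalTimestampNanos" // assumed name
    override def getJavaClass: String = "TimestampNanosVal"
  }

  final case class TimestampNTZNanosTypeOps(dataType: TimestampNTZNanosType)
    extends TimestampNanosTypeOps

  final case class TimestampLTZNanosTypeOps(dataType: TimestampLTZNanosType)
    extends TimestampNanosTypeOps

  object TypeOps {
    // Returns Some(ops) only for registered types while the framework is on;
    // None tells the caller to use its legacy path.
    def apply(dt: DataType, frameworkEnabled: Boolean): Option[TypeOps] =
      if (!frameworkEnabled) None
      else dt match {
        case t: TimestampNTZNanosType => Some(TimestampNTZNanosTypeOps(t))
        case t: TimestampLTZNanosType => Some(TimestampLTZNanosTypeOps(t))
        case _ => None
      }
  }
}
```

In this sketch the two concrete ops classes add nothing beyond their data type, which is the
point of the shared trait: any NTZ- or LTZ-specific behavior would be overridden in the
concrete class only.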

Add TypeApiOps stubs (sql/api):
- TimestampNanosTypeApiOps base with TimestampNTZNanosTypeApiOps / TimestampLTZNanosTypeApiOps,
  registered in TypeApiOps.apply().
- format / toSQLValue: interim implementation (TimestampNanosVal.toString with NTZ/LTZ prefix)
  until dedicated fractional-second formatters land.
- getEncoder: reports the type as unsupported (UNSUPPORTED_DATA_TYPE_FOR_ENCODER), matching the
  legacy RowEncoder fallback; encoders are out of scope (SPARK-57033).
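
One way to picture the interim stubs is the sketch below. Everything in it is an assumption
made for illustration: NanosVal stands in for TimestampNanosVal, the prefix spellings
TIMESTAMP_NTZ / TIMESTAMP_LTZ and the quoting are guesses at what "NTZ/LTZ prefix" means,
format is assumed to return the plain toString, and UnsupportedEncoderException is a
hypothetical class that only carries the error condition named above.

```scala
// Hypothetical stand-ins; not the real sql/api classes or signatures.
object TypeApiOpsSketch {
  // Stand-in for TimestampNanosVal; only its toString matters here.
  final case class NanosVal(text: String) {
    override def toString: String = text
  }

  // Hypothetical exception that carries an error condition name.
  final class UnsupportedEncoderException(val condition: String, typeName: String)
    extends RuntimeException(s"[$condition] $typeName")

  trait TimestampNanosTypeApiOps {
    def prefix: String   // assumed spelling of the SQL literal prefix
    def typeName: String

    // Interim formatting: reuse the value's own toString.
    def format(v: NanosVal): String = v.toString

    // Interim SQL literal: prefix plus the quoted toString.
    def toSQLValue(v: NanosVal): String = s"$prefix '${v.toString}'"

    // Encoders are out of scope, so the stub reports the type as unsupported.
    def getEncoder: Nothing =
      throw new UnsupportedEncoderException("UNSUPPORTED_DATA_TYPE_FOR_ENCODER", typeName)
  }

  object NtzOps extends TimestampNanosTypeApiOps {
    val prefix = "TIMESTAMP_NTZ"
    val typeName = "timestamp_ntz_nanos" // assumed name
  }

  object LtzOps extends TimestampNanosTypeApiOps {
    val prefix = "TIMESTAMP_LTZ"
    val typeName = "timestamp_ltz_nanos" // assumed name
  }
}
```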

Mutable holder:
- Add MutableTimestampNanos to SpecificInternalRow to avoid the MutableAny fallback.
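
As a rough illustration of a dedicated holder next to the generic fallback, consider the
sketch below. It is loosely modeled on the mutable-holder idea and is not the Spark source:
NanosVal and its fields, the string type names, and holderFor are hypothetical, and the
MutableValue shape is simplified.

```scala
// Hypothetical, simplified holders; not the actual SpecificInternalRow code.
object MutableHolderSketch {
  // Stand-in for the nanosecond value; the real layout may differ.
  final case class NanosVal(epochMicros: Long, nanosWithinMicro: Int)

  abstract class MutableValue {
    var isNull: Boolean = true
    def boxed: Any
    def update(v: Any): Unit
    def copy(): MutableValue
  }

  // Generic fallback: holds any value in an untyped field.
  final class MutableAny extends MutableValue {
    var value: Any = null
    def boxed: Any = if (isNull) null else value
    def update(v: Any): Unit = { isNull = false; value = v }
    def copy(): MutableAny = {
      val c = new MutableAny
      c.isNull = isNull
      c.value = value
      c
    }
  }

  // Dedicated holder: the field is typed, so a wrong value fails at update time.
  final class MutableTimestampNanos extends MutableValue {
    var value: NanosVal = null
    def boxed: Any = if (isNull) null else value
    def update(v: Any): Unit = { isNull = false; value = v.asInstanceOf[NanosVal] }
    def copy(): MutableTimestampNanos = {
      val c = new MutableTimestampNanos
      c.isNull = isNull
      c.value = value
      c
    }
  }

  // Toy selection logic: the nanosecond types get the dedicated holder.
  def holderFor(typeName: String): MutableValue = typeName match {
    case "timestamp_ntz_nanos" | "timestamp_ltz_nanos" => new MutableTimestampNanos
    case _ => new MutableAny
  }
}
```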

Feature flag:
- All registration is gated by spark.sql.types.framework.enabled (same as TimeType).
- When the flag is false, behavior remains identical to the current legacy paths.

### Integration points (automatic when TypeOps returns Some)

These call sites already delegate to TypeOps(dt).map(...).getOrElse(legacy); no per-call-site
edits are required beyond registration: PhysicalDataType.apply, Literal.default,
InternalRow.getWriter / getAccessor, CodeGenerator Java class for codegen, and
SpecificInternalRow mutable column values.
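
The call-site shape mentioned above, TypeOps(dt).map(...).getOrElse(legacy), might look like
this in miniature. The types, the typeOps lookup, the enabled parameter, and the string
results are all hypothetical placeholders; only the map/getOrElse structure mirrors the
description.

```scala
// Hypothetical stand-ins; only the map/getOrElse shape mirrors the issue text.
object DelegationSketch {
  sealed trait DataType
  case object LongType extends DataType
  final case class TimestampNTZNanosType(precision: Int) extends DataType

  trait TypeOps {
    def defaultLiteral: String
  }

  // Toy registry: Some(ops) for a registered type while the flag is on.
  def typeOps(dt: DataType, enabled: Boolean): Option[TypeOps] = dt match {
    case TimestampNTZNanosType(_) if enabled =>
      Some(new TypeOps { def defaultLiteral: String = "nanos-zero" })
    case _ => None
  }

  // Legacy path that predates registration and remains the fallback.
  def legacyDefault(dt: DataType): String = dt match {
    case TimestampNTZNanosType(_) => "nanos-zero"
    case LongType => "0L"
  }

  // Call site: delegate when ops are registered, otherwise fall back.
  def defaultLiteral(dt: DataType, enabled: Boolean): String =
    typeOps(dt, enabled).map(_.defaultLiteral).getOrElse(legacyDefault(dt))
}
```

Because the call site never names a concrete type, registering a new type changes behavior
here without touching this code, and turning the flag off routes every type back through
the legacy function.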

### Tests

New TimestampNanosTypeOpsSuite, for p in {7, 8, 9} over NTZ and LTZ:
- TypeOps / TypeApiOps are registered when the framework is enabled.
- PhysicalDataType, Literal.default value, and codegen Java class are correct.
- InternalRow and SpecificInternalRow set/read roundtrips.
- SpecificInternalRow uses the dedicated MutableTimestampNanos holder.
- getEncoder reports UNSUPPORTED_DATA_TYPE_FOR_ENCODER.
- toSQLValue uses the NTZ/LTZ literal prefix.
- Framework-disabled fallback produces identical results.

### Out of scope

- New abstract Types Framework methods for codegen (kept in SPARK-57101 / PR #56199).
- CatalystTypeConverters / java.time roundtrip (SPARK-57033), encoders, Connect proto, Arrow,
  PySpark conversion, cast/Parquet/ColumnVector, and physical ordering/compare/hash.

### Depends on

- SPARK-56981 (physical row layer and TimestampNanosVal)

### References

- SPARK-56822 - parent SPIP
- SPARK-53504 - Types Framework
- Precedent: org.apache.spark.sql.catalyst.types.ops.TimeTypeOps


