[PR] [SPARK-57164][SQL][TESTS] Add parser test coverage for nanosecond-capable timestamp types across data-type string entry points [spark]

via GitHub Mon, 15 Jun 2026 03:06:55 -0700


MaxGekk opened a new pull request, #56514:
URL: https://github.com/apache/spark/pull/56514


   ### What changes were proposed in this pull request?
   Test-only changes (sub-task of 
[SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822), SPIP: 
Timestamps with nanosecond precision) adding focused coverage that the 
nanosecond-capable timestamp spellings (`TIMESTAMP_NTZ(p)`, `TIMESTAMP_LTZ(p)`, 
and the `TIMESTAMP(p) WITH[OUT] [LOCAL] TIME ZONE` aliases, `p` in `[7, 9]`) 
parse consistently across the public string-to-DataType entry points. All new 
assertions are gated by `spark.sql.timestampNanosTypes.enabled`.
   
   Coverage added:
   - Catalyst `DataTypeParserSuite`: `DataType.fromDDL`, `StructType.fromDDL`, 
`StructType.add(name, String)`, and `DataType.parseTypeWithFallback` (the DDL 
path plus its JSON fallback, which guards against the ANTLR and JSON parser 
families drifting apart).
   - Catalyst `ExpressionParserSuite`: `CAST` / `TRY_CAST` to nanos types.
   - Catalyst `DDLParserSuite`: `CREATE TABLE`, `ALTER TABLE ADD COLUMNS`, 
`ALTER COLUMN ... TYPE`, and a column `DEFAULT` declared with a nanos type.
   - Catalyst `DataTypeSuite`: the `typeName` re-parse leg of the JSON-family 
(`nameToType`) round-trip.
   - `ColumnExpressionSuite`: `Column.cast(String)` / `try_cast(String)` and 
`sessionState.sqlParser.parseDataType(String)`.
   - `DataFrameReaderWriterSuite` / `DataStreamReaderWriterSuite`: 
`schema(String)`.
   - `JsonFunctionsSuite` / `CsvFunctionsSuite` / `XmlFunctionsSuite`: 
`from_json` / `from_csv` / `from_xml` with nanos DDL schema strings, asserting 
the string resolves to the nanos type but the datasource rejects it at 
execution (JSON/CSV: `UNSUPPORTED_DATATYPE`; XML: `MALFORMED_RECORD_IN_PARSING` 
in FAILFAST mode).
   
   ### Why are the changes needed?
   Spark parses data-type strings through two independent parser families 
(ANTLR `DataTypeAstBuilder` and the hand-maintained JSON `nameToType` in 
`DataType.scala`), reached through many distinct public surfaces. The nanos 
parsing was previously exercised mainly via `CatalystSqlParser.parseDataType`; 
the other public entry points had no explicit assertions, so a regression on 
any one of them (or drift between the two families) could go unnoticed.
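
The drift risk described above can be illustrated with a small, self-contained toy model. This is a sketch only and does not use Spark: `parse_ddl`, `parse_json_name`, and `find_drift` are hypothetical names, the regex stands in for a grammar-based parser, and the dictionary stands in for a hand-maintained name-to-type table. The only detail taken from this PR is the precision range `p` in `[7, 9]`.

```python
import re

# Toy model, NOT Spark's parsers: two independent "families" that each map
# a type string to a (name, precision) tuple.

NANOS_PRECISIONS = range(7, 10)  # p in [7, 9], as stated in the PR description

_DDL_PATTERN = re.compile(
    r"^\s*(TIMESTAMP_NTZ|TIMESTAMP_LTZ)\s*\(\s*(\d+)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_ddl(text):
    """Grammar-style family: a regex stand-in for a generated parser."""
    match = _DDL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    name, precision = match.group(1).lower(), int(match.group(2))
    if precision not in NANOS_PRECISIONS:
        raise ValueError(f"unsupported precision in {text!r}")
    return (name, precision)


# Table-style family: a hand-maintained lookup keyed by the canonical name.
_NAME_TO_TYPE = {
    f"{name}({p})": (name, p)
    for name in ("timestamp_ntz", "timestamp_ltz")
    for p in NANOS_PRECISIONS
}


def parse_json_name(text):
    """Lookup-style family: accepts only spellings present in the table."""
    key = text.strip().lower()
    if key not in _NAME_TO_TYPE:
        raise ValueError(f"cannot parse {text!r}")
    return _NAME_TO_TYPE[key]


def find_drift(spellings):
    """Return the spellings on which the two families disagree.

    A disagreement is either a different parsed result or one family
    accepting a spelling that the other rejects.
    """
    def attempt(parser, spelling):
        try:
            return parser(spelling)
        except ValueError:
            return None

    return [
        s for s in spellings
        if attempt(parse_ddl, s) != attempt(parse_json_name, s)
    ]
```

In this model the canonical spellings agree across both families, while a spelling with inner whitespace such as `TIMESTAMP_NTZ( 9 )` is accepted by the regex family and rejected by the lookup family. Running every spelling through every entry point, as the suites in this PR do against the real parsers, is what surfaces that kind of divergence.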
   
   ### Does this PR introduce _any_ user-facing change?
   No. Test-only.
   
   ### How was this patch tested?
   Ran the affected suites locally; all pass:
   - `build/sbt 'catalyst/testOnly *DataTypeParserSuite *DataTypeSuite 
*ExpressionParserSuite *DDLParserSuite'`
   - `build/sbt 'sql/testOnly *ColumnExpressionSuite 
*DataFrameReaderWriterSuite *DataStreamReaderWriterSuite *JsonFunctionsSuite 
*CsvFunctionsSuite *XmlFunctionsSuite'`
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Generated-by: Cursor (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


