[
https://issues.apache.org/jira/browse/SPARK-57555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090439#comment-18090439
]
Shrirang Mhalgi commented on SPARK-57555:
-----------------------------------------
Hi [~maxgekk],
I would like to take on the core JDBC path for this one:
{{getCatalystType}}, {{getJdbcType}}, value read/write via
{{LocalTime}}, back-compat gating, and precision-aware schema inference. I
will leave per-dialect overrides as follow-up sub-tasks, as you suggested.
Let me know if it is available. Thank you.
> Support the TIME data type in the built-in JDBC data source
> -----------------------------------------------------------
>
> Key: SPARK-57555
> URL: https://issues.apache.org/jira/browse/SPARK-57555
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Major
>
> h2. What
> Add native {{TIME}} support to the built-in JDBC data source so that SQL {{TIME}} columns
> in external databases map to Spark's {{TimeType}} on read, and {{TimeType}} columns map to
> SQL {{TIME}} on write.
> h2. Why
> This is the core motivation of the SPIP (SPARK-51162, Q5): simplify migration from
> PostgreSQL, Snowflake, Redshift, Teradata, DB2, etc. Today JDBC cannot represent TIME:
> * On read, {{JdbcUtils.getCatalystType}} maps {{java.sql.Types.TIME}} to a *timestamp* type, not {{TimeType}}.
> * On write, {{JdbcUtils.getJdbcType}} has no {{case TimeType}}.
> * No dialect maps TIME.
> h2. Scope
> * {{JdbcUtils.getCatalystType}}: map {{java.sql.Types.TIME}} to {{TimeType}} (gated by the TIME feature flag and a JDBC option for back-compat).
> * {{JdbcUtils.getJdbcType}}: add {{case TimeType => JdbcType("TIME", java.sql.Types.TIME)}}.
> * Value conversion in {{JdbcUtils}} (read {{ResultSet.getTime}}/{{getObject}} -> nanos-of-day; write nanos-of-day -> {{java.sql.Time}}/{{LocalTime}}), preserving sub-second precision where the driver allows.
> * Per-dialect overrides where the SQL type name/precision differs (PostgreSQL, MySQL, MS SQL Server, Oracle, DB2, H2) in {{JdbcDialects}} and dialect classes. Consider splitting per-dialect work into sub-tasks.
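
The value-conversion bullet can be sketched in plain Java, using only {{java.sql.Time}} and {{java.time.LocalTime}} from the JDK. This is a minimal illustration and not the {{JdbcUtils}} implementation: the class and method names are hypothetical, and it assumes the internal value is a {{long}} count of nanoseconds since midnight, as the issue's "nanos-of-day" wording suggests. It also shows why the {{LocalTime}} path is the one that preserves sub-second precision: the {{java.sql.Time}} conversions used here carry only hour, minute, and second.

```java
import java.sql.Time;
import java.time.LocalTime;

// Hypothetical helper; the names are illustrative and not part of Spark.
final class TimeConversionSketch {

    // Read path, LocalTime: a JDBC 4.2 driver can return LocalTime from
    // ResultSet.getObject(column, LocalTime.class), keeping nanoseconds.
    static long localTimeToNanosOfDay(LocalTime t) {
        return t.toNanoOfDay();
    }

    // Read path, java.sql.Time: Time.toLocalTime() keeps only hour, minute,
    // and second, so any fractional part is lost on this path.
    static long sqlTimeToNanosOfDay(Time t) {
        return t.toLocalTime().toNanoOfDay();
    }

    // Write path, LocalTime: suitable for PreparedStatement.setObject(i, localTime)
    // when the driver accepts java.time values.
    static LocalTime nanosOfDayToLocalTime(long nanosOfDay) {
        return LocalTime.ofNanoOfDay(nanosOfDay);
    }

    // Write path, java.sql.Time: Time.valueOf(LocalTime) drops the nanosecond field.
    static Time nanosOfDayToSqlTime(long nanosOfDay) {
        return Time.valueOf(LocalTime.ofNanoOfDay(nanosOfDay));
    }

    public static void main(String[] args) {
        long nanos = localTimeToNanosOfDay(LocalTime.of(1, 2, 3, 500));
        System.out.println(nanos + " -> " + nanosOfDayToLocalTime(nanos)
            + " / " + nanosOfDayToSqlTime(nanos));
    }
}
```

Whether a given driver actually returns or accepts {{LocalTime}} varies, which is presumably why the issue says "where the driver allows".
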
> h2. Compatibility
> * A back-compat switch so existing pipelines that read TIME-as-timestamp do not break; document the new behavior.
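
One way to picture the back-compat gating is a small decision function. The sketch below is standalone Java, not Spark code: the {{CatalystKind}} enum stands in for the Catalyst types, and the option name {{legacyTimeAsTimestamp}} and the {{timeTypeEnabled}} parameter are assumptions made for illustration, since the issue does not name the flag or the JDBC option.

```java
import java.sql.Types;
import java.util.Map;

// Hypothetical sketch of the gating logic; not the JdbcUtils implementation.
final class TimeMappingSketch {

    // Stand-in for the Catalyst types involved in the decision.
    enum CatalystKind { TIME, TIMESTAMP, OTHER }

    // Assumed option name; the issue only says "a JDBC option for back-compat".
    static final String LEGACY_OPTION = "legacyTimeAsTimestamp";

    static CatalystKind catalystKindFor(
            int sqlType, boolean timeTypeEnabled, Map<String, String> options) {
        if (sqlType != Types.TIME) {
            return CatalystKind.OTHER; // other SQL types are out of scope here
        }
        boolean legacy =
            Boolean.parseBoolean(options.getOrDefault(LEGACY_OPTION, "false"));
        // Map to TIME only when the feature is on and the reader has not
        // opted back into the old TIME-as-timestamp behavior.
        return (timeTypeEnabled && !legacy) ? CatalystKind.TIME : CatalystKind.TIMESTAMP;
    }

    public static void main(String[] args) {
        System.out.println(catalystKindFor(Types.TIME, true, Map.of()));
        System.out.println(catalystKindFor(Types.TIME, true, Map.of(LEGACY_OPTION, "true")));
    }
}
```

With this shape, an existing pipeline keeps its timestamp schema by setting the option, while new readers get the TIME mapping by default once the feature flag is enabled.
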
> h2. Acceptance criteria
> * Read/write round-trip of TIME columns against the JDBC integration-test databases (Postgres/MySQL/MS SQL/Oracle/DB2) preserves values.
> * Schema inference reports {{TimeType}} for TIME columns.