[ https://issues.apache.org/jira/browse/SPARK-33888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267575#comment-17267575 ]

Stephen Kestle commented on SPARK-33888:
----------------------------------------

The change introduces a bug with respect to Postgres loading of numeric array types:

[https://github.com/apache/spark/blob/f284218dae23bf91e72e221943188cdb85e13dac/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L43]
reads the "scale" metadata for the ARRAY type, but

[https://github.com/apache/spark/commit/0b647fe69cf201b4dcbc0f4dfc0eb504a523571d#diff-c3859e97335ead4b131263565c987d877bea0af3adbd6c5bf2d3716768d2e083L306]
removes "scale" from the default metadata in favour of specifying the exact types
that support it, so the lookup now fails for ARRAY columns.
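
For illustration, here is a minimal, self-contained sketch of the failing metadata lookup and one defensive fallback; the {{ScaleMetadataRepro}} object is illustrative only, and the linked branch may take a different approach (e.g. re-adding "scale" for ARRAY columns on the JdbcUtils side):

{code:scala}
import org.apache.spark.sql.types.MetadataBuilder

object ScaleMetadataRepro {
  def main(args: Array[String]): Unit = {
    // After the linked commit, JdbcUtils only puts "scale" for the exact
    // types that support it, so an ARRAY column's metadata is now empty:
    val meta = new MetadataBuilder().build()

    // What the ARRAY branch of PostgresDialect.getCatalystType effectively
    // does today; this throws NoSuchElementException because the key is gone:
    // val scale = meta.getLong("scale").toInt

    // One possible defensive fallback:
    val scale = if (meta.contains("scale")) meta.getLong("scale").toInt else 0
    println(s"scale = $scale")
  }
}
{code}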

I think I agree with the intent of this change (to be specific about which types 
support which metadata), but it should be amended.

I've built Spark with some changes that I've manually verified as working: 
[https://github.com/apache/spark/compare/master...skestle:SPARK-33888-postgres-fix]

 

> JDBC SQL TIME type represents incorrectly as TimestampType, it should be 
> physical Int in millis
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33888
>                 URL: https://issues.apache.org/jira/browse/SPARK-33888
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.3, 3.0.0, 3.0.1
>            Reporter: Duc Hoa Nguyen
>            Assignee: Apache Spark
>            Priority: Minor
>             Fix For: 3.2.0
>
>
> Currently, for JDBC, the SQL TIME type is represented incorrectly as Spark 
> TimestampType. It should be represented as a physical int in millis: a time 
> of day, with no reference to a particular calendar, time zone, or date, with 
> a precision of one millisecond, stored as the number of milliseconds after 
> midnight, 00:00:00.000.
> We encountered this as the Avro logical type `TimeMillis` not being converted 
> correctly to a Spark `Timestamp` struct type by the `SchemaConverters`; it 
> converts to a regular `int` instead. Reproducible by ingesting data from a 
> MySQL table with a column of TIME type: the Spark JDBC dataframe will get the 
> default mapping (Timestamp), but enforcing our Avro schema 
> (`{"type": "int", "logicalType": "time-millis"}`) externally will fail to 
> apply with the following exception:
> {{java.lang.RuntimeException: java.sql.Timestamp is not a valid external type 
> for schema of int}}
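
For context, a dialect-level workaround can force the physical int mapping today. This is a hypothetical sketch, not the merged fix: the {{TimeMillisDialect}} name and the MySQL URL check are assumptions, and whether a given JDBC driver supports {{getInt}} on a TIME column is not verified here.

{code:scala}
import java.sql.Types
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.{DataType, IntegerType, MetadataBuilder}

// Hypothetical dialect mapping SQL TIME to IntegerType (millis after
// midnight) instead of Spark's default TimestampType mapping.
object TimeMillisDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:mysql")

  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] = {
    // With IntegerType, JdbcUtils reads the column via ResultSet.getInt,
    // whose behaviour on TIME columns is driver-dependent.
    if (sqlType == Types.TIME) Some(IntegerType) else None
  }
}

// Register before reading:
// JdbcDialects.registerDialect(TimeMillisDialect)
{code}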


