[
https://issues.apache.org/jira/browse/SPARK-43267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722029#comment-17722029
]
Jia Fan commented on SPARK-43267:
---------------------------------
https://github.com/apache/spark/pull/40953
> Support creating data frame from a Postgres table that contains user-defined
> array column
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-43267
> URL: https://issues.apache.org/jira/browse/SPARK-43267
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 2.4.0, 3.3.2
> Reporter: Sifan Huang
> Priority: Blocker
>
> Spark SQL currently does not support creating a DataFrame from a Postgres
> table that contains a user-defined array column. It did allow such columns
> before the Postgres JDBC commit
> (https://github.com/pgjdbc/pgjdbc/commit/375cb3795c3330f9434cee9353f0791b86125914).
> Before that commit, a user-defined array column was handled as a String.
> Given:
> * Postgres table with user-defined array column
> * Function: DataFrameReader.jdbc -
> https://spark.apache.org/docs/2.4.0/api/java/org/apache/spark/sql/DataFrameReader.html#jdbc-java.lang.String-java.lang.String-java.util.Properties-
> Results:
> * Exception “java.sql.SQLException: Unsupported type ARRAY” is thrown
> Expectation after the change:
> * Function call succeeds
> * The user-defined array is converted to a string in the Spark DataFrame
> Suggested fix:
> * Update the “getCatalystType” function in “PostgresDialect” as follows:
> {code:java}
> val catalystType = toCatalystType(typeName.drop(1), size, scale).map(ArrayType(_))
> if (catalystType.isEmpty) Some(StringType) else catalystType
> {code}
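
As a rough illustration of the suggested fix quoted above, here is a minimal, self-contained sketch of the fallback. It does not use Spark: DataType, StringType, IntegerType and ArrayType below are simplified stand-ins for the Catalyst types, toCatalystType is a hypothetical two-case placeholder for the helper in PostgresDialect (the size and scale arguments are omitted), and arrayCatalystType and the type name "_my_enum" are made up for the example. It is not the code from the linked pull request.

```scala
// Stand-ins for Spark's Catalyst types so the sketch runs without Spark on the classpath.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
final case class ArrayType(elementType: DataType) extends DataType

object ArrayFallbackSketch {
  // Hypothetical, heavily simplified placeholder for PostgresDialect.toCatalystType:
  // it recognizes only a couple of built-in element types and returns None otherwise.
  def toCatalystType(typeName: String): Option[DataType] = typeName match {
    case "int4"             => Some(IntegerType)
    case "text" | "varchar" => Some(StringType)
    case _                  => None
  }

  // Postgres array type names start with an underscore (e.g. "_int4"), so the
  // element type name is typeName.drop(1). If the element type is unknown
  // (for example a user-defined type), fall back to StringType instead of None.
  def arrayCatalystType(typeName: String): Option[DataType] =
    toCatalystType(typeName.drop(1)).map(ArrayType(_)).orElse(Some(StringType))

  def main(args: Array[String]): Unit = {
    println(arrayCatalystType("_int4"))    // array of a built-in element type
    println(arrayCatalystType("_my_enum")) // array of a user-defined element type
  }
}
```

The orElse form is equivalent to the isEmpty check in the quoted snippet. One consequence of this approach, under the assumptions above, is that a user-defined array column would surface as a plain string rather than as an array, which matches the "previous behavior" described in the issue.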