GitHub user jmchung opened a pull request:
https://github.com/apache/spark/pull/19567
[SPARK-22291] Postgresql UUID[] to Cassandra: Conversion Error
## What changes were proposed in this pull request?
This PR fixes the conversion error that occurs when reading data from a
PostgreSQL table that contains columns of the `UUID[]` data type.
For example, create a table with a `UUID[]` column and insert the test
data.
```SQL
CREATE TABLE users
(
    id smallint NOT NULL,
    name character varying(50),
    user_ids uuid[],
    PRIMARY KEY (id)
);
INSERT INTO users ("id", "name", "user_ids")
VALUES (1, 'foo', ARRAY
    ['7be8aaf8-650e-4dbb-8186-0a749840ecf2'
    ,'205f9bfc-018c-4452-a605-609c0cfad228']::UUID[]
);
```
Loading the data from this table then throws the following exception.
```
java.lang.ClassCastException: [Ljava.util.UUID; cannot be cast to [Ljava.lang.String;
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$14.apply(JdbcUtils.scala:459)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$14.apply(JdbcUtils.scala:458)
...
```
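The failure mode can be reproduced outside Spark. The following is a minimal sketch, not the code in this patch: it assumes the JDBC driver hands back a `java.util.UUID[]` (as the exception message indicates), and the object and method names (`UuidArrayConversion`, `unsafeLength`, `safeConvert`) are hypothetical. It shows why a direct cast to `Array[String]` fails, and how treating the elements as plain objects and calling `toString` on each one avoids the cast.

```scala
import java.util.UUID
import scala.util.Try

// Hypothetical illustration only; none of these names exist in Spark.
object UuidArrayConversion {

  // Mirrors the failing pattern: the array from the driver is cast to String[].
  // For a UUID[] the JVM rejects the cast with a ClassCastException.
  def unsafeLength(driverArray: AnyRef): Try[Int] =
    Try(driverArray.asInstanceOf[Array[String]].length)

  // JVM arrays are covariant, so UUID[] can be viewed as Object[].
  // Each element is then converted with toString, keeping nulls as nulls.
  def safeConvert(driverArray: AnyRef): Array[String] =
    driverArray
      .asInstanceOf[Array[AnyRef]]
      .map(v => if (v == null) null else v.toString)

  def main(args: Array[String]): Unit = {
    // Stand-in for the value a JDBC driver might return for a uuid[] column.
    val fromDriver: AnyRef = Array(
      UUID.fromString("7be8aaf8-650e-4dbb-8186-0a749840ecf2"),
      UUID.fromString("205f9bfc-018c-4452-a605-609c0cfad228"))

    println(unsafeLength(fromDriver)) // a Failure wrapping ClassCastException
    println(safeConvert(fromDriver).mkString("[", ", ", "]"))
  }
}
```

The element-wise conversion also works when the driver does return `String[]`, since `toString` on a `String` returns the same value.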
## How was this patch tested?
Existing tests.
I tried to reproduce the above case as a test in `JDBCSuite`, but `ARRAY`
is currently an unsupported type there. Therefore I ran the above example
against my own Postgres instance and verified the fix with the following code.
```scala
val opts = Map(
  "url" -> "jdbc:postgresql://localhost:5432/postgres?user=postgres&password=postgres",
  "dbtable" -> "users")
val df = spark.read.format("jdbc").options(opts).load()
df.show(truncate = false)
// +---+----+----------------------------------------------------------------------------+
// |id |name|user_ids                                                                    |
// +---+----+----------------------------------------------------------------------------+
// |1  |foo |[7be8aaf8-650e-4dbb-8186-0a749840ecf2, 205f9bfc-018c-4452-a605-609c0cfad228]|
// +---+----+----------------------------------------------------------------------------+
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jmchung/spark SPARK-22291
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19567.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19567
----
commit d84b1bb89e3be33931531345fb23cadd8fe6868f
Author: Jen-Ming Chung <[email protected]>
Date: 2017-10-24T18:24:43Z
[SPARK-22291] Postgresql UUID[] to Cassandra: Conversion Error
----