[ https://issues.apache.org/jira/browse/SPARK-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15341294#comment-15341294 ]
Sergey Bahchissaraitsev commented on SPARK-15987:
-------------------------------------------------
Casting could be a workaround: I tried creating a view that casts the citext
type to a regular varchar, and it worked. However, I don't think that
should be the way to go.
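
For reference, a minimal sketch of the casting workaround. It uses a subquery
passed as the JDBC table name rather than a database view, which is a
substitution on my part, and the helper cast_subquery, the table "users" and
the column "email" are hypothetical names used only for illustration:

```python
def cast_subquery(table, columns, casts, alias="t"):
    """Build a subquery string that casts selected columns to another SQL type.

    Hypothetical helper: `casts` maps a column name to the target SQL type.
    """
    parts = []
    for col in columns:
        if col in casts:
            # Cast the column and keep its original name.
            parts.append("CAST({c} AS {t}) AS {c}".format(c=col, t=casts[col]))
        else:
            parts.append(col)
    return "(SELECT {cols} FROM {tbl}) AS {a}".format(
        cols=", ".join(parts), tbl=table, a=alias)


# Assumed example schema: users(id integer, email citext)
dbtable = cast_subquery("users", ["id", "email"], {"email": "varchar"})
print(dbtable)

# Not executed here; `url` and `props` are placeholders for a real connection:
#   df = sqlContext.read.jdbc(url=url, table=dbtable, properties=props)
```

Spark then only sees a varchar column, so the type lookup never reaches the
OTHER branch.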
The 1111 type in the error message indicates that the column is being treated as
java.sql.Types.OTHER, and Spark automatically throws an exception in that case:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L89
While this might be OK as the default behavior, there could be an option to
specify how to treat the OTHER type.
In this case, the user could specify StringType as Takeshi suggested, but in
other cases with other PostgreSQL (or maybe even non-PostgreSQL) extensions, we
could just as well use BinaryType or any other type the user of the application
sees fit.
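
To illustrate the idea, here is a rough sketch of the proposed behavior. It is
not Spark's actual code or API: resolve_type and its other_as parameter are
hypothetical, type names are plain strings, and only two built-in mappings are
shown for brevity:

```python
# JDBC type codes as defined in java.sql.Types.
JDBC_INTEGER = 4
JDBC_VARCHAR = 12
JDBC_OTHER = 1111

# Tiny stand-in for the built-in JDBC-to-Catalyst mapping (illustrative only).
BUILTIN_TYPES = {
    JDBC_INTEGER: "IntegerType",
    JDBC_VARCHAR: "StringType",
}


def resolve_type(sql_type, other_as=None):
    """Map a JDBC type code to a Catalyst type name.

    `other_as` is the hypothetical user option: when set, columns reported
    as OTHER are read as that type instead of raising an error.
    """
    if sql_type in BUILTIN_TYPES:
        return BUILTIN_TYPES[sql_type]
    if sql_type == JDBC_OTHER and other_as is not None:
        return other_as
    # Default behavior stays the same as today: fail on unknown types.
    raise ValueError("Unsupported type %d" % sql_type)


print(resolve_type(JDBC_OTHER, other_as="StringType"))
```

Without the option, the default stays unchanged, so existing jobs would still
fail with "Unsupported type 1111" instead of silently reading the column as
some guessed type.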
Could this work?
Thanks.
> PostgreSQL CITEXT type JDBC support
> -----------------------------------
>
> Key: SPARK-15987
> URL: https://issues.apache.org/jira/browse/SPARK-15987
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 1.6.1
> Environment: Ubuntu 14.04
> PostgreSQL 9.3.9
> Reporter: Sergey Bahchissaraitsev
> Labels: dataframe, jdbc, postgresql
>
> When trying to use a Spark DataFrame on a table with a CITEXT column, you get
> the following error:
> Exception in thread "main" java.sql.SQLException: Unsupported type 1111
> at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$getCatalystType(JDBCRDD.scala:102)
> at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$1.apply(JDBCRDD.scala:141)
> at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$1.apply(JDBCRDD.scala:141)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:140)
> at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:222)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:208)