[
https://issues.apache.org/jira/browse/SPARK-46612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kent Yao updated SPARK-46612:
-----------------------------
Parent: SPARK-47361
Issue Type: Sub-task (was: Bug)
> Clickhouse's JDBC throws `java.lang.IllegalArgumentException: Unknown data
> type: string` when write array string with Apache Spark scala
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-46612
> URL: https://issues.apache.org/jira/browse/SPARK-46612
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: Nguyen Phan Huy
> Assignee: Nguyen Phan Huy
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This issue is also reported on ClickHouse's GitHub:
> [https://github.com/ClickHouse/clickhouse-java/issues/1505]
> h3. Bug description
> When Spark (Scala) writes an array of strings to ClickHouse, the driver
> throws a {{java.lang.IllegalArgumentException: Unknown data type: string}}
> exception.
> The exception is thrown at:
> [https://github.com/ClickHouse/clickhouse-java/blob/aa3870eadb1a2d3675fd5119714c85851800f076/clickhouse-data/src/main/java/com/clickhouse/data/ClickHouseDataType.java#L238]
> This happens because Spark's JdbcUtils converts the type name to lower case
> ({{String}} -> {{string}}), which ClickHouse does not recognize:
> [https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639]
> h3. Steps to reproduce
> # Create a ClickHouse table with a String array field
> ([https://clickhouse.com/]).
> # Write data to the table with Spark (Scala) via ClickHouse's JDBC driver
> ([https://github.com/ClickHouse/clickhouse-java]).
> {code:java}
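> // Code excerpt; assumes a Scala Spark job (with `spark` and `sc` in scope)
> // and the ClickHouse JDBC driver on the classpath.
> import java.util.Properties
> import org.apache.spark.sql.{Row, SaveMode}
> import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}
>
> // Schema with a single array-of-strings column.
> val clickHouseSchema = StructType(
>   Seq(
>     StructField("str_array", ArrayType(StringType))
>   )
> )
> val data = Seq(
>   Row(
>     Seq("a", "b")
>   )
> )
> val clickHouseDf = spark.createDataFrame(sc.parallelize(data), clickHouseSchema)
>
> val props = new Properties
> props.put("user", "default")
> // The append below triggers the exception described above.
> clickHouseDf.write
>   .mode(SaveMode.Append)
>   .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
>   .jdbc("jdbc:clickhouse://localhost:8123/foo", "bar", props)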
> {code}
> h2. Fix
> - [https://github.com/apache/spark/pull/44459]
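
The failure mode in the quoted bug description can be illustrated without Spark or ClickHouse. The following is only a minimal sketch: `knownTypes`, `resolveType`, and `lowercased` are hypothetical names made up for illustration, standing in for a case-sensitive type lookup and for the lowercasing step the report attributes to JdbcUtils. It is not the driver's or Spark's actual code.

```scala
import java.util.Locale

object LowercasedTypeNameSketch {
  // Hypothetical stand-in for a case-sensitive type registry
  // (an assumption for illustration, not the ClickHouse driver's code).
  private val knownTypes: Set[String] = Set("String", "Int32", "Float64")

  // Looks up a type name exactly as given; unknown names are rejected.
  def resolveType(name: String): String =
    if (knownTypes.contains(name)) name
    else throw new IllegalArgumentException(s"Unknown data type: $name")

  // Models the lowercasing step described in the report.
  def lowercased(typeName: String): String =
    typeName.toLowerCase(Locale.ROOT)

  def main(args: Array[String]): Unit = {
    // The original-case name resolves.
    println(resolveType("String"))
    // The lowercased name no longer matches the case-sensitive registry.
    try resolveType(lowercased("String"))
    catch { case e: IllegalArgumentException => println(e.getMessage) }
  }
}
```

Running `main` prints `String` followed by `Unknown data type: string`, which mirrors the message in the report: the name is valid until its case is changed.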