[ 
https://issues.apache.org/jira/browse/SPARK-46612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nguyen Phan Huy updated SPARK-46612:
------------------------------------
    Description: 
This issue is also reported on ClickHouse's GitHub: 
[https://github.com/ClickHouse/clickhouse-java/issues/1505] 
h3. Bug description



When using Spark (Scala) to write an array of strings to ClickHouse, the driver 
throws a {{java.lang.IllegalArgumentException: Unknown data type: string}} 
exception.

The exception is thrown by: 
[https://github.com/ClickHouse/clickhouse-java/blob/aa3870eadb1a2d3675fd5119714c85851800f076/clickhouse-data/src/main/java/com/clickhouse/data/ClickHouseDataType.java#L238]

This is caused by Spark's JdbcUtils converting the type name to lower case 
({{String}} -> {{string}}):
[https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639]
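The failure mode can be illustrated without Spark or ClickHouse. The following is a minimal sketch, not the actual code of either project: {{knownTypes}} and {{lookup}} are hypothetical stand-ins for a driver-side registry with case-sensitive type names, and {{normalize}} is an assumed approximation of the lower-casing step at the linked JdbcUtils line.
{code:scala}
import java.util.Locale
import scala.util.Try

object LowerCaseTypeNameSketch {
  // Hypothetical stand-in for a driver-side registry whose type names are case-sensitive.
  val knownTypes: Set[String] = Set("String", "Int32", "Float64")

  def lookup(typeName: String): String =
    if (knownTypes.contains(typeName)) typeName
    else throw new IllegalArgumentException(s"Unknown data type: $typeName")

  // Assumed shape of the normalization: lower-case the name and drop any length suffix.
  def normalize(typeDefinition: String): String =
    typeDefinition.toLowerCase(Locale.ROOT).split("\\(")(0)

  def main(args: Array[String]): Unit = {
    println(Try(lookup("String")))            // lookup succeeds with the original casing
    println(Try(lookup(normalize("String")))) // lookup fails after lower-casing
  }
}
{code}
Any driver that resolves element type names case-sensitively would be affected in the same way once the name has been lower-cased.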
h3. Steps to reproduce
 # Create a ClickHouse table with a String array field ([https://clickhouse.com/]).
 # Write data to the table with Spark (Scala) via ClickHouse's JDBC driver 
([https://github.com/ClickHouse/clickhouse-java]):
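{code:scala}
// Code excerpt; run it inside a Scala Spark job with the ClickHouse JDBC driver on the classpath.
import java.util.Properties
import org.apache.spark.sql.{Row, SaveMode}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

val clickHouseSchema = StructType(
  Seq(
    StructField("str_array", ArrayType(StringType))
  )
)
val data = Seq(
  Row(
    Seq("a", "b")
  )
)

// `spark` is the active SparkSession.
val clickHouseDf = spark.createDataFrame(spark.sparkContext.parallelize(data), clickHouseSchema)

val props = new Properties
props.put("user", "default")
clickHouseDf.write
  .mode(SaveMode.Append)
  // The driver option takes the class name as a string.
  .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
  .jdbc("jdbc:clickhouse://localhost:8123/foo", table = "bar", props)
{code}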
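Step 1 does not show a table definition. The following is a minimal sketch of one way to create a matching table over JDBC, assuming the database {{foo}} from the JDBC URL already exists. The {{MergeTree}} engine and {{ORDER BY tuple()}} are placeholder assumptions, not taken from the report, and the object name is hypothetical.
{code:scala}
import java.sql.DriverManager
import java.util.Properties

object CreateStringArrayTable {
  // Assumed DDL: the column name comes from the schema in the report; the engine and
  // ordering key are placeholders chosen only to make the statement complete.
  val ddl: String =
    """CREATE TABLE IF NOT EXISTS bar (
      |  str_array Array(String)
      |) ENGINE = MergeTree
      |ORDER BY tuple()""".stripMargin

  def main(args: Array[String]): Unit = {
    val props = new Properties
    props.put("user", "default")
    // Same URL as in the report; the database foo is assumed to exist already.
    val conn = DriverManager.getConnection("jdbc:clickhouse://localhost:8123/foo", props)
    try {
      val stmt = conn.createStatement()
      try stmt.execute(ddl) finally stmt.close()
    } finally conn.close()
  }
}
{code}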
h3. Fix

 - [https://github.com/apache/spark/pull/44459] 


> Clickhouse's JDBC throws `java.lang.IllegalArgumentException: Unknown data 
> type: string` when write array string with Apache Spark scala
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-46612
>                 URL: https://issues.apache.org/jira/browse/SPARK-46612
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Nguyen Phan Huy
>            Priority: Major


