This is an automated email from the ASF dual-hosted git repository.
yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 931eb938c70d [SPARK-46612][SQL] Do not convert array type string retrieved from jdbc driver
931eb938c70d is described below
commit 931eb938c70d6c6707e93f164255ac3c7dba63b0
Author: Kent Yao <[email protected]>
AuthorDate: Wed Jan 17 10:58:14 2024 +0800
[SPARK-46612][SQL] Do not convert array type string retrieved from jdbc driver
Hi, thanks for checking the PR. This is a small bug fix that makes Scala Spark work with ClickHouse's array type. Let me know if this could cause problems with other database types.
(Please help trigger CI if possible. I was unable to get the build pipeline to run; any help is appreciated.)
### Why are the changes needed?
This PR fixes the issue described at:
https://github.com/ClickHouse/clickhouse-java/issues/1505
When using Spark to write an array of strings to ClickHouse, the ClickHouse JDBC driver throws a `java.lang.IllegalArgumentException: Unknown data type: string` exception.
The exception occurs because Spark's JDBC utilities pass an invalid type name, `string` (it should be `String`). The type name retrieved from the ClickHouse JDBC driver is correct, but Spark's JDBC utilities convert it to lower case before use:
https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639
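The old and new type-name computations can be sketched in standalone Java as follows (the `before`/`after` helpers are simplified stand-ins for the real `JdbcUtils` code, which calls `getJdbcType(et, dialect).databaseTypeDefinition`):

```java
import java.util.Locale;

public class ArrayTypeNameDemo {
    // Old behavior: lower-case the driver-reported type name, then strip
    // any length parameter such as "(255)".
    static String before(String databaseTypeDefinition) {
        return databaseTypeDefinition.toLowerCase(Locale.ROOT).split("\\(")[0];
    }

    // New behavior: pass the driver-reported type name through unchanged,
    // still stripping only the length parameter.
    static String after(String databaseTypeDefinition) {
        return databaseTypeDefinition.split("\\(")[0];
    }

    public static void main(String[] args) {
        System.out.println(before("String"));      // "string" - rejected by ClickHouse
        System.out.println(after("String"));       // "String" - accepted
        System.out.println(after("VARCHAR(255)")); // "VARCHAR" - length still stripped
    }
}
```

ClickHouse type names are case-sensitive, so the lower-cased `string` no longer matches any type the driver knows about, while databases with case-insensitive type names never noticed the conversion.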
### What changes were proposed in this pull request?
- Remove the lower-case conversion. The type name retrieved from the JDBC driver implementation should be passed through as-is; Spark should not modify the value.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Followed the reproduction steps at https://github.com/ClickHouse/clickhouse-java/issues/1505:
  - Created a ClickHouse table with an array-of-string column
  - Wrote a Scala Spark job that writes to the ClickHouse table
  - Verified that the change fixes the issue
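The failure mode exercised by those steps can be illustrated with a toy stand-in for a case-sensitive driver's type lookup (the `KNOWN_TYPES` set and `createArrayOf` helper below are hypothetical mocks, not the real ClickHouse driver internals):

```java
import java.util.Set;

public class CaseSensitiveDriverDemo {
    // Mock of a driver that, like ClickHouse, treats type names case-sensitively.
    static final Set<String> KNOWN_TYPES = Set.of("String", "Int32", "Float64");

    // Mimics the validation a driver performs inside Connection.createArrayOf.
    static void createArrayOf(String typeName) {
        if (!KNOWN_TYPES.contains(typeName)) {
            throw new IllegalArgumentException("Unknown data type: " + typeName);
        }
    }

    public static void main(String[] args) {
        createArrayOf("String"); // fixed path: name passed through unchanged, accepted
        try {
            createArrayOf("string"); // old path: lower-cased name is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Unknown data type: string
        }
    }
}
```

This reproduces the shape of the exception from the linked issue without needing a running ClickHouse server.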
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #44459 from phanhuyn/do-not-lower-case-jdbc-array-type.
Lead-authored-by: Kent Yao <[email protected]>
Co-authored-by: Nguyen Phan Huy <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
---
.../org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
index 467c489a50fd..51daea76abc5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
@@ -20,7 +20,6 @@ package org.apache.spark.sql.execution.datasources.jdbc
import java.sql.{Connection, JDBCType, PreparedStatement, ResultSet, ResultSetMetaData, SQLException}
import java.time.{Instant, LocalDate}
import java.util
-import java.util.Locale
import java.util.concurrent.TimeUnit
import scala.collection.mutable.ArrayBuffer
@@ -635,8 +634,7 @@ object JdbcUtils extends Logging with SQLConfHelper {
case ArrayType(et, _) =>
// remove type length parameters from end of type name
- val typeName = getJdbcType(et, dialect).databaseTypeDefinition
-   .toLowerCase(Locale.ROOT).split("\\(")(0)
+ val typeName = getJdbcType(et, dialect).databaseTypeDefinition.split("\\(")(0)
(stmt: PreparedStatement, row: Row, pos: Int) =>
val array = conn.createArrayOf(
typeName,
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]