sunxiaoguang commented on code in PR #49453:
URL: https://github.com/apache/spark/pull/49453#discussion_r1936895094


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/MySQLIntegrationSuite.scala:
##########
@@ -241,6 +241,84 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationV2Suite with V2JDBCTest
     assert(rows10(0).getString(0) === "amy")
     assert(rows10(1).getString(0) === "alex")
   }
+
+  // MySQL Connector/J uses collation 'utf8mb4_0900_ai_ci' as collation for connection.
+  // The MySQL server 9.1.0 uses collation 'utf8mb4_0900_ai_ci' for database by default.
+  // This method uses string column directly as the result of cast has the same collation.
+  def testCastStringTarget(stringLiteral: String, stringCol: String): String = stringCol
+
+  test("SPARK-50793: MySQL JDBC Connector failed to cast some types") {
+    val tableName = catalogName + ".test_cast_function"
+    withTable(tableName) {
+      val stringValue = "0"
+      val stringLiteral = "'0'"
+      val stringCol = "string_col"
+      val longValue = 0L
+      val longCol = "long_col"
+      val binaryValue = Array[Byte](0x30)
+      val binaryLiteral = "x'30'"
+      val binaryCol = "binary_col"
+      val doubleValue = 0.0
+      val doubleLiteral = "0.0"
+      val doubleCol = "double_col"
+      // CREATE table to use types defined in Spark SQL
+      sql(
+        s"CREATE TABLE $tableName ($stringCol STRING, $longCol LONG, " +
+          s"$binaryCol BINARY, $doubleCol DOUBLE)")
+      sql(
+        s"INSERT INTO $tableName VALUES($stringLiteral, $longValue, $binaryLiteral, $doubleValue)")

Review Comment:
   No, it is not possible. 
   
   Spark doesn't specify a collation for tables or columns, so string columns 
always use the collation defined by `collation_database`. The string produced 
by a cast, on the other hand, uses the collation defined by 
`collation_connection`.
   
   `MariaDB Connector/J` sets `collation_connection` to `utf8mb4_unicode_ci`, 
whereas `MySQL server 9.1.0` uses `utf8mb4_0900_ai_ci` for 
`collation_database`. With that connector we therefore need to compare against 
a string literal, which also takes its collation from `collation_connection`, 
so that both strings being compared have the same collation.
   
   When `MySQL Connector/J` is used, the connection collation is the same as 
the database collation on the server side, so we can use the column directly.
   
   Prior to this commit, we compared with a string literal, which behaves the 
same with both the MySQL and MariaDB connectors. With this commit, we compare 
with a string literal for `MariaDB Connector/J` and with the string column for 
`MySQL Connector/J`.
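
A minimal sketch of the per-connector choice described above, in case it helps to see it in isolation. Only `testCastStringTarget` and its signature come from the diff. The `CollationSketch` object, the `CastStringTarget` trait, the two connector objects, and the `castPredicate` helper with its predicate shape are hypothetical and are not the actual Spark test suites.

```scala
object CollationSketch {
  // Hypothetical trait modelling the hook; only the method signature is from the PR.
  trait CastStringTarget {
    def testCastStringTarget(stringLiteral: String, stringCol: String): String
  }

  // MySQL Connector/J: connection collation matches the database collation,
  // so the cast result can be compared with the string column directly.
  object MySQLConnectorJ extends CastStringTarget {
    def testCastStringTarget(stringLiteral: String, stringCol: String): String = stringCol
  }

  // MariaDB Connector/J: connection collation differs from the database collation,
  // so the cast result is compared with a literal, which uses collation_connection.
  object MariaDBConnectorJ extends CastStringTarget {
    def testCastStringTarget(stringLiteral: String, stringCol: String): String = stringLiteral
  }

  // Hypothetical helper: builds a comparison between a cast and the chosen target.
  def castPredicate(target: CastStringTarget, longCol: String): String = {
    val rhs = target.testCastStringTarget("'0'", "string_col")
    s"CAST($longCol AS STRING) = $rhs"
  }
}
```

With this sketch, `CollationSketch.castPredicate(CollationSketch.MySQLConnectorJ, "long_col")` compares against `string_col`, while the MariaDB variant compares against the literal `'0'`.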


