sunxiaoguang commented on code in PR #49453:
URL: https://github.com/apache/spark/pull/49453#discussion_r1936895094
##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/MySQLIntegrationSuite.scala:
##########
@@ -241,6 +241,84 @@ class MySQLIntegrationSuite extends
DockerJDBCIntegrationV2Suite with V2JDBCTest
assert(rows10(0).getString(0) === "amy")
assert(rows10(1).getString(0) === "alex")
}
+
+ // MySQL Connector/J uses collation 'utf8mb4_0900_ai_ci' as collation for
connection.
+ // The MySQL server 9.1.0 uses collation 'utf8mb4_0900_ai_ci' for database
by default.
+ // This method uses the string column directly, as the result of the cast has the
same collation.
+ def testCastStringTarget(stringLiteral: String, stringCol: String): String =
stringCol
+
+ test("SPARK-50793: MySQL JDBC Connector failed to cast some types") {
+ val tableName = catalogName + ".test_cast_function"
+ withTable(tableName) {
+ val stringValue = "0"
+ val stringLiteral = "'0'"
+ val stringCol = "string_col"
+ val longValue = 0L
+ val longCol = "long_col"
+ val binaryValue = Array[Byte](0x30)
+ val binaryLiteral = "x'30'"
+ val binaryCol = "binary_col"
+ val doubleValue = 0.0
+ val doubleLiteral = "0.0"
+ val doubleCol = "double_col"
+ // CREATE table to use types defined in Spark SQL
+ sql(
+ s"CREATE TABLE $tableName ($stringCol STRING, $longCol LONG, " +
+ s"$binaryCol BINARY, $doubleCol DOUBLE)")
+ sql(
+ s"INSERT INTO $tableName VALUES($stringLiteral, $longValue,
$binaryLiteral, $doubleValue)")
Review Comment:
No, it is not possible.
Because Spark doesn't specify a collation for tables and columns, the collation
of string columns is always the one defined in `collation_database`. On the
other hand, the collation of a string resulting from a cast is the same as
`collation_connection`.
`MariaDB Connector/J` uses a different `collation_connection`,
`utf8mb4_unicode_ci`, whereas `MySQL server 9.1.0` uses `utf8mb4_0900_ai_ci`
for `collation_database`. So we need to compare against a string literal, which
also uses the collation defined in `collation_connection`, to make sure we are
comparing two strings in the same collation.
When `MySQL Connector/J` is used, the connection collation is the same as the
database collation on the server side, so we can use the column directly.
Prior to this commit, we used a string literal, which behaves the same with both
the MySQL and MariaDB connectors. With this commit, we do it differently: we
compare with a string literal for `MariaDB Connector/J` and with the string
column for `MySQL Connector/J`.
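To make the split concrete, here is a minimal sketch of the idea, not the actual suite code. Only `testCastStringTarget` and its signature come from the diff above; `CollationSketch`, `CastTarget`, the two connector objects, `castPredicate`, and the shape of the generated predicate are hypothetical names used for illustration, and overriding the method for MariaDB is an assumption about how the literal case could be wired in.

```scala
object CollationSketch {
  // Default behaviour, as in the diff: compare against the string column,
  // which is safe when connection and database collations match.
  trait CastTarget {
    def testCastStringTarget(stringLiteral: String, stringCol: String): String =
      stringCol
  }

  // Hypothetical stand-in for the MySQL Connector/J case.
  object MySQLConnectorJ extends CastTarget

  // Hypothetical stand-in for the MariaDB Connector/J case (assumed override):
  // compare against a literal, which takes its collation from the connection,
  // just like the result of the cast.
  object MariaDBConnectorJ extends CastTarget {
    override def testCastStringTarget(stringLiteral: String, stringCol: String): String =
      stringLiteral
  }

  // Builds an illustrative predicate; the real test may phrase it differently.
  def castPredicate(target: CastTarget): String = {
    val rhs = target.testCastStringTarget("'0'", "string_col")
    s"CAST(long_col AS STRING) = $rhs"
  }
}
```

With these stand-ins, `CollationSketch.castPredicate(CollationSketch.MySQLConnectorJ)` compares the cast result with `string_col`, while the MariaDB variant compares it with the literal `'0'`, so both sides of each comparison share one collation.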
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.