[
https://issues.apache.org/jira/browse/SPARK-47707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
王俊博 updated SPARK-47707:
------------------------
Description:
The MySQL JDBC driver {{mysql-connector-java-5.1.49.jar}} maps the MySQL JSON type to
Types.CHAR with a precision of Int.Max.
When it receives a CHAR column with Int.Max precision, the Spark executor throws
{{java.lang.OutOfMemoryError: Requested array size exceeds VM limit}}.
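A minimal reproduction sketch, assuming a spark-shell session ({{spark}} in scope) with {{mysql-connector-java-5.1.49.jar}} on the classpath; the connection options and table name are placeholders:
{code:scala}
// Hypothetical table with a JSON column, read through the 5.1.49 driver.
// The driver reports the JSON column as Types.CHAR with Int.Max precision,
// so fetching rows can fail on the executor with
// java.lang.OutOfMemoryError: Requested array size exceeds VM limit.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/testdb") // placeholder URL
  .option("dbtable", "table_with_json_column")         // placeholder table
  .option("user", "root")                              // placeholder credentials
  .option("password", "secret")
  .load()

df.show() // may throw the OOM above when materializing the CHAR(Int.Max) column
{code}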
With {{mysql-connector-java-5.1.49.jar}}, the JSON sqlType is {{CHAR}} and the precision
is {{Int.Max}}.
With {{mysql-connector-java-8.0.16.jar}}, the JSON sqlType is {{LONGVARCHAR}} and the
precision is {{Int.Max}}.
Spark already handles {{mysql-connector-java-8.0.16.jar}} correctly:
{code:scala}
// Excerpt from JdbcUtils.getCatalystType: both long-varchar variants map to StringType.
private def getCatalystType(
    sqlType: Int,
    typeName: String,
    precision: Int,
    scale: Int,
    signed: Boolean,
    isTimestampNTZ: Boolean): DataType = sqlType match {
  ...
  case java.sql.Types.LONGNVARCHAR => StringType
  case java.sql.Types.LONGVARCHAR => StringType
  ...
}
{code}
If compatibility with 5.1.49 is not required, the current code is sufficient.
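If 5.1.49 must be supported, the JSON type name could be special-cased in the MySQL dialect. Below is a minimal user-side sketch (a registered custom dialect, not the actual Spark patch; the {{typeName}} check against {{"JSON"}} is an assumption about what the drivers report):
{code:scala}
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.{DataType, MetadataBuilder, StringType}

// Sketch only: map MySQL's JSON type to StringType by its reported type name,
// working around the CHAR(Int.Max) that mysql-connector-java-5.1.49.jar reports.
object MySQLJsonDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:mysql")

  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] = {
    // 5.1.49 reports JSON as CHAR with Int.Max precision; 8.0.16 reports it as
    // LONGVARCHAR. Mapping both to StringType avoids allocating a huge CHAR buffer.
    if (typeName.equalsIgnoreCase("JSON")) Some(StringType) else None
  }
}

// Register before reading, so this dialect takes precedence for MySQL URLs.
JdbcDialects.registerDialect(MySQLJsonDialect)
{code}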
> Special handling of JSON type for MySQL connector
> -------------------------------------------------
>
> Key: SPARK-47707
> URL: https://issues.apache.org/jira/browse/SPARK-47707
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Environment: mysql-connector-java-5.1.49.jar
> spark-3.5.0
> Reporter: 王俊博
> Priority: Minor