Github user sureshthalamati commented on a diff in the pull request:
https://github.com/apache/spark/pull/9162#discussion_r42560386
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -253,17 +253,33 @@ case object MySQLDialect extends JdbcDialect {
/**
* :: DeveloperApi ::
- * Default DB2 dialect, mapping string/boolean on write to valid DB2 types.
- * By default string, and boolean gets mapped to db2 invalid types TEXT, and BIT(1).
+ * Default DB2 dialect, mapping string/boolean/short/byte/decimal on write and
+ * real/decfloat/xml on read to valid DB2 types. By default string, and boolean
+ * gets mapped to db2 invalid types TEXT, and BIT(1).
*/
@DeveloperApi
case object DB2Dialect extends JdbcDialect {
override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")
+  override def getCatalystType(
+      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
+    if (sqlType == Types.REAL) {
+      Option(FloatType)
+    } else if (sqlType == Types.OTHER && typeName.equals("DECFLOAT")) {
+      Option(DecimalType(38, 18))
+    } else if (sqlType == Types.OTHER && typeName.equals("XML")) {
+      Option(StringType)
+    } else None
+  }
+
+
override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
case StringType => Some(JdbcType("CLOB", java.sql.Types.CLOB))
case BooleanType => Some(JdbcType("CHAR(1)", java.sql.Types.CHAR))
+    case ShortType | ByteType => Some(JdbcType("SMALLINT", java.sql.Types.SMALLINT))
+    // DB2 maximum precision is 31. If the precision is greater than 31, map to DB2 max.
+    case (t: DecimalType) if (t.precision > 31) =>
+      Some(JdbcType("DECIMAL(31,2)", java.sql.Types.DECIMAL))
--- End diff ---
Using this mapping, DB2 will throw an error during insert execution if the precision of the value being written is greater than 31. And if the value's scale is higher than 2, the value will be rounded to a scale of 2.
The other alternative was to let it fail with an error when the create-table statement is executed (the current behavior). I was not sure how commonly the decimal type system default (38, 18) is used to create the data frame. I thought it was better to map to DB2's maximum precision instead of failing with an error, considering there is no way to write decimals of precision > 31 to DB2.
If you think it is better to fail on create, instead of surprises during execution, I will update the patch.
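For illustration, here is a minimal sketch of one variant of that clamping rule: instead of hard-coding DECIMAL(31,2), it keeps the integer digits intact and shrinks the scale to fit DB2's 31-digit limit. The object and method names (`Db2DecimalClamp`, `clamp`) are hypothetical and not part of this patch:

```scala
object Db2DecimalClamp {
  // DB2's maximum DECIMAL precision.
  val Db2MaxPrecision = 31

  // Clamp a (precision, scale) pair to DB2's limit, preserving the
  // integer-digit count and reducing the scale instead of fixing it at 2.
  def clamp(precision: Int, scale: Int): (Int, Int) = {
    if (precision <= Db2MaxPrecision) {
      (precision, scale)
    } else {
      val integerDigits = precision - scale
      val newScale = math.max(0, Db2MaxPrecision - integerDigits)
      (Db2MaxPrecision, newScale)
    }
  }
}
```

With this rule, Spark's system default (38, 18) would map to DECIMAL(31, 11) rather than DECIMAL(31, 2), losing less scale; values whose integer part needs more than 31 digits still cannot fit and would fail at insert time either way.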
I am also working on a pull request (SPARK-10849) to allow users to specify the target database column type. That will let users choose the decimal precision and scale in these kinds of scenarios.