Github user sureshthalamati commented on a diff in the pull request:
https://github.com/apache/spark/pull/9162#discussion_r42560386
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -253,17 +253,33 @@ case object MySQLDialect extends JdbcDialect {
/**
* :: DeveloperApi ::
- * Default DB2 dialect, mapping string/boolean on write to valid DB2 types.
- * By default string, and boolean gets mapped to db2 invalid types TEXT, and BIT(1).
+ * Default DB2 dialect, mapping string/boolean/short/byte/decimal on write and
+ * real/decfloat/xml on read to valid DB2 types. By default string, and boolean
+ * gets mapped to db2 invalid types TEXT, and BIT(1).
*/
@DeveloperApi
case object DB2Dialect extends JdbcDialect {
override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")
+  override def getCatalystType(
+      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
+    if (sqlType == Types.REAL) {
+      Option(FloatType)
+    } else if (sqlType == Types.OTHER && typeName.equals("DECFLOAT")) {
+      Option(DecimalType(38, 18))
+    } else if (sqlType == Types.OTHER && typeName.equals("XML")) {
+      Option(StringType)
+    } else None
+  }
+
+
override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
case StringType => Some(JdbcType("CLOB", java.sql.Types.CLOB))
case BooleanType => Some(JdbcType("CHAR(1)", java.sql.Types.CHAR))
+    case ShortType | ByteType => Some(JdbcType("SMALLINT", java.sql.Types.SMALLINT))
+    // DB2 maximum precision is 31. If the precision is greater than 31, map to DB2 max.
+    case (t: DecimalType) if (t.precision > 31) =>
+      Some(JdbcType("DECIMAL(31,2)", java.sql.Types.DECIMAL))
--- End diff ---
Using this mapping, DB2 will throw an error during insert execution if the precision of the value being written is greater than 31. And if the value's scale is higher than 2, the value will be rounded to a scale of 2.
The other alternative was to let it fail with an error when the create-table statement is executed (the current behavior). I was not sure how commonly the decimal type system default (38, 18) is used to create the data frame. I thought it was better to map to DB2's maximum precision instead of failing with an error, considering there is no way to write decimals of precision > 31 to DB2.
If you think it is better to fail on create, instead of surprises during execution, I will update the patch.
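For illustration, here is a minimal sketch of one variant of that clamping rule: instead of hard-coding DECIMAL(31,2), it keeps the integer digits intact and shrinks the scale to fit DB2's 31-digit limit. The object and method names (`Db2DecimalClamp`, `clamp`) are hypothetical and not part of this patch:

```scala
object Db2DecimalClamp {
  // DB2's maximum DECIMAL precision.
  val Db2MaxPrecision = 31

  // Clamp a (precision, scale) pair to DB2's limit, preserving the
  // integer-digit count and reducing the scale instead of fixing it at 2.
  def clamp(precision: Int, scale: Int): (Int, Int) = {
    if (precision <= Db2MaxPrecision) {
      (precision, scale)
    } else {
      val integerDigits = precision - scale
      val newScale = math.max(0, Db2MaxPrecision - integerDigits)
      (Db2MaxPrecision, newScale)
    }
  }
}
```

With this rule, Spark's system default (38, 18) would map to DECIMAL(31, 11) rather than DECIMAL(31, 2), losing less scale; values whose integer part needs more than 31 digits still cannot fit and would fail at insert time either way.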
I am also working on a pull request (SPARK-10849) to allow users to specify the target database column type. That will let users choose the decimal precision and scale in these kinds of scenarios.