[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

bdolbeare Mon, 05 Oct 2015 10:02:18 -0700

Github user bdolbeare commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-145598968
  
    The problem with Oracle is that you can define numbers without providing 
precision or scale:
    
        column_name NUMBER  (this is the only case that doesn't work very well 
for Oracle support)
                has a precision of 0 and scale of -127 in JDBC ResultSetMetaData
                "If a precision is not specified, the column stores values as 
given."  
http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i16209       
         
                
        column_name NUMBER(10) 
                has a precision of 10 and scale of 0 in JDBC ResultSetMetaData
                
        column_name NUMBER(10,2) 
                has a precision of 10 and scale of 2 in JDBC ResultSetMetaData  
    
    I think the best solution is to handle this in a OracleDialect since this 
is a quirk of Oracle.  I've done that for my own code but it would be nice to 
have two changes in Spark:
    
    1)  Access to more of the metadata fields (e.g. scale) in the 
dialect.getCatalystType call (currently the precision is provided but the scale 
is not)
    2)  Change line 406 in JDBCRDD to support creating a Decimal without a 
predefined precision/scale.  It seems that this would work in cases where there 
is a consistent
    precision/scale for a field and also this Oracle nuance where the 
precision/scale differ per row.
    
    
    For now, this is what I've done in my own OracleDialect:
    
        object OracleDialect extends JdbcDialect {
          override def getCatalystType(sqlType: Int, typeName: String, size: 
Int, md: MetadataBuilder): Option[DataType] = {
                // Handle NUMBER fields that have no precision/scale in special 
way because JDBC ResultSetMetaData converts this to 0 procision and -127 scale
                if (sqlType == Types.NUMERIC && size == 0) {
                  // This is sub-optimal as we have to pick a precision/scale 
in advance whereas the data in Oracle is allowed 
                  //  to have different precision/scale for each value.  This 
conversion works in our domain for now though we 
                  //  need a more durable solution.  Look into changing JDBCRDD 
(line 406):
                  //    FROM:  mutableRow.update(i, Decimal(decimalVal, p, s))
                  //    TO:  mutableRow.update(i, Decimal(decimalVal))
                  Some(DecimalType(DecimalType.MAX_PRECISION, 10))
                } // Handle Timestamp with timezone (for now we are just 
converting this to a string with default format)
                else if (sqlType == -101) {
                  Some(StringType)
                } else None
          }
        }




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Reply via email to