Github user bdolbeare commented on the pull request:
https://github.com/apache/spark/pull/8780#issuecomment-145598968
The problem with Oracle is that you can define numbers without providing
precision or scale:
column_name NUMBER (this is the only case that doesn't work very well
for Oracle support)
has a precision of 0 and scale of -127 in JDBC ResultSetMetaData
"If a precision is not specified, the column stores values as
given."
http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i16209
column_name NUMBER(10)
has a precision of 10 and scale of 0 in JDBC ResultSetMetaData
column_name NUMBER(10,2)
has a precision of 10 and scale of 2 in JDBC ResultSetMetaData
I think the best solution is to handle this in a OracleDialect since this
is a quirk of Oracle. I've done that for my own code but it would be nice to
have two changes in Spark:
1) Access to more of the metadata fields (e.g. scale) in the
dialect.getCatalystType call (currently the precision is provided but the scale
is not)
2) Change line 406 in JDBCRDD to support creating a Decimal without a
predefined precision/scale. It seems that this would work in cases where there
is a consistent
precision/scale for a field and also this Oracle nuance where the
precision/scale differ per row.
For now, this is what I've done in my own OracleDialect:
object OracleDialect extends JdbcDialect {
override def getCatalystType(sqlType: Int, typeName: String, size:
Int, md: MetadataBuilder): Option[DataType] = {
// Handle NUMBER fields that have no precision/scale in special
way because JDBC ResultSetMetaData converts this to 0 procision and -127 scale
if (sqlType == Types.NUMERIC && size == 0) {
// This is sub-optimal as we have to pick a precision/scale
in advance whereas the data in Oracle is allowed
// to have different precision/scale for each value. This
conversion works in our domain for now though we
// need a more durable solution. Look into changing JDBCRDD
(line 406):
// FROM: mutableRow.update(i, Decimal(decimalVal, p, s))
// TO: mutableRow.update(i, Decimal(decimalVal))
Some(DecimalType(DecimalType.MAX_PRECISION, 10))
} // Handle Timestamp with timezone (for now we are just
converting this to a string with default format)
else if (sqlType == -101) {
Some(StringType)
} else None
}
}
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]