[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19548
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user taroplus commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146160084

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -815,6 +815,12 @@ class JDBCSuite extends SparkFunSuite
       Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
     assert(oracleDialect.getCatalystType(java.sql.Types.NUMERIC, "numeric", 0, null) ==
       Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
+    assert(oracleDialect.getCatalystType(100, "BINARY_FLOAT", 0, null) ==
--- End diff --

Made them constants defined in OracleDialect.
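For context, a minimal sketch of what naming those codes on the dialect could look like. The constant names are assumptions for illustration; the values are Oracle's proprietary JDBC type codes, matching `oracle.jdbc.OracleTypes`, and defining them locally avoids a compile-time dependency on the Oracle driver jar:

```scala
package org.apache.spark.sql.jdbc

// Sketch only: constants for Oracle's proprietary JDBC type codes, kept on
// the dialect itself so no import of oracle.jdbc.OracleTypes is required.
private case object OracleDialectSketch extends JdbcDialect {
  private[jdbc] val TIMESTAMPTZ = -101   // TIMESTAMP WITH TIME ZONE
  private[jdbc] val BINARY_FLOAT = 100   // 32-bit IEEE 754 single precision
  private[jdbc] val BINARY_DOUBLE = 101  // 64-bit IEEE 754 double precision

  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")
}
```

The match arms in getCatalystType and the assertions in JDBCSuite can then reference e.g. `BINARY_FLOAT` instead of a bare `100`.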
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user taroplus commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146160039

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -108,6 +112,10 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
     conn.prepareStatement(
       "INSERT INTO numerics VALUES (4, 1.23, 99)").executeUpdate();
     conn.commit();
+
+
+    conn.prepareStatement("CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate();
--- End diff --

Removed trailing semicolons throughout this file.
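For reference, the cleaned-up setup in semicolon-free Scala might look like the following; that it lives in the suite's `dataPreparation(conn: Connection)` hook is an assumption about the file's structure:

```scala
// Create and populate the fixture table used by the new round-trip test.
conn.prepareStatement(
  "CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate()
conn.commit()
```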
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user taroplus commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146160051

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -307,4 +315,32 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
     assert(values.getInt(1).equals(1))
     assert(values.getBoolean(2).equals(false))
   }
+
+  test("SPARK-22303: handle BINARY_DOUBLE and BINARY_FLOAT as DoubleType and FloatType") {
+    val tableName = "oracle_types"
+    val schema = StructType(Seq(
+      StructField("d", DoubleType, true),
+      StructField("f", FloatType, true)))
+    val props = new Properties()
+
+    // write it back to the table (append mode)
+    val data = spark.sparkContext.parallelize(Seq(Row(1.1D, 2.2F)))
--- End diff --

Updated.
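The quoted hunk cuts off mid-test. A plausible completion of the round-trip check, assuming the suite's `jdbcUrl` field and imports of `java.util.Properties`, `org.apache.spark.sql.{Row, SaveMode}`, and `org.apache.spark.sql.types._` (a sketch, not necessarily the exact code merged):

```scala
test("SPARK-22303: handle BINARY_DOUBLE and BINARY_FLOAT as DoubleType and FloatType") {
  val tableName = "oracle_types"
  val props = new Properties()

  // Reading should map BINARY_DOUBLE -> DoubleType and BINARY_FLOAT -> FloatType
  val dfRead = spark.read.jdbc(jdbcUrl, tableName, props)
  val types = dfRead.schema.fields.map(_.dataType)
  assert(types(0) == DoubleType)
  assert(types(1) == FloatType)

  // Appending a row exercises the write-side mapping back to the Oracle types
  // (literals written as "1.1" and "2.2f" per the review nit below)
  val schema = StructType(Seq(
    StructField("d", DoubleType, nullable = true),
    StructField("f", FloatType, nullable = true)))
  val data = spark.sparkContext.parallelize(Seq(Row(1.1, 2.2f)))
  spark.createDataFrame(data, schema)
    .write.mode(SaveMode.Append).jdbc(jdbcUrl, tableName, props)
}
```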
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user taroplus commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146160013

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -50,9 +52,11 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
   import testImplicits._
+  // To make === between double tolerate inexact values
+  implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)
--- End diff --

Removed this line and the test still passes.
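That the tolerance turned out to be unnecessary is plausible: BINARY_DOUBLE and BINARY_FLOAT are binary IEEE 754 columns, so a double or float value round-trips bit-for-bit rather than through a decimal conversion. For reference, a minimal sketch of what the removed implicit does, assuming Scalactic/ScalaTest 3.x:

```scala
import org.scalactic.TripleEquals._
import org.scalactic.{Equality, TolerantNumerics}

// With this implicit in scope, === on Doubles tolerates differences up to 0.01.
implicit val doubleEquality: Equality[Double] =
  TolerantNumerics.tolerantDoubleEquality(0.01)

assert(1.105 === 1.1)   // passes: |1.105 - 1.1| <= 0.01
assert(!(1.2 === 1.1))  // 0.1 apart, outside the tolerance
```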
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user taroplus commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146159980

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala ---
@@ -28,25 +28,28 @@ private case object OracleDialect extends JdbcDialect {
   override def getCatalystType(
       sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
-    if (sqlType == Types.NUMERIC) {
-      val scale = if (null != md) md.build().getLong("scale") else 0L
-      size match {
-        // Handle NUMBER fields that have no precision/scale in special way
-        // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
-        // For more details, please see
-        // https://github.com/apache/spark/pull/8780#issuecomment-145598968
-        // and
-        // https://github.com/apache/spark/pull/8780#issuecomment-144541760
-        case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-        // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
-        // this to NUMERIC with -127 scale
-        // Not sure if there is a more robust way to identify the field as a float (or other
-        // numeric types that do not specify a scale.
-        case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-        case _ => None
-      }
-    } else {
-      None
+    sqlType match {
+      case Types.NUMERIC =>
+        val scale = if (null != md) md.build().getLong("scale") else 0L
+        size match {
+          // Handle NUMBER fields that have no precision/scale in special way
+          // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
+          // For more details, please see
+          // https://github.com/apache/spark/pull/8780#issuecomment-145598968
+          // and
+          // https://github.com/apache/spark/pull/8780#issuecomment-144541760
+          case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+          // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
+          // this to NUMERIC with -127 scale
+          // Not sure if there is a more robust way to identify the field as a float (or other
+          // numeric types that do not specify a scale.
+          case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+          case _ => None
+        }
+      case -101 => Some(TimestampType) // Value for Timestamp with Time Zone in Oracle
+      case 100 => Some(FloatType) // Value for OracleTypes.BINARY_FLOAT
+      case 101 => Some(DoubleType) // Value for OracleTypes.BINARY_DOUBLE
--- End diff --

These should match Java's double / float definitions.
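On the value-range question raised elsewhere in this thread: Oracle documents BINARY_FLOAT as a 32-bit and BINARY_DOUBLE as a 64-bit IEEE 754 floating-point type, the same representations as Java's float and double, so they line up with Spark's FloatType and DoubleType. A hedged sketch of the observable effect (`oracleUrl` is a placeholder, not a value from the PR):

```scala
import java.util.Properties

// With the dialect change, a table holding Oracle's binary float columns
// loads with Spark's native floating-point Catalyst types.
val df = spark.read.jdbc(oracleUrl, "oracle_types", new Properties())
df.printSchema()
// root
//  |-- d: double (nullable = true)
//  |-- f: float (nullable = true)
```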
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146147276

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -108,6 +112,10 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
     conn.prepareStatement(
       "INSERT INTO numerics VALUES (4, 1.23, 99)").executeUpdate();
     conn.commit();
+
+
+    conn.prepareStatement("CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate();
--- End diff --

No semicolons at the ends of lines.
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146147286

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -50,9 +52,11 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
   import testImplicits._
+  // To make === between double tolerate inexact values
+  implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)
--- End diff --

Curious why the values would be that inexact somewhere in these tests?
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146147293

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -815,6 +815,12 @@ class JDBCSuite extends SparkFunSuite
       Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
     assert(oracleDialect.getCatalystType(java.sql.Types.NUMERIC, "numeric", 0, null) ==
       Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
+    assert(oracleDialect.getCatalystType(100, "BINARY_FLOAT", 0, null) ==
--- End diff --

Let's define constants somewhere suitable for 100/101/-101.
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146147267

--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -307,4 +315,32 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
     assert(values.getInt(1).equals(1))
     assert(values.getBoolean(2).equals(false))
   }
+
+  test("SPARK-22303: handle BINARY_DOUBLE and BINARY_FLOAT as DoubleType and FloatType") {
+    val tableName = "oracle_types"
+    val schema = StructType(Seq(
+      StructField("d", DoubleType, true),
+      StructField("f", FloatType, true)))
+    val props = new Properties()
+
+    // write it back to the table (append mode)
+    val data = spark.sparkContext.parallelize(Seq(Row(1.1D, 2.2F)))
--- End diff --

Nit: I think it's more conventional to write "1.1" and "2.2f".
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19548#discussion_r146120958

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala ---
@@ -28,25 +28,28 @@ private case object OracleDialect extends JdbcDialect {
   override def getCatalystType(
       sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
-    if (sqlType == Types.NUMERIC) {
-      val scale = if (null != md) md.build().getLong("scale") else 0L
-      size match {
-        // Handle NUMBER fields that have no precision/scale in special way
-        // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
-        // For more details, please see
-        // https://github.com/apache/spark/pull/8780#issuecomment-145598968
-        // and
-        // https://github.com/apache/spark/pull/8780#issuecomment-144541760
-        case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-        // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
-        // this to NUMERIC with -127 scale
-        // Not sure if there is a more robust way to identify the field as a float (or other
-        // numeric types that do not specify a scale.
-        case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-        case _ => None
-      }
-    } else {
-      None
+    sqlType match {
+      case Types.NUMERIC =>
+        val scale = if (null != md) md.build().getLong("scale") else 0L
+        size match {
+          // Handle NUMBER fields that have no precision/scale in special way
+          // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
+          // For more details, please see
+          // https://github.com/apache/spark/pull/8780#issuecomment-145598968
+          // and
+          // https://github.com/apache/spark/pull/8780#issuecomment-144541760
+          case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+          // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
+          // this to NUMERIC with -127 scale
+          // Not sure if there is a more robust way to identify the field as a float (or other
+          // numeric types that do not specify a scale.
+          case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+          case _ => None
+        }
+      case -101 => Some(TimestampType) // Value for Timestamp with Time Zone in Oracle
+      case 100 => Some(FloatType) // Value for OracleTypes.BINARY_FLOAT
+      case 101 => Some(DoubleType) // Value for OracleTypes.BINARY_DOUBLE
--- End diff --

Question: Is the value range of `OracleTypes.BINARY_DOUBLE` identical to our Spark Double type? Likewise for `OracleTypes.BINARY_FLOAT` and the Spark Float type?
[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

GitHub user taroplus opened a pull request:

https://github.com/apache/spark/pull/19548

[SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

TIMESTAMP (-101), BINARY_DOUBLE (101) and BINARY_FLOAT (100) are handled in OracleDialect.

## What changes were proposed in this pull request?

When an Oracle table contains columns of type BINARY_FLOAT or BINARY_DOUBLE, Spark SQL fails to load the table with a SQLException:

```
java.sql.SQLException: Unsupported type 101
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:235)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:291)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
```

## How was this patch tested?

I updated a UT which covers type conversion for these codes (-101, 100, 101). On top of that, I tested this change against an actual table with those columns and was able to read from and write to the table.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/taroplus/spark oracle_sql_types_101

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19548.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19548

commit 51c616c6e501ad539f4e18cf604250a66edf1a2e
Author: Kohki Nishio
Date: 2017-10-22T02:55:28Z

    Handle Oracle specific jdbc types in OracleDialect

    TIMESTAMP (-101), BINARY_DOUBLE (101) and BINARY_FLOAT (100) are handled in OracleDialect
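For illustration, a minimal way to trigger the failure described above before this change; the connection options below are placeholders, not values from the PR:

```scala
// Sketch: schema resolution for a table with a BINARY_DOUBLE column used to
// throw "java.sql.SQLException: Unsupported type 101", because OracleDialect
// returned None for type code 101 and JdbcUtils had no fallback mapping.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/XE")  // placeholder URL
  .option("dbtable", "oracle_types")                    // placeholder table
  .option("user", "scott")                              // placeholder user
  .option("password", "tiger")                          // placeholder password
  .load()
```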