[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19548


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread taroplus
Github user taroplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146160084
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -815,6 +815,12 @@ class JDBCSuite extends SparkFunSuite
   Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
 assert(oracleDialect.getCatalystType(java.sql.Types.NUMERIC, "numeric", 0, null) ==
   Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
+assert(oracleDialect.getCatalystType(100, "BINARY_FLOAT", 0, null) ==
--- End diff --

Made them constants defined in OracleDialect
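
For readers following the thread, a minimal sketch of what such constants could look like inside `OracleDialect` (the names and visibility below are assumptions for illustration, not necessarily the committed code):

```scala
private case object OracleDialect extends JdbcDialect {
  // Oracle-specific JDBC type codes (see oracle.jdbc.OracleTypes). Declared
  // locally so the Oracle driver is not required on the compile classpath.
  private[jdbc] val BINARY_FLOAT = 100
  private[jdbc] val BINARY_DOUBLE = 101
  private[jdbc] val TIMESTAMPTZ = -101
  // ... getCatalystType then matches on these instead of raw literals
}
```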


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread taroplus
Github user taroplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146160039
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -108,6 +112,10 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
 conn.prepareStatement(
   "INSERT INTO numerics VALUES (4, 1.23, 99)").executeUpdate();
 conn.commit();
+
+
+conn.prepareStatement("CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate();
--- End diff --

removed trailing semi-colons throughout this file


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread taroplus
Github user taroplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146160051
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -307,4 +315,32 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
 assert(values.getInt(1).equals(1))
 assert(values.getBoolean(2).equals(false))
   }
+
+  test("SPARK-22303: handle BINARY_DOUBLE and BINARY_FLOAT as DoubleType and FloatType") {
+val tableName = "oracle_types"
+val schema = StructType(Seq(
+  StructField("d", DoubleType, true),
+  StructField("f", FloatType, true)))
+val props = new Properties()
+
+// write it back to the table (append mode)
+val data = spark.sparkContext.parallelize(Seq(Row(1.1D, 2.2F)))
--- End diff --

updated
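
For context, a rough sketch of how the remainder of this round-trip test could proceed, assuming the `oracle_types` table created earlier and the `schema`/`props` defined above; `jdbcUrl` comes from `DockerJDBCIntegrationSuite`. This is an illustration, not the exact committed test:

```scala
// Write a single row in append mode, then read it back and verify that
// BINARY_DOUBLE and BINARY_FLOAT round-trip as DoubleType and FloatType.
val data = spark.sparkContext.parallelize(Seq(Row(1.1, 2.2f)))
spark.createDataFrame(data, schema).write.mode("append").jdbc(jdbcUrl, tableName, props)

val row = spark.read.jdbc(jdbcUrl, tableName, props).head()
assert(row.getDouble(0) == 1.1)   // BINARY_DOUBLE -> DoubleType
assert(row.getFloat(1) == 2.2f)   // BINARY_FLOAT  -> FloatType
```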


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread taroplus
Github user taroplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146160013
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -50,9 +52,11 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
   import testImplicits._
+  // To make === between double tolerate inexact values
+  implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)
--- End diff --

removed this line and the test still passes
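
For reference, a small sketch of what the removed implicit provided (it comes from Scalactic, which ships with ScalaTest); with it in scope, `===` on doubles tolerates small differences:

```scala
import org.scalactic.TolerantNumerics

// With this implicit in scope, === on Double accepts values within 0.01.
implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)

assert(1.1 === 1.105)   // passes only because of the tolerant equality
```

Because BINARY_DOUBLE and BINARY_FLOAT store exact IEEE 754 values, the round-trip comparison presumably works with plain equality, which would be why the test still passes without this line.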


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread taroplus
Github user taroplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146159980
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala ---
@@ -28,25 +28,28 @@ private case object OracleDialect extends JdbcDialect {

   override def getCatalystType(
   sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
-if (sqlType == Types.NUMERIC) {
-  val scale = if (null != md) md.build().getLong("scale") else 0L
-  size match {
-// Handle NUMBER fields that have no precision/scale in special way
-// because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
-// For more details, please see
-// https://github.com/apache/spark/pull/8780#issuecomment-145598968
-// and
-// https://github.com/apache/spark/pull/8780#issuecomment-144541760
-case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-// Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
-// this to NUMERIC with -127 scale
-// Not sure if there is a more robust way to identify the field as a float (or other
-// numeric types that do not specify a scale.
-case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-case _ => None
-  }
-} else {
-  None
+sqlType match {
+  case Types.NUMERIC =>
+val scale = if (null != md) md.build().getLong("scale") else 0L
+size match {
+  // Handle NUMBER fields that have no precision/scale in special way
+  // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
+  // For more details, please see
+  // https://github.com/apache/spark/pull/8780#issuecomment-145598968
+  // and
+  // https://github.com/apache/spark/pull/8780#issuecomment-144541760
+  case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+  // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
+  // this to NUMERIC with -127 scale
+  // Not sure if there is a more robust way to identify the field as a float (or other
+  // numeric types that do not specify a scale.
+  case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+  case _ => None
+}
+  case -101 => Some(TimestampType) // Value for Timestamp with Time Zone in Oracle
+  case 100 => Some(FloatType) // Value for OracleTypes.BINARY_FLOAT
+  case 101 => Some(DoubleType) // Value for OracleTypes.BINARY_DOUBLE
--- End diff --

This should match Java's double / float definition
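
In other words, Oracle's BINARY_FLOAT and BINARY_DOUBLE are 32-bit and 64-bit IEEE 754 types, the same representations as Java's `float` and `double`, so mapping them to `FloatType` and `DoubleType` is lossless. A quick way to exercise the mapping, without needing an Oracle driver on the classpath (a sketch; the URL is a placeholder):

```scala
import org.apache.spark.sql.jdbc.JdbcDialects
import org.apache.spark.sql.types.{DoubleType, FloatType}

// 100 and 101 are the Oracle driver's codes for BINARY_FLOAT / BINARY_DOUBLE.
val dialect = JdbcDialects.get("jdbc:oracle:thin:@//host:1521/service")
assert(dialect.getCatalystType(100, "BINARY_FLOAT", 0, null) == Some(FloatType))
assert(dialect.getCatalystType(101, "BINARY_DOUBLE", 0, null) == Some(DoubleType))
```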


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146147276
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -108,6 +112,10 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
 conn.prepareStatement(
   "INSERT INTO numerics VALUES (4, 1.23, 99)").executeUpdate();
 conn.commit();
+
+
+conn.prepareStatement("CREATE TABLE oracle_types (d BINARY_DOUBLE, f BINARY_FLOAT)").executeUpdate();
--- End diff --

No semicolon at the end of lines


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146147286
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -50,9 +52,11 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
   import testImplicits._
+  // To make === between double tolerate inexact values
+  implicit val doubleEquality = TolerantNumerics.tolerantDoubleEquality(0.01)
--- End diff --

Curious why the values would be that far from equal somewhere in these tests?


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146147293
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -815,6 +815,12 @@ class JDBCSuite extends SparkFunSuite
   Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
 assert(oracleDialect.getCatalystType(java.sql.Types.NUMERIC, "numeric", 0, null) ==
   Some(DecimalType(DecimalType.MAX_PRECISION, 10)))
+assert(oracleDialect.getCatalystType(100, "BINARY_FLOAT", 0, null) ==
--- End diff --

Let's define constants somewhere suitable for 100/101/-101


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-22 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146147267
  
--- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala ---
@@ -307,4 +315,32 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo
 assert(values.getInt(1).equals(1))
 assert(values.getBoolean(2).equals(false))
   }
+
+  test("SPARK-22303: handle BINARY_DOUBLE and BINARY_FLOAT as DoubleType and FloatType") {
+val tableName = "oracle_types"
+val schema = StructType(Seq(
+  StructField("d", DoubleType, true),
+  StructField("f", FloatType, true)))
+val props = new Properties()
+
+// write it back to the table (append mode)
+val data = spark.sparkContext.parallelize(Seq(Row(1.1D, 2.2F)))
--- End diff --

Nit: I think it's more conventional to write "1.1" and "2.2f"
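
i.e., a small illustration of the conventional literal forms:

```scala
// Double literals need no suffix; Float literals use a lowercase f.
val d = 1.1    // Double
val f = 2.2f   // Float
```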


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19548#discussion_r146120958
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala ---
@@ -28,25 +28,28 @@ private case object OracleDialect extends JdbcDialect {

   override def getCatalystType(
   sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
-if (sqlType == Types.NUMERIC) {
-  val scale = if (null != md) md.build().getLong("scale") else 0L
-  size match {
-// Handle NUMBER fields that have no precision/scale in special way
-// because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
-// For more details, please see
-// https://github.com/apache/spark/pull/8780#issuecomment-145598968
-// and
-// https://github.com/apache/spark/pull/8780#issuecomment-144541760
-case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-// Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
-// this to NUMERIC with -127 scale
-// Not sure if there is a more robust way to identify the field as a float (or other
-// numeric types that do not specify a scale.
-case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
-case _ => None
-  }
-} else {
-  None
+sqlType match {
+  case Types.NUMERIC =>
+val scale = if (null != md) md.build().getLong("scale") else 0L
+size match {
+  // Handle NUMBER fields that have no precision/scale in special way
+  // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
+  // For more details, please see
+  // https://github.com/apache/spark/pull/8780#issuecomment-145598968
+  // and
+  // https://github.com/apache/spark/pull/8780#issuecomment-144541760
+  case 0 => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+  // Handle FLOAT fields in a special way because JDBC ResultSetMetaData converts
+  // this to NUMERIC with -127 scale
+  // Not sure if there is a more robust way to identify the field as a float (or other
+  // numeric types that do not specify a scale.
+  case _ if scale == -127L => Option(DecimalType(DecimalType.MAX_PRECISION, 10))
+  case _ => None
+}
+  case -101 => Some(TimestampType) // Value for Timestamp with Time Zone in Oracle
+  case 100 => Some(FloatType) // Value for OracleTypes.BINARY_FLOAT
+  case 101 => Some(DoubleType) // Value for OracleTypes.BINARY_DOUBLE
--- End diff --

Question: Is the value range of `OracleTypes.BINARY_DOUBLE` identical to our Spark Double type?

Also, does `OracleTypes.BINARY_FLOAT` match the Spark Float type?


---




[GitHub] spark pull request #19548: [SPARK-22303][SQL] Handle Oracle specific jdbc ty...

2017-10-21 Thread taroplus
GitHub user taroplus opened a pull request:

https://github.com/apache/spark/pull/19548

[SPARK-22303][SQL] Handle Oracle specific jdbc types in OracleDialect

TIMESTAMP (-101), BINARY_DOUBLE (101) and BINARY_FLOAT (100) are handled in OracleDialect

## What changes were proposed in this pull request?

When an Oracle table contains columns of type BINARY_FLOAT or BINARY_DOUBLE, Spark SQL fails to load the table with a SQLException:

```
java.sql.SQLException: Unsupported type 101
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:235)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
 at scala.Option.getOrElse(Option.scala:121)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:291)
 at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
 at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
```
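
A minimal way to reproduce the failure before this change (a sketch; the connection details and table name are placeholders for an Oracle table that has BINARY_FLOAT / BINARY_DOUBLE columns):

```scala
// Schema resolution throws "Unsupported type 100" / "Unsupported type 101"
// because OracleDialect did not map these Oracle-specific JDBC type codes.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//host:1521/service")
  .option("dbtable", "ORACLE_TYPES")
  .option("user", "scott")
  .option("password", "tiger")
  .load()
```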

## How was this patch tested?

I updated a unit test to cover type conversion for the types (-101, 100, 101). On top of that, I tested this change against an actual table with those columns and was able to read from and write to the table.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/taroplus/spark oracle_sql_types_101

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19548.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19548


commit 51c616c6e501ad539f4e18cf604250a66edf1a2e
Author: Kohki Nishio 
Date:   2017-10-22T02:55:28Z

Handle Oracle specific jdbc types in OracleDialect

TIMESTAMP (-101), BINARY_DOUBLE (101) and BINARY_FLOAT (100) are handled in OracleDialect




---
