[GitHub] spark pull request #20057: [SPARK-22880][SQL] Add cascadeTruncate option to ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20057

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
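For context on the merged change: the PR adds a `cascadeTruncate` JDBC option (per its title). A minimal usage sketch from the writer side — the table name, URL, and DataFrame are placeholders, not taken from the thread:

```scala
// Hypothetical example: ask Spark to truncate the target table (with CASCADE,
// on dialects that support it) instead of dropping it on overwrite.
df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://127.0.0.1/db")  // placeholder URL
  .option("dbtable", "weblogs")                     // placeholder table
  .option("truncate", "true")
  .option("cascadeTruncate", "true")                // option added by this PR
  .mode("overwrite")
  .save()
```

Whether CASCADE is actually emitted is decided per dialect, as the review comments below work out.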
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r203955129

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -102,7 +102,12 @@ object JdbcUtils extends Logging {
     val dialect = JdbcDialects.get(options.url)
     val statement = conn.createStatement
     try {
-      statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      if (options.isCascadeTruncate.isDefined) {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table,
+          options.isCascadeTruncate))
+      } else {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      }
--- End diff --

Sorry, missed this comment. I'll make the change (@gatorsmile, I think this is the comment you meant, right?)
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r203949498

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -105,7 +105,12 @@ object JdbcUtils extends Logging {
     val statement = conn.createStatement
     try {
       statement.setQueryTimeout(options.queryTimeout)
-      statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      if (options.isCascadeTruncate.isDefined) {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table,
+          options.isCascadeTruncate))
+      } else {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      }
--- End diff --

+1 to the above comment of @dongjoon-hyun
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r203948972

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -120,11 +121,26 @@ abstract class JdbcDialect extends Serializable {
    * The SQL query that should be used to truncate a table. Dialects can override this method to
    * return a query that is suitable for a particular database. For PostgreSQL, for instance,
    * a different query is used to prevent "TRUNCATE" affecting other tables.
-   * @param table The name of the table.
+   * @param table The table to truncate
    * @return The SQL query to use for truncating a table
    */
   @Since("2.3.0")
   def getTruncateQuery(table: String): String = {
+    getTruncateQuery(table, isCascadingTruncateTable)
+  }
+
+  /**
+   * The SQL query that should be used to truncate a table. Dialects can override this method to
+   * return a query that is suitable for a particular database. For PostgreSQL, for instance,
+   * a different query is used to prevent "TRUNCATE" affecting other tables.
+   * @param table The table to truncate
+   * @param cascade Whether or not to cascade the truncation
+   * @return The SQL query to use for truncating a table
+   */
+  @Since("2.4.0")
+  def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
--- End diff --

Nit: indent.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r172685440

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala ---
@@ -95,4 +95,21 @@ private case object OracleDialect extends JdbcDialect {
   }

   override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable()
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    cascade match {
+      case Some(true) => s"TRUNCATE TABLE $table CASCADE"
+      case _ => s"TRUNCATE TABLE $table"
+    }
+  }
+
--- End diff --

Let's remove line 114.
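The dialect-level dispatch quoted above can be exercised on its own; this is a self-contained sketch of the same pattern match, outside the `JdbcDialect` hierarchy (the object name is made up for illustration):

```scala
// Standalone sketch of the cascade dispatch used in the Oracle dialect above:
// only an explicit Some(true) produces a CASCADE clause.
object TruncateQuerySketch {
  def getTruncateQuery(table: String, cascade: Option[Boolean]): String =
    cascade match {
      case Some(true) => s"TRUNCATE TABLE $table CASCADE"
      case _          => s"TRUNCATE TABLE $table"
    }
}

// TruncateQuerySketch.getTruncateQuery("weblogs", Some(true))
//   returns "TRUNCATE TABLE weblogs CASCADE"
// TruncateQuerySketch.getTruncateQuery("weblogs", None)
//   returns "TRUNCATE TABLE weblogs"
```

Matching `Some(true)` rather than `_ == true` means both `None` and `Some(false)` fall through to the plain query, which is the conservative default the reviewers ask for.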
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r172685274

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -120,14 +121,30 @@ abstract class JdbcDialect extends Serializable {
    * The SQL query that should be used to truncate a table. Dialects can override this method to
    * return a query that is suitable for a particular database. For PostgreSQL, for instance,
    * a different query is used to prevent "TRUNCATE" affecting other tables.
-   * @param table The name of the table.
+   * @param table The table to truncate
    * @return The SQL query to use for truncating a table
    */
   @Since("2.3.0")
   def getTruncateQuery(table: String): String = {
+    getTruncateQuery(table, isCascadingTruncateTable)
+  }
+
+  /**
+   * The SQL query that should be used to truncate a table. Dialects can override this method to
+   * return a query that is suitable for a particular database. For PostgreSQL, for instance,
+   * a different query is used to prevent "TRUNCATE" affecting other tables.
+   * @param table The table to truncate
+   * @param cascade Whether or not to cascade the truncation
+   * @return The SQL query to use for truncating a table
+   */
+  @Since("2.4.0")
+  def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
     s"TRUNCATE TABLE $table"
   }
+
--- End diff --

Let's remove this redundant empty line addition.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169693135

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -102,7 +102,12 @@ object JdbcUtils extends Logging {
     val dialect = JdbcDialects.get(options.url)
     val statement = conn.createStatement
     try {
-      statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      if (options.isCascadeTruncate.isDefined) {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table,
+          options.isCascadeTruncate))
+      } else {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      }
--- End diff --

Ping.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169692801

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -839,35 +839,68 @@ class JDBCSuite extends SparkFunSuite
   }

   test("table exists query by jdbc dialect") {
-    val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
-    val Postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
+    val mysql = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
+    val postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
     val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db")
     val h2 = JdbcDialects.get(url)
     val derby = JdbcDialects.get("jdbc:derby:db")
     val table = "weblogs"
     val defaultQuery = s"SELECT * FROM $table WHERE 1=0"
     val limitQuery = s"SELECT 1 FROM $table LIMIT 1"
-    assert(MySQL.getTableExistsQuery(table) == limitQuery)
-    assert(Postgres.getTableExistsQuery(table) == limitQuery)
+    assert(mysql.getTableExistsQuery(table) == limitQuery)
+    assert(postgres.getTableExistsQuery(table) == limitQuery)
     assert(db2.getTableExistsQuery(table) == defaultQuery)
     assert(h2.getTableExistsQuery(table) == defaultQuery)
     assert(derby.getTableExistsQuery(table) == defaultQuery)
   }

   test("truncate table query by jdbc dialect") {
-    val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
-    val Postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
+    val mysql = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
+    val postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
     val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db")
     val h2 = JdbcDialects.get(url)
     val derby = JdbcDialects.get("jdbc:derby:db")
+    val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db")
+    val teradata = JdbcDialects.get("jdbc:teradata://127.0.0.1/db")
+
     val table = "weblogs"
     val defaultQuery = s"TRUNCATE TABLE $table"
     val postgresQuery = s"TRUNCATE TABLE ONLY $table"
-    assert(MySQL.getTruncateQuery(table) == defaultQuery)
-    assert(Postgres.getTruncateQuery(table) == postgresQuery)
+    val teradataQuery = s"DELETE FROM $table ALL"
+
+    assert(mysql.getTruncateQuery(table) == defaultQuery)
+    assert(postgres.getTruncateQuery(table) == postgresQuery)
     assert(db2.getTruncateQuery(table) == defaultQuery)
     assert(h2.getTruncateQuery(table) == defaultQuery)
     assert(derby.getTruncateQuery(table) == defaultQuery)
+    assert(oracle.getTruncateQuery(table) == defaultQuery)
+    assert(teradata.getTruncateQuery(table) == teradataQuery)
+  }
+
+  test("SPARK-22880: Truncate table with CASCADE by jdbc dialect") {
+    // cascade in a truncate should only be applied for databases that support this,
+    // even if the parameter is passed.
+    val mysql = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
+    val postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
+    val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db")
+    val h2 = JdbcDialects.get(url)
+    val derby = JdbcDialects.get("jdbc:derby:db")
+    val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db")
+    val teradata = JdbcDialects.get("jdbc:teradata://127.0.0.1/db")
+
+    val table = "weblogs"
+    val defaultQuery = s"TRUNCATE TABLE $table"
+    val postgresQuery = s"TRUNCATE TABLE ONLY $table CASCADE"
+    val oracleQuery = s"TRUNCATE TABLE $table CASCADE"
+    val teradataQuery = s"DELETE FROM $table ALL"
+
+    assert(mysql.getTruncateQuery(table, Some(true)) == defaultQuery)
+    assert(postgres.getTruncateQuery(table, Some(true)) == postgresQuery)
+    assert(db2.getTruncateQuery(table, Some(true)) == defaultQuery)
+    assert(h2.getTruncateQuery(table, Some(true)) == defaultQuery)
+    assert(derby.getTruncateQuery(table, Some(true)) == defaultQuery)
+    assert(oracle.getTruncateQuery(table, Some(true)) == oracleQuery)
+    assert(teradata.getTruncateQuery(table, Some(true)) == teradataQuery)
--- End diff --

Let's group the same ones like the following.

```
Seq(mysql, db2, h2, derby).foreach { dialect =>
  assert(dialect.getTruncateQuery(table, Some(true)) == defaultQuery)
}
```
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169692604

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -839,35 +839,68 @@ class JDBCSuite extends SparkFunSuite
   }

   test("table exists query by jdbc dialect") {
-    val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
-    val Postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
+    val mysql = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
+    val postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
     val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db")
     val h2 = JdbcDialects.get(url)
     val derby = JdbcDialects.get("jdbc:derby:db")
     val table = "weblogs"
     val defaultQuery = s"SELECT * FROM $table WHERE 1=0"
     val limitQuery = s"SELECT 1 FROM $table LIMIT 1"
-    assert(MySQL.getTableExistsQuery(table) == limitQuery)
-    assert(Postgres.getTableExistsQuery(table) == limitQuery)
+    assert(mysql.getTableExistsQuery(table) == limitQuery)
+    assert(postgres.getTableExistsQuery(table) == limitQuery)
     assert(db2.getTableExistsQuery(table) == defaultQuery)
     assert(h2.getTableExistsQuery(table) == defaultQuery)
     assert(derby.getTableExistsQuery(table) == defaultQuery)
   }

   test("truncate table query by jdbc dialect") {
--- End diff --

For this test case, we added `oracle` and `teradata`. It's okay to update this.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169692336

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -805,13 +805,13 @@ class JDBCSuite extends SparkFunSuite
   }

   test("PostgresDialect type mapping") {
--- End diff --

This test case, too.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169692391

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -839,35 +839,68 @@ class JDBCSuite extends SparkFunSuite
   }

   test("table exists query by jdbc dialect") {
--- End diff --

This test case, too.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169692053

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -684,17 +684,17 @@ class JDBCSuite extends SparkFunSuite
   }

   test("quote column names by jdbc dialect") {
--- End diff --

I understand the indentation change is for consistency, but let's keep the unrelated code untouched in this PR. We can do a refactoring PR later if needed.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169689517

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
--- End diff --

Please add a function description stating that Teradata doesn't support cascading DELETE. You may need to explain the lack of TRUNCATE syntax, too.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169688539

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
--- End diff --

Please add more description about Teradata's lack of `TRUNCATE` syntax here. Otherwise, people may mistakenly try to make a PR for this.
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169583476

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

Thanks @klinvill for helping out :-) @dongjoon-hyun I've made the changes to `TeradataDialect` and updated the tests accordingly.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169530627

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

Thank you so much for confirming, @klinvill! @danielvdende, could you update `TeradataDialect` according to @klinvill's advice?
Github user klinvill commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169526767

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

Hi @dongjoon-hyun, I was the original author of the TeradataDialect and @gatorsmile reviewed and committed it. You are correct, Teradata does not support the TRUNCATE statement. Instead, Teradata uses a DELETE statement, so you should be able to use `DELETE FROM $table ALL` instead of `TRUNCATE TABLE $table`.
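Following that advice, the Teradata override would emit a DELETE rather than a TRUNCATE. A simplified sketch of the shape the method takes (standalone, without the real `JdbcDialect` base class; the object name is made up), consistent with the `teradataQuery` expected by the test suite elsewhere in this thread:

```scala
// Sketch: Teradata has no TRUNCATE statement, so emit DELETE ... ALL.
// The cascade flag is kept for signature compatibility but deliberately ignored.
object TeradataTruncateSketch {
  def getTruncateQuery(table: String, cascade: Option[Boolean] = Some(false)): String =
    s"DELETE FROM $table ALL"
}

// TeradataTruncateSketch.getTruncateQuery("weblogs")
//   returns "DELETE FROM weblogs ALL"
```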
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169443807

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

Gentle ping, @klinvill.
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r169025602

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala ---
@@ -85,15 +85,24 @@ private object PostgresDialect extends JdbcDialect {
     s"SELECT 1 FROM $table LIMIT 1"
   }

+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
   /**
-   * The SQL query used to truncate a table. For Postgres, the default behaviour is to
-   * also truncate any descendant tables. As this is a (possibly unwanted) side-effect,
-   * the Postgres dialect adds 'ONLY' to truncate only the table in question
-   * @param table The name of the table.
-   * @return The SQL query to use for truncating a table
-   */
-  override def getTruncateQuery(table: String): String = {
-    s"TRUNCATE TABLE ONLY $table"
+   * The SQL query used to truncate a table. For Postgres, the default behaviour is to
+   * also truncate any descendant tables. As this is a (possibly unwanted) side-effect,
+   * the Postgres dialect adds 'ONLY' to truncate only the table in question
+   * @param table The table to truncate
+   * @param cascade Whether or not to cascade the truncation. Default value is the value of
+   *                isCascadingTruncateTable()
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    cascade match {
+      case Some(true) => s"TRUNCATE TABLE ONLY $table CASCADE"
--- End diff --

Done :-)
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168971182

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala ---
@@ -49,4 +49,17 @@ private object DB2Dialect extends JdbcDialect {
   }

   override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for DB2 as it is unsupported.
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
+  }
--- End diff --

Since this is the same code as in JdbcDialect, it seems we don't need to add it here. This applies not only to DB2Dialect, but also to DerbyDialect/MsSQLDialect/MySQLDialect.
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168971074

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

Currently, `TeradataDialect` seems to generate an unsupported syntax because it inherits it from `JdbcDialects`. Before doing this PR, we had better check this. Ping @klinvill and @gatorsmile, the original author and committer of this code. Could you give us some comments on this?
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168970903

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala ---
@@ -31,4 +31,19 @@ private case object TeradataDialect extends JdbcDialect {
     case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
     case _ => None
   }
+
+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
+  /**
+   * The SQL query used to truncate a table.
+   * @param table The table to truncate.
+   * @param cascade Whether or not to cascade the truncation. Default value is the
+   *                value of isCascadingTruncateTable(). Ignored for Teradata as it is unsupported
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    s"TRUNCATE TABLE $table"
--- End diff --

At the beginning, I saw that `teradata` was missing from your test case, so I asked you to add it. But today I double-checked it in order to review this line. Unfortunately, `TRUNCATE` syntax does not appear in the Teradata documentation, and it's not supported according to some old community articles.
- Teradata SQL Reference: https://info.teradata.com/HTMLPubs/DB_TTU_16_00/index.html#page/SQL_Reference%2FB035-1146-160K%2Fpds1472240516459.html%23
- Community (DELETE/TRUNCATE COMMAND): https://community.teradata.com/t5/Database/Delete-truncate-command/td-p/434
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168969792

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -102,7 +102,12 @@ object JdbcUtils extends Logging {
     val dialect = JdbcDialects.get(options.url)
     val statement = conn.createStatement
     try {
-      statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      if (options.isCascadeTruncate.isDefined) {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table,
+          options.isCascadeTruncate))
+      } else {
+        statement.executeUpdate(dialect.getTruncateQuery(options.table))
+      }
--- End diff --

What about the following?

```scala
val truncateQuery = if (options.isCascadeTruncate.isDefined) {
  dialect.getTruncateQuery(options.table, options.isCascadeTruncate)
} else {
  dialect.getTruncateQuery(options.table)
}
statement.executeUpdate(truncateQuery)
```
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168969101

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala ---
@@ -85,15 +85,24 @@ private object PostgresDialect extends JdbcDialect {
     s"SELECT 1 FROM $table LIMIT 1"
   }

+  override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
+
   /**
-   * The SQL query used to truncate a table. For Postgres, the default behaviour is to
-   * also truncate any descendant tables. As this is a (possibly unwanted) side-effect,
-   * the Postgres dialect adds 'ONLY' to truncate only the table in question
-   * @param table The name of the table.
-   * @return The SQL query to use for truncating a table
-   */
-  override def getTruncateQuery(table: String): String = {
-    s"TRUNCATE TABLE ONLY $table"
+   * The SQL query used to truncate a table. For Postgres, the default behaviour is to
+   * also truncate any descendant tables. As this is a (possibly unwanted) side-effect,
+   * the Postgres dialect adds 'ONLY' to truncate only the table in question
+   * @param table The table to truncate
+   * @param cascade Whether or not to cascade the truncation. Default value is the value of
+   *                isCascadingTruncateTable()
+   * @return The SQL query to use for truncating a table
+   */
+  override def getTruncateQuery(
+      table: String,
+      cascade: Option[Boolean] = isCascadingTruncateTable): String = {
+    cascade match {
+      case Some(true) => s"TRUNCATE TABLE ONLY $table CASCADE"
--- End diff --

Yep. Thank you for testing. Could you add more description about this to the parameter description of `cascade`?

```
 * @param cascade Whether or not to cascade the truncation. Default value is the value of
 *                isCascadingTruncateTable()
```
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168943159 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala --- @@ -85,15 +85,24 @@ private object PostgresDialect extends JdbcDialect { s"SELECT 1 FROM $table LIMIT 1" } + override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + /** - * The SQL query used to truncate a table. For Postgres, the default behaviour is to - * also truncate any descendant tables. As this is a (possibly unwanted) side-effect, - * the Postgres dialect adds 'ONLY' to truncate only the table in question - * @param table The name of the table. - * @return The SQL query to use for truncating a table - */ - override def getTruncateQuery(table: String): String = { -s"TRUNCATE TABLE ONLY $table" + * The SQL query used to truncate a table. For Postgres, the default behaviour is to + * also truncate any descendant tables. As this is a (possibly unwanted) side-effect, + * the Postgres dialect adds 'ONLY' to truncate only the table in question + * @param table The table to truncate + * @param cascade Whether or not to cascade the truncation. Default value is the value of + *isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery( + table: String, + cascade: Option[Boolean] = isCascadingTruncateTable): String = { +cascade match { + case Some(true) => s"TRUNCATE TABLE ONLY $table CASCADE" --- End diff -- Sure, I made a quick example, as you can see using `TRUNCATE TABLE ONLY $table CASCADE` will truncate the foreign key-ed table, but leave the inheritance relationship intact:

```
postgres=# CREATE SCHEMA test;
CREATE SCHEMA
postgres=# CREATE TABLE parent(a INT);
CREATE TABLE
postgres=# ALTER TABLE parent ADD CONSTRAINT some_constraint PRIMARY KEY(a);
ALTER TABLE
postgres=# CREATE TABLE child(b INT) INHERITS (parent);
CREATE TABLE
postgres=# CREATE TABLE forkey(c INT);
CREATE TABLE
postgres=# ALTER TABLE forkey ADD FOREIGN KEY(c) REFERENCES parent(a);
ALTER TABLE
postgres=# INSERT INTO parent VALUES(1);
INSERT 0 1
postgres=# select * from parent;
 a
---
 1
(1 row)

postgres=# select * from child;
 a | b
---+---
(0 rows)

postgres=# INSERT INTO child VALUES(2);
INSERT 0 1
postgres=# select * from child;
 a | b
---+---
 2 |
(1 row)

postgres=# select * from parent;
 a
---
 1
 2
(2 rows)

postgres=# INSERT INTO forkey VALUES(1);
INSERT 0 1
postgres=# select * from forkey;
 c
---
 1
(1 row)

postgres=# TRUNCATE TABLE ONLY parent CASCADE;
NOTICE:  truncate cascades to table "forkey"
TRUNCATE TABLE
postgres=# select * from parent;
 a
---
 2
(1 row)

postgres=# select * from child;
 a | b
---+---
 2 |
(1 row)

postgres=# select * from forkey;
 c
---
(0 rows)
```
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
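For reference, the dialect logic under discussion boils down to the following match. This is a sketch using the names from the diff above, not the final merged code:

```scala
// Sketch of the Postgres truncate-query logic from the diff (assumed names).
object PostgresTruncateSketch {
  def isCascadingTruncateTable(): Option[Boolean] = Some(false)

  def getTruncateQuery(
      table: String,
      cascade: Option[Boolean] = isCascadingTruncateTable()): String = {
    cascade match {
      // 'ONLY' keeps inheritance children out of the truncation; 'CASCADE' still
      // empties tables that reference `table` via foreign keys, as the psql
      // session demonstrates.
      case Some(true) => s"TRUNCATE TABLE ONLY $table CASCADE"
      case _ => s"TRUNCATE TABLE ONLY $table"
    }
  }
}

println(PostgresTruncateSketch.getTruncateQuery("parent", Some(true)))
// TRUNCATE TABLE ONLY parent CASCADE
```

So with this shape, cascading is opt-in per call while the dialect default stays non-cascading.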
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168930195 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala --- @@ -85,15 +85,24 @@ private object PostgresDialect extends JdbcDialect { s"SELECT 1 FROM $table LIMIT 1" } + override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + /** - * The SQL query used to truncate a table. For Postgres, the default behaviour is to - * also truncate any descendant tables. As this is a (possibly unwanted) side-effect, - * the Postgres dialect adds 'ONLY' to truncate only the table in question - * @param table The name of the table. - * @return The SQL query to use for truncating a table - */ - override def getTruncateQuery(table: String): String = { -s"TRUNCATE TABLE ONLY $table" + * The SQL query used to truncate a table. For Postgres, the default behaviour is to + * also truncate any descendant tables. As this is a (possibly unwanted) side-effect, + * the Postgres dialect adds 'ONLY' to truncate only the table in question + * @param table The table to truncate + * @param cascade Whether or not to cascade the truncation. Default value is the value of + *isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery( + table: String, + cascade: Option[Boolean] = isCascadingTruncateTable): String = { +cascade match { + case Some(true) => s"TRUNCATE TABLE ONLY $table CASCADE" --- End diff -- I believe you already tested this in PostgreSQL. To be sure about using `ONLY` and `CASCADE` at the same time, could you share with us the result of a manual test of this with PostgreSQL tables, descendant tables, and foreign-key referenced tables here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168929874 --- Diff: docs/sql-programming-guide.md --- @@ -1372,6 +1372,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with care. This option applies only to writing. It defaults to the default cascading truncate behaviour of the JDBC database in question, specified in the isCascadeTruncate in each JDBCDialect. --- End diff -- Yeah, so the "problem" there is that PostgreSQL has inheritance across tables (as we discovered in [SPARK-22729](https://github.com/apache/spark/pull/19911). To be completely transparent to users, I think the `ONLY` part of the query for PostgreSQL should also be configurable, but I think that's outside the scope of this PR. I've added a comment saying that for PostgreSQL a `ONLY` is added to prevent descendant tables from being truncated. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168929798 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -94,5 +94,21 @@ private case object OracleDialect extends JdbcDialect { case _ => value } + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery( --- End diff -- Done --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168929792 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,17 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for MsSql as it is unsupported. --- End diff -- Hmm, I disagree with this, the reason I added it was to show users that even if they specified `cascadeTruncate` being `true`, it doesn't take effect for some databases (like MsSQL). The `isCascadingTruncateTable` only shows the default behaviour, which for all databases at the moment is false. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168928044 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,13 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade Whether or not to cascade the truncation * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, + cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- For this one, we need to keep the original one for backward compatibility like the following? ``` @Since("2.3.0") def getTruncateQuery(table: String): String = { getTruncateQuery(table, isCascadingTruncateTable) } @Since("2.4.0") def getTruncateQuery(table: String, cascade: Option[Boolean] = isCascadingTruncateTable) ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
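The backward-compatible shape suggested here could look roughly like the following. This is a sketch with simplified class contents (`@Since` annotations and the other dialect methods omitted), not the merged Spark code:

```scala
// Sketch of the suggested backward-compatible API (simplified; not the merged code).
abstract class JdbcDialect extends Serializable {
  def isCascadingTruncateTable(): Option[Boolean] = None

  // Original 2.3.0 signature, kept so existing callers and dialects still compile.
  def getTruncateQuery(table: String): String =
    getTruncateQuery(table, isCascadingTruncateTable())

  // New 2.4.0 overload; the default cascade value is the dialect's own behaviour.
  def getTruncateQuery(table: String, cascade: Option[Boolean]): String =
    s"TRUNCATE TABLE $table"
}

object DemoDialect extends JdbcDialect

// The one-argument call resolves to the old method, which delegates to the new one.
println(DemoDialect.getTruncateQuery("weblogs")) // TRUNCATE TABLE weblogs
```

Keeping the one-argument method as a thin delegate means source and binary compatibility are preserved while dialects only need to override the two-argument variant.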
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927868 --- Diff: docs/sql-programming-guide.md --- @@ -1372,6 +1372,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with care. This option applies only to writing. It defaults to the default cascading truncate behaviour of the JDBC database in question, specified in the isCascadeTruncate in each JDBCDialect. --- End diff -- Ur, for PostgreSQL, we are generating `TRUNCATE TABLE ONLY $table CASCADE`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala --- @@ -64,7 +64,16 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect } } - override def getTruncateQuery(table: String): String = { -dialects.head.getTruncateQuery(table) + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. --- End diff -- ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927699 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -94,5 +94,21 @@ private case object OracleDialect extends JdbcDialect { case _ => value } + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery( --- End diff -- Could you move this function after `isCascadingTruncateTable` like the other files? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927651 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -94,5 +94,21 @@ private case object OracleDialect extends JdbcDialect { case _ => value } + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927631 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,17 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for MsSql as it is unsupported. --- End diff -- And, for all dialects, please remove it or move it consistently. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927633 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala --- @@ -46,4 +46,17 @@ private case object MySQLDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927608 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,17 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for MsSql as it is unsupported. --- End diff -- I think we can ignore `Ignored for MsSql as it is unsupported.` here. If you want, you can make a function comment of `override def isCascadingTruncateTable(): Option[Boolean] = Some(false)`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927597 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,17 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. --- End diff -- `The JDBCOptions`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927546 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala --- @@ -49,4 +49,17 @@ private object DB2Dialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. --- End diff -- `The JDBCOptions`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927495 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,13 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade Whether or not to cascade the truncation * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, + cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- @danielvdende . Could you take a look the guide? This guide is helpful in Apache Spark community. - https://github.com/databricks/scala-style-guide --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927408 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DerbyDialect.scala --- @@ -41,4 +41,16 @@ private object DerbyDialect extends JdbcDialect { Option(JdbcType("DECIMAL(31,5)", java.sql.Types.DECIMAL)) case _ => None } + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for Derby as it is unsupported. + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- indentation. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927400 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -102,7 +102,12 @@ object JdbcUtils extends Logging { val dialect = JdbcDialects.get(options.url) val statement = conn.createStatement try { - statement.executeUpdate(dialect.getTruncateQuery(options.table)) + if (options.isCascadeTruncate.isDefined) { +statement.executeUpdate(dialect.getTruncateQuery(options.table, + options.isCascadeTruncate)) + } else { +statement.executeUpdate(dialect.getTruncateQuery(options.table)) + } --- End diff -- I see. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168927406 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DerbyDialect.scala --- @@ -41,4 +41,16 @@ private object DerbyDialect extends JdbcDialect { Option(JdbcType("DECIMAL(31,5)", java.sql.Types.DECIMAL)) case _ => None } + + /** + * The SQL query used to truncate a table. --- End diff -- Indentation. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168922010 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -860,14 +860,41 @@ class JDBCSuite extends SparkFunSuite val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") val h2 = JdbcDialects.get(url) val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") + val table = "weblogs" val defaultQuery = s"TRUNCATE TABLE $table" val postgresQuery = s"TRUNCATE TABLE ONLY $table" + assert(MySQL.getTruncateQuery(table) == defaultQuery) assert(Postgres.getTruncateQuery(table) == postgresQuery) assert(db2.getTruncateQuery(table) == defaultQuery) assert(h2.getTruncateQuery(table) == defaultQuery) assert(derby.getTruncateQuery(table) == defaultQuery) +assert(oracle.getTruncateQuery(table) == defaultQuery) + } + + test("SPARK-22880: Truncate table with CASCADE by jdbc dialect") { +// cascade in a truncate should only be applied for databases that support this, +// even if the parameter is passed. +val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db") +val Postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db") +val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") +val h2 = JdbcDialects.get(url) +val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") --- End diff -- Done 👍 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
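The behaviour these tests pin down can be summarised per dialect. The sketch below reflects what the thread describes — only PostgreSQL and Oracle honour the cascade flag, and the exact Oracle query form is an assumption, not taken from the merged code:

```scala
// Sketch of expected truncate queries per dialect and cascade flag, as discussed
// in this thread (assumed behaviour, not the actual JdbcDialects implementation).
def expectedTruncateQuery(dialect: String, table: String, cascade: Boolean): String =
  (dialect, cascade) match {
    case ("postgresql", true)  => s"TRUNCATE TABLE ONLY $table CASCADE"
    case ("postgresql", false) => s"TRUNCATE TABLE ONLY $table"
    case ("oracle", true)      => s"TRUNCATE TABLE $table CASCADE"
    // MySQL, MsSQL, DB2, Derby, H2, and Teradata ignore the cascade flag.
    case _                     => s"TRUNCATE TABLE $table"
  }

println(expectedTruncateQuery("postgresql", "weblogs", cascade = true))
// TRUNCATE TABLE ONLY weblogs CASCADE
```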
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168922006 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -860,14 +860,41 @@ class JDBCSuite extends SparkFunSuite val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") val h2 = JdbcDialects.get(url) val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") + val table = "weblogs" val defaultQuery = s"TRUNCATE TABLE $table" val postgresQuery = s"TRUNCATE TABLE ONLY $table" + assert(MySQL.getTruncateQuery(table) == defaultQuery) assert(Postgres.getTruncateQuery(table) == postgresQuery) assert(db2.getTruncateQuery(table) == defaultQuery) assert(h2.getTruncateQuery(table) == defaultQuery) assert(derby.getTruncateQuery(table) == defaultQuery) +assert(oracle.getTruncateQuery(table) == defaultQuery) + } + + test("SPARK-22880: Truncate table with CASCADE by jdbc dialect") { +// cascade in a truncate should only be applied for databases that support this, +// even if the parameter is passed. +val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db") --- End diff -- Good point, I saw a few other tests in there (untouched by the PR initially) that had this, fixed those now too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168921851 --- Diff: docs/sql-programming-guide.md --- @@ -1372,6 +1372,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with care. This option applies only to writing. --- End diff -- As mentioned in another comment, I think we should use the value of `isCascadingTruncateTable` as the default, rather than always `false`. Seems like the correct use of that variable. I can add a sentence to the docs specifying that. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168921833 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -102,7 +102,12 @@ object JdbcUtils extends Logging { val dialect = JdbcDialects.get(options.url) val statement = conn.createStatement try { - statement.executeUpdate(dialect.getTruncateQuery(options.table)) + if (options.isCascadeTruncate.isDefined) { +statement.executeUpdate(dialect.getTruncateQuery(options.table, + options.isCascadeTruncate)) + } else { +statement.executeUpdate(dialect.getTruncateQuery(options.table)) + } --- End diff -- see the previous comment on why this is an `Option[Boolean]` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168921808 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala --- @@ -119,6 +119,8 @@ class JDBCOptions( // // if to truncate the table from the JDBC database val isTruncate = parameters.getOrElse(JDBC_TRUNCATE, "false").toBoolean + + val isCascadeTruncate: Option[Boolean] = parameters.get(JDBC_CASCADE_TRUNCATE).map(_.toBoolean) --- End diff -- The reason I didn't do that is because of the existence of the `isCascadingTruncateTable` function for each dialect. According to the docs, that indicates whether or not a `TRUNCATE TABLE` command results in cascading behaviour by default for a given dialect. I thought it would be nice to then use that value as the default value for `isCascadeTruncate`. In that way, if there ever is a dialect that cascades truncations by default, we don't 'hardcode' a default value. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
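The precedence described here — an explicit user-supplied `cascadeTruncate` wins, otherwise the dialect's `isCascadingTruncateTable` value applies — maps naturally onto `Option.orElse`. A minimal sketch (`resolveCascade` is a hypothetical helper, not part of the PR):

```scala
// Hypothetical helper showing the intended precedence: the explicit user option
// first, falling back to the dialect's default cascading behaviour.
def resolveCascade(
    userOption: Option[Boolean],
    dialectDefault: Option[Boolean]): Option[Boolean] =
  userOption.orElse(dialectDefault)

println(resolveCascade(None, Some(false)))       // Some(false): dialect default used
println(resolveCascade(Some(true), Some(false))) // Some(true): user option wins
```

This is why the option is an `Option[Boolean]` rather than a plain `Boolean` with a hardcoded `false`: absence and an explicit `false` remain distinguishable.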
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168885013 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -860,14 +860,41 @@ class JDBCSuite extends SparkFunSuite val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") val h2 = JdbcDialects.get(url) val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") + val table = "weblogs" val defaultQuery = s"TRUNCATE TABLE $table" val postgresQuery = s"TRUNCATE TABLE ONLY $table" + assert(MySQL.getTruncateQuery(table) == defaultQuery) assert(Postgres.getTruncateQuery(table) == postgresQuery) assert(db2.getTruncateQuery(table) == defaultQuery) assert(h2.getTruncateQuery(table) == defaultQuery) assert(derby.getTruncateQuery(table) == defaultQuery) +assert(oracle.getTruncateQuery(table) == defaultQuery) + } + + test("SPARK-22880: Truncate table with CASCADE by jdbc dialect") { +// cascade in a truncate should only be applied for databases that support this, +// even if the parameter is passed. +val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db") +val Postgres = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db") +val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") +val h2 = JdbcDialects.get(url) +val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") --- End diff -- Could you add `teradata` test case, too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168884880 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -860,14 +860,41 @@ class JDBCSuite extends SparkFunSuite val db2 = JdbcDialects.get("jdbc:db2://127.0.0.1/db") val h2 = JdbcDialects.get(url) val derby = JdbcDialects.get("jdbc:derby:db") +val oracle = JdbcDialects.get("jdbc:oracle://127.0.0.1/db") + val table = "weblogs" val defaultQuery = s"TRUNCATE TABLE $table" val postgresQuery = s"TRUNCATE TABLE ONLY $table" + assert(MySQL.getTruncateQuery(table) == defaultQuery) assert(Postgres.getTruncateQuery(table) == postgresQuery) assert(db2.getTruncateQuery(table) == defaultQuery) assert(h2.getTruncateQuery(table) == defaultQuery) assert(derby.getTruncateQuery(table) == defaultQuery) +assert(oracle.getTruncateQuery(table) == defaultQuery) + } + + test("SPARK-22880: Truncate table with CASCADE by jdbc dialect") { +// cascade in a truncate should only be applied for databases that support this, +// even if the parameter is passed. +val MySQL = JdbcDialects.get("jdbc:mysql://127.0.0.1/db") --- End diff -- Since this is variable, what about - `MySQL` -> `mysql` - `Postgres` -> `postgres` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883676 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala --- @@ -31,4 +31,16 @@ private case object TeradataDialect extends JdbcDialect { case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR)) case _ => None } + + /** + * The SQL query used to truncate a table. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883594 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -94,5 +94,20 @@ private case object OracleDialect extends JdbcDialect { case _ => value } + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the +*value of isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883612 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala --- @@ -89,11 +89,17 @@ private object PostgresDialect extends JdbcDialect { * The SQL query used to truncate a table. For Postgres, the default behaviour is to * also truncate any descendant tables. As this is a (possibly unwanted) side-effect, * the Postgres dialect adds 'ONLY' to truncate only the table in question - * @param table The name of the table. + * @param table The table to truncate + * @param cascade Whether or not to cascade the truncation. Default value is the value of +*isCascadingTruncateTable() --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883542 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala --- @@ -46,4 +46,16 @@ private case object MySQLDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883570 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -94,5 +94,20 @@ private case object OracleDialect extends JdbcDialect { case _ => value } + /** + * The SQL query used to truncate a table. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883553 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala --- @@ -46,4 +46,16 @@ private case object MySQLDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for MySQL as it is unsupported. + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883530 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,16 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for MsSql as it is unsupported. + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883267 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala --- @@ -49,4 +49,16 @@ private object DB2Dialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable(). Ignored for DB2 as it is unsupported. + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883425 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala --- @@ -42,4 +42,16 @@ private object MsSqlServerDialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883216 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala --- @@ -49,4 +49,16 @@ private object DB2Dialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. --- End diff -- Indentation. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883182 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala --- @@ -64,7 +64,15 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect } } - override def getTruncateQuery(table: String): String = { -dialects.head.getTruncateQuery(table) + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable() + * @return The SQL query to use for truncating a table + */ + override def getTruncateQuery(table: String, +cascade: Option[Boolean] = isCascadingTruncateTable): String = { --- End diff -- Indentation? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168883141 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala --- @@ -64,7 +64,15 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect } } - override def getTruncateQuery(table: String): String = { -dialects.head.getTruncateQuery(table) + /** + * The SQL query used to truncate a table. --- End diff -- Indentation? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168882930 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -102,7 +102,12 @@ object JdbcUtils extends Logging { val dialect = JdbcDialects.get(options.url) val statement = conn.createStatement try { - statement.executeUpdate(dialect.getTruncateQuery(options.table)) + if (options.isCascadeTruncate.isDefined) { +statement.executeUpdate(dialect.getTruncateQuery(options.table, + options.isCascadeTruncate)) + } else { +statement.executeUpdate(dialect.getTruncateQuery(options.table)) + } --- End diff -- Simply, one line if we are using `boolean` instead of `Option`? ``` statement.executeUpdate(dialect.getTruncateQuery(options.table, options.isCascadeTruncate)) ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
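The simplification suggested here hinges on the parameter type: if `getTruncateQuery` takes an option whose empty case defers to the dialect default, the call site no longer needs the `if`/`else`. A minimal sketch, with names that mirror the PR but stand-in implementations rather than the real Spark classes:

```scala
object CallSiteSketch {
  // Stand-in for a dialect whose default is "do not cascade".
  def isCascadingTruncateTable(): Option[Boolean] = Some(false)

  def getTruncateQuery(table: String,
      cascade: Option[Boolean] = isCascadingTruncateTable()): String = {
    if (cascade.getOrElse(false)) s"TRUNCATE TABLE $table CASCADE"
    else s"TRUNCATE TABLE $table"
  }

  // One-line call site: an unset user option (None) defers to the
  // dialect default via orElse, so no if/else branch is needed.
  def truncateQuery(table: String, isCascadeTruncate: Option[Boolean]): String =
    getTruncateQuery(table, isCascadeTruncate.orElse(isCascadingTruncateTable()))

  def main(args: Array[String]): Unit = {
    assert(truncateQuery("t", None) == "TRUNCATE TABLE t")
    assert(truncateQuery("t", Some(true)) == "TRUNCATE TABLE t CASCADE")
  }
}
```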
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168882599 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala --- @@ -119,6 +119,8 @@ class JDBCOptions( // // if to truncate the table from the JDBC database val isTruncate = parameters.getOrElse(JDBC_TRUNCATE, "false").toBoolean + + val isCascadeTruncate: Option[Boolean] = parameters.get(JDBC_CASCADE_TRUNCATE).map(_.toBoolean) --- End diff -- Sorry. I'm not sure that I lost the previous context, but what about `getOrElse` like `val isTruncate` on line 121? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
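The difference between the two parsing styles is whether "unset" survives. A small sketch in plain Scala, with a hypothetical parameter map:

```scala
object OptionParsingSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical writer options; "cascadeTruncate" is deliberately unset.
    val parameters = Map("truncate" -> "true")

    // getOrElse style (like isTruncate): an unset key collapses to false.
    val isTruncate: Boolean = parameters.getOrElse("truncate", "false").toBoolean

    // Option style: None means "the user said nothing", leaving room for
    // the dialect's own default instead of forcing false.
    val isCascadeTruncate: Option[Boolean] =
      parameters.get("cascadeTruncate").map(_.toBoolean)

    assert(isTruncate)
    assert(isCascadeTruncate.isEmpty)
  }
}
```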
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r168882085 --- Diff: docs/sql-programming-guide.md --- @@ -1372,6 +1372,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with care. This option applies only to writing. --- End diff -- Maybe, do we need a default value description like `It defaults to false`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user Stephan202 commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r165814532 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala --- @@ -49,4 +49,16 @@ private object DB2Dialect extends JdbcDialect { } override def isCascadingTruncateTable(): Option[Boolean] = Some(false) + + /** + * The SQL query used to truncate a table. + * @param table The JDBCOptions. + * @param cascade Whether or not to cascade the truncation. Default value is the + *value of isCascadingTruncateTable() --- End diff -- For dialects not supporting cascading, should the documentation perhaps indicate that the `cascade` parameter is ignored? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user Stephan202 commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r165814475 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- About the documentation: `s/used with case/used with care/` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r159371171 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- @dongjoon-hyun one thing I'm thinking of (just curious to hear your opinion): could we use the value of `isCascadingTruncateTable` for each dialect as the default value for the `cascade` boolean? In that way, there is only a single boolean per dialect specifying what the default behaviour is with regard to cascading during truncates. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
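The idea floated here, letting each dialect's `isCascadingTruncateTable()` seed the default for `cascade`, can be sketched like this. Scala evaluates default arguments with virtual dispatch, so each dialect's override applies; the dialect below is a stand-in, not Spark's real `MySQLDialect` (which ignores `cascade` entirely rather than honoring an explicit `Some(true)`):

```scala
object DialectDefaultSketch {
  trait Dialect {
    def isCascadingTruncateTable(): Option[Boolean] = None
    def getTruncateQuery(table: String,
        cascade: Option[Boolean] = isCascadingTruncateTable()): String =
      if (cascade.getOrElse(false)) s"TRUNCATE TABLE $table CASCADE"
      else s"TRUNCATE TABLE $table"
  }

  // MySQL-like: truncate does not cascade by default, so Some(false).
  object NoCascadeDialect extends Dialect {
    override def isCascadingTruncateTable(): Option[Boolean] = Some(false)
  }

  def main(args: Array[String]): Unit = {
    // A one-argument call picks up the dialect's own default.
    assert(NoCascadeDialect.getTruncateQuery("t") == "TRUNCATE TABLE t")
    // In this sketch an explicit Some(true) overrides the default.
    assert(NoCascadeDialect.getTruncateQuery("t", Some(true)) == "TRUNCATE TABLE t CASCADE")
  }
}
```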
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r159370871 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- @dongjoon-hyun apologies if I misrepresented your comments/opinions from the previous PR, wasn't my intention :-). I've given it some more thought, and I can see your point about the default value. I'll make the change we discussed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r159262664 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- Hi, @danielvdende . I think you can ignore my comment in previous PR. There were many directional comments on that PR and it's not the final one. Your previous PR is merged by @gatorsmile . For me, I also still don't agree on the default value inside `JdbcDialects.scala` in this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r159183234 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- I think it raises the question of how complete/incomplete the Spark JDBC API should be, and what the use cases are that it should serve. For the most simple cases, in which no key constraints are set between tables, you won't need this option. However, as soon as foreign key constraints are introduced, it is very important. I agree that not every functionality from SQL (dialects) should be included, but I personally feel this is quite fundamental functionality. Moreover, as it's configuration option users that don't want it also don't have to use it. I think we also discussed this functionality in previous PR with @dongjoon-hyun here: [SPARK-22729](https://github.com/apache/spark/pull/19911) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r159118000 --- Diff: docs/sql-programming-guide.md --- @@ -1339,6 +1339,13 @@ the following case-insensitive options: This is a JDBC writer related option. When SaveMode.Overwrite is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to false. This option applies only to writing. + + +cascadeTruncate + +This is a JDBC writer related option. If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this options allows execution of a TRUNCATE TABLE t CASCADE. This will affect other tables, and thus should be used with case. This option applies only to writing. --- End diff -- This conf only takes effect when `SaveMode.Overwrite` is enabled and `truncate` is set to true. My doubt is that the usage scenario of this extra conf does not benefit most users. Could we hold this PR? When other users have similar requests, we can revisit it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r158557222 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,12 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade (OPTIONAL) Whether or not to cascade the truncation. Default: false * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, cascade: Boolean = false): String = { --- End diff -- I think the default for all dialects should be false, regardless of whether the cascade feature is even supported. And for those for which it is supported, it should default to false. It's a feature that you should only use if you explicitly want to cascade imho. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r158556753 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,12 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade (OPTIONAL) Whether or not to cascade the truncation. Default: false * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, cascade: Boolean = false): String = { --- End diff -- That is the main reason I request the change. I don't think this implementation is correct on for all dialects. Anyway, it's only my opinion. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user danielvdende commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r158555400 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,12 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade (OPTIONAL) Whether or not to cascade the truncation. Default: false * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, cascade: Boolean = false): String = { --- End diff -- That would force all dialects to implement this method though, which would lead to unnecessary code duplication. Moreover, removing the default value here would change the public API, as it would force others who have written custom dialects to change calls to this method. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20057#discussion_r158549591 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -120,11 +121,12 @@ abstract class JdbcDialect extends Serializable { * The SQL query that should be used to truncate a table. Dialects can override this method to * return a query that is suitable for a particular database. For PostgreSQL, for instance, * a different query is used to prevent "TRUNCATE" affecting other tables. - * @param table The name of the table. + * @param table The table to truncate + * @param cascade (OPTIONAL) Whether or not to cascade the truncation. Default: false * @return The SQL query to use for truncating a table */ @Since("2.3.0") - def getTruncateQuery(table: String): String = { + def getTruncateQuery(table: String, cascade: Boolean = false): String = { --- End diff -- Could you remove the default value here? It could be safe to prevent accidental inheritance of this wrong implementation. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
GitHub user danielvdende opened a pull request: https://github.com/apache/spark/pull/20057

[SPARK-22880][SQL] Add cascadeTruncate option to JDBC datasource

This commit adds the `cascadeTruncate` option to the JDBC datasource API, for databases that support this functionality (PostgreSQL and Oracle at the moment). This allows for applying a cascading truncate that affects tables that have foreign key constraints on the table being truncated.

## What changes were proposed in this pull request?

Add `cascadeTruncate` option to JDBC datasource API. Allow this to affect the `TRUNCATE` query for databases that support this option.

## How was this patch tested?

Existing tests for `truncateQuery` were updated. Also, an additional test was added to ensure that the correct syntax was applied, and that enabling the config for databases that do not support this option does not result in invalid queries.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/danielvdende/spark SPARK-22880

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20057.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20057

commit 842eca7fa282a81f0f4d459242869c022ef21080
Author: Daniel van der Ende
Date: 2017-12-22T15:58:28Z

[SPARK-22880][SQL] Add cascadeTruncate option to JDBC datasource

This commit adds the cascadeTruncate option to the JDBC datasource API, for databases that support this functionality (PostgreSQL and Oracle at the moment). This allows for applying a cascading truncate that affects tables that have foreign key constraints on the table being truncated.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org