Repository: spark
Updated Branches:
  refs/heads/branch-2.0 a0c03c925 -> b959dab32


[SPARK-17986][ML] SQLTransformer should remove temporary tables

## What changes were proposed in this pull request?

Previously, each call to `SQLTransformer.transform` created a temporary table 
and never deleted it. This change adds a call to `dropTempView()` that deletes 
the temporary table before the result is returned, so that the table does not 
remain in Spark's table catalog. Because `tableName` is randomized and not 
exposed, nothing outside the `transform` method is expected to use this table.
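
As a rough illustration of the behavior described above (a sketch, not part of this patch), the snippet below runs one `transform` call and counts how many catalog entries it leaves behind. It assumes Spark 2.0 or later with `spark-mllib` on the classpath and a local master. The object name `SQLTransformerCleanupSketch`, the helper `tablesLeftBehind`, and the sample data are hypothetical and chosen only for this example.

```scala
import org.apache.spark.ml.feature.SQLTransformer
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: names and data are illustrative, not taken from the patch.
object SQLTransformerCleanupSketch {

  /** Runs one transform and returns how many catalog tables it left behind. */
  def tablesLeftBehind(spark: SparkSession): Long = {
    val before = spark.catalog.listTables().count()

    val df = spark
      .createDataFrame(Seq((0, 1.0, 3.0), (2, 2.0, 5.0)))
      .toDF("id", "v1", "v2")

    // __THIS__ is the placeholder that transform replaces with the random table name.
    val sqlTrans = new SQLTransformer()
      .setStatement("SELECT *, (v1 + v2) AS v3 FROM __THIS__")

    // Materialize the result, as the existing test does, after transform returns.
    sqlTrans.transform(df).collect()

    // With this change applied, the difference is expected to be 0.
    spark.catalog.listTables().count() - before
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SQLTransformerCleanupSketch")
      .getOrCreate()
    try {
      println(s"Tables left behind by transform: ${tablesLeftBehind(spark)}")
    } finally {
      spark.stop()
    }
  }
}
```

Comparing the table count before and after, rather than asserting an absolute count, keeps the check meaningful in a session that already has other tables registered.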

## How was this patch tested?

A single assertion was added to the existing test of 
`SQLTransformer.transform`, checking that no temporary tables remain after the 
call. Without the corresponding code change, this new assertion fails. I am not 
aware of any circumstances in which removing this temporary view would hurt 
performance or correctness, but a review from someone with expertise here would 
be helpful.

Author: Drew Robb <drewr...@gmail.com>

Closes #15526 from drewrobb/SPARK-17986.

(cherry picked from commit ab3363e9f6b1f7fc26682509fe7382c570f91778)
Signed-off-by: Yanbo Liang <yblia...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b959dab3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b959dab3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b959dab3

Branch: refs/heads/branch-2.0
Commit: b959dab32a455e0f9a9ea0fd2111e28a5faf796c
Parents: a0c03c9
Author: Drew Robb <drewr...@gmail.com>
Authored: Sat Oct 22 01:59:36 2016 -0700
Committer: Yanbo Liang <yblia...@gmail.com>
Committed: Sat Oct 22 02:00:05 2016 -0700

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/ml/feature/SQLTransformer.scala  | 4 +++-
 .../scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala  | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b959dab3/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
index 259be26..b25fff9 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
@@ -67,7 +67,9 @@ class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: String)
     val tableName = Identifiable.randomUID(uid)
     dataset.createOrReplaceTempView(tableName)
     val realStatement = $(statement).replace(tableIdentifier, tableName)
-    dataset.sparkSession.sql(realStatement)
+    val result = dataset.sparkSession.sql(realStatement)
+    dataset.sparkSession.catalog.dropTempView(tableName)
+    result
   }
 
   @Since("1.6.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/b959dab3/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
----------------------------------------------------------------------
diff --git a/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala b/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
index 1401ea9..9d3c007 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
@@ -43,6 +43,7 @@ class SQLTransformerSuite
     assert(result.schema.toString == resultSchema.toString)
     assert(resultSchema == expected.schema)
     assert(result.collect().toSeq == expected.collect().toSeq)
+    assert(original.sparkSession.catalog.listTables().count() == 0)
   }
 
   test("read/write") {

