spark git commit: [SPARK-21512][SQL][TEST] DatasetCacheSuite needs to execute unpersistent after executing peristent

lixiao Sun, 23 Jul 2017 11:32:22 -0700

Repository: spark
Updated Branches:
  refs/heads/master a4eac8b0b -> 481f07929



[SPARK-21512][SQL][TEST] DatasetCacheSuite needs to execute unpersistent after 
executing peristent

## What changes were proposed in this pull request?

This PR avoids reusing a persisted dataset across test cases by unpersisting 
all datasets at the end of each test case.

In `DatasetCacheSuite`, the test case `"get storage level"` persists a dataset 
but does not unpersist it afterwards. The test case `"persist and then rebind 
right encoder when join 2 datasets"` persists the same dataset. Thus, when we 
run both test cases, the second one does not actually persist the dataset, 
because it is already cached. This is because in

When we run only the second test case, it does persist the dataset. It is not 
good for the behavior of the second test case to depend on whether the first 
one ran. The first test case should correctly unpersist its dataset.

```
Testing started at 17:52 ...
01:52:15.053 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
01:52:48.595 WARN org.apache.spark.sql.execution.CacheManager: Asked to cache 
already cached data.
01:52:48.692 WARN org.apache.spark.sql.execution.CacheManager: Asked to cache 
already cached data.
01:52:50.864 WARN org.apache.spark.storage.RandomBlockReplicationPolicy: 
Expecting 1 replicas with only 0 peer/s.
01:52:50.864 WARN org.apache.spark.storage.RandomBlockReplicationPolicy: 
Expecting 1 replicas with only 0 peer/s.
01:52:50.868 WARN org.apache.spark.storage.BlockManager: Block rdd_8_1 
replicated to only 0 peer(s) instead of 1 peers
01:52:50.868 WARN org.apache.spark.storage.BlockManager: Block rdd_8_0 
replicated to only 0 peer(s) instead of 1 peers
```

After this PR, the `CacheManager` and block replication warnings no longer appear:
```
Testing started at 18:14 ...
02:15:05.329 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable

Process finished with exit code 0
```
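
The test-order dependence described above can be sketched with a toy model in plain Scala. This is only an illustration under stated assumptions: `ToyCache`, `TestOrderSketch`, and the plan string are hypothetical names, and `ToyCache` is not Spark's `CacheManager`. It only mimics the "skip if already cached" behavior suggested by the `Asked to cache already cached data` warning.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a cache shared by every test in a suite.
// NOT Spark's CacheManager: it only mimics "skip if already cached".
object ToyCache {
  private val cached = mutable.Set.empty[String]

  // Returns true if this call actually cached the plan,
  // false if the plan was already cached by an earlier caller.
  def persist(plan: String): Boolean = cached.add(plan)

  // Plays the role of clearing all cached data.
  def clear(): Unit = cached.clear()
}

object TestOrderSketch {
  // Both toy tests persist the same plan, like the two test cases above.
  def firstTest(): Boolean = ToyCache.persist("Seq(1, 2).toDS")
  def secondTest(): Boolean = ToyCache.persist("Seq(1, 2).toDS")

  // Runs both tests in order; `cleanup` plays the role of afterEach().
  def runSuite(cleanup: () => Unit): (Boolean, Boolean) = {
    val first = firstTest()
    cleanup()
    val second = secondTest()
    cleanup()
    (first, second)
  }

  def main(args: Array[String]): Unit = {
    // No cleanup: the second test finds the data already cached.
    println(s"without cleanup: ${runSuite(() => ())}")
    ToyCache.clear()
    // Cleanup after each test: both tests really persist.
    println(s"with cleanup: ${runSuite(() => ToyCache.clear())}")
  }
}
```

In this model the second test only exercises the caching path when the cache is cleared between tests, which is the reason for doing the cleanup once per test rather than relying on each test case to clean up after itself.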

## How was this patch tested?

Used the existing tests in `DatasetCacheSuite`.

Author: Kazuaki Ishizaki <[email protected]>

Closes #18719 from kiszk/SPARK-21512.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/481f0792
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/481f0792
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/481f0792

Branch: refs/heads/master
Commit: 481f0792944d9a77f0fe8b5e2596da1d600b9d0a
Parents: a4eac8b
Author: Kazuaki Ishizaki <[email protected]>
Authored: Sun Jul 23 11:31:27 2017 -0700
Committer: gatorsmile <[email protected]>
Committed: Sun Jul 23 11:31:27 2017 -0700

----------------------------------------------------------------------
 .../src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/481f0792/sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala
index e0561ee..2dc6b44 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala
@@ -25,6 +25,11 @@ import org.apache.spark.storage.StorageLevel
 class DatasetCacheSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  // Clear all persistent datasets after each test
+  override def afterEach(): Unit = {
+    spark.sharedState.cacheManager.clearCache()
+  }
+
   test("get storage level") {
     val ds1 = Seq("1", "2").toDS().as("a")
     val ds2 = Seq(2, 3).toDS().as("b")

