nchammas commented on code in PR #45181:
URL: https://github.com/apache/spark/pull/45181#discussion_r1497634106


##########
sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala:
##########
@@ -82,6 +82,26 @@ class DatasetCacheSuite extends QueryTest
     assert(cached.storageLevel == StorageLevel.NONE, "The Dataset should not be cached.")
   }
 
+  test("SPARK-46992 collect before persisting") {
+    val ds = Seq(("a", 1), ("b", 2), ("c", 3)).toDS().select(expr("_2 + 1").as[Int])
+    // collect first
+    ds.collect()
+    // and then cache it
+    val cached = ds.cache()
+    // ds is not cached
+    assertNotCached(ds)
+    // Make sure, the Dataset is indeed cached.
+    assertCached(cached)
+
+    // Check result.
+    checkDataset(
+      cached,
+      2, 3, 4)

Review Comment:
   Are you sure this is a valid test? This particular check passes for me on 
`master`, so it does not appear to reproduce the issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

