hvanhovell commented on code in PR #46683:
URL: https://github.com/apache/spark/pull/46683#discussion_r1608570707
##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala:
##########
@@ -1554,6 +1555,37 @@ class ClientE2ETestSuite extends RemoteSparkSession with SQLHelper with PrivateM
val metrics = SparkThreadUtils.awaitResult(future, 2.seconds)
assert(metrics === Map("min(id)" -> 0, "avg(id)" -> 49, "max(id)" -> 98))
}
+
+ test("checkpoint") {
+ val df = spark.range(100).localCheckpoint()
+ testCapturedStdOut(df.explain(), "ExistingRDD")
+ }
+
+ test("checkpoint gc") {
+ var df1 = spark.range(100).localCheckpoint(eager = true)
+ val encoder = df1.agnosticEncoder
+ val dfId = df1.cachedRemoteRelationID.get
+
+ // GC triggers remove the cached remote relation
+ df1 = null
+ System.gc()
+
+ // Make sure the cleanup happens in the server side.
+ Thread.sleep(3000L)
Review Comment:
I can guarantee you that this will be flaky. IMO we can assume that GC and
reference queues work. The only thing we need to test is whether the action
invoked upon GC works.
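
A minimal sketch of that idea, not the PR's actual code: keep the cleanup action separate from the GC trigger, so that a test can invoke it directly instead of calling `System.gc()` and sleeping. `FakeRelationCache`, `RelationCleaner`, `doCleanup`, and `CleanupActionSketch` are hypothetical names used only for illustration; they are not Spark APIs.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the server-side cache of checkpointed relations.
final class FakeRelationCache {
  private val entries = mutable.Set.empty[String]
  def put(id: String): Unit = entries += id
  def remove(id: String): Unit = entries -= id
  def contains(id: String): Boolean = entries.contains(id)
}

// Hypothetical cleaner. In production, a reference queue would call
// `doCleanup` once the owning object has been collected; a test can
// call it directly and skip the GC entirely.
final class RelationCleaner(cache: FakeRelationCache) {
  def doCleanup(relationId: String): Unit = cache.remove(relationId)
}

object CleanupActionSketch {
  // Returns true if the cleanup action removed the cached relation.
  def run(): Boolean = {
    val cache = new FakeRelationCache
    val cleaner = new RelationCleaner(cache)
    cache.put("relation-1")
    assert(cache.contains("relation-1"))

    // Invoke the action directly: no System.gc(), no Thread.sleep.
    cleaner.doCleanup("relation-1")
    !cache.contains("relation-1")
  }
}
```

Because the test never waits on the collector, its outcome does not depend on GC timing, which is the source of the flakiness described above.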
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]