LuciferYang commented on code in PR #37624:
URL: https://github.com/apache/spark/pull/37624#discussion_r956560789
##########
resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala:
##########
@@ -770,6 +771,40 @@ abstract class YarnShuffleServiceSuite extends
SparkFunSuite with Matchers {
s1.stop()
}
+ test("SPARK-40186: shuffleMergeManager should have been shutdown before db
closed") {
+ val maxId = 100
+ s1 = createYarnShuffleService()
+ val resolver = s1.shuffleMergeManager.asInstanceOf[RemoteBlockPushResolver]
+ (0 to maxId).foreach { id =>
+ val appId = ApplicationId.newInstance(0, id)
+ val appInfo = makeAppInfo("user", appId)
+ s1.initializeApplication(appInfo)
+ val mergedShuffleInfo =
+ new ExecutorShuffleInfo(
+ Array(new File(tempDir, "foo/foo").getAbsolutePath,
+ new File(tempDir, "bar/bar").getAbsolutePath), 3,
+ SORT_MANAGER_WITH_MERGE_SHUFFLE_META_WithAttemptID1)
+ resolver.registerExecutor(appId.toString, mergedShuffleInfo)
+ }
+
+ (0 until maxId).foreach { id =>
+ val appId = ApplicationId.newInstance(0, id)
+ resolver.applicationRemoved(appId.toString, true)
+ }
+
+ s1.stop()
+
+ assert(ShuffleTestAccessor.isMergedShuffleCleanerShutdown(resolver))
+ assert(ShuffleTestAccessor.mergeManagerLevelDB(resolver) == null)
+
+ val message = intercept[RejectedExecutionException] {
Review Comment:
> It is better to keep this `final` - any particular reason to change this ?
Due to this test scene.
If keep final and not reset `db=null` after db close, then directly invoke `
resolver.applicationRemoved` will still cause `delete data from a closed db
instance`(I manual tested RocksDB. If keep final and not set `db = null`
after db closed, it will still crash.).
https://github.com/apache/spark/blob/4b0c3bab1ab082565a051990bf45774f15962deb/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java#L373-L394
As above code, both `removeAppAttemptPathInfoFromDB` and
`removeOldApplicationAttemptsFromDb` are synchronous operations. These
operations will be rejected only when db is `null` now.
Therefore, for safety, I suggest removing the final of `db` and leaving
reset `db` to null.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]