Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19476#discussion_r144585860
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1552,4 +1582,65 @@ private[spark] object BlockManager {
    override val metricRegistry = new MetricRegistry
    metricRegistry.registerAll(metricSet)
  }
+
+  class RemoteBlockTempFileManager(blockManager: BlockManager)
+      extends TempFileManager with Logging {
+
+    private class ReferenceWithCleanup(file: File, referenceQueue: JReferenceQueue[File])
+        extends WeakReference[File](file, referenceQueue) {
+      private val filePath = file.getAbsolutePath
+
+      def cleanUp(): Unit = {
+        logDebug(s"Clean up file $filePath")
+
+        if (!new File(filePath).delete()) {
+          logDebug(s"Fail to delete file $filePath")
+        }
+      }
+    }
--- End diff --
Yes, it's a good idea to delete the tmp files via `WeakReference`. Is it
possible to make this the default behavior of `TempFileManager`, i.e. make it
`override def registerTempFileToClean(file: File): Unit`?
In https://github.com/apache/spark/pull/18565, I was struggling with file
leaks because `TempFileManager` was not responsible for deleting tmp files.
Additionally, since the files are deleted by the `cleaningThread`, hopefully
the cost is negligible.
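
For illustration, a rough self-contained sketch (not the actual Spark code) of how
`WeakReference`-based cleanup could sit behind `registerTempFileToClean` by default.
The `TempFileManager` trait here is simplified to a single method, and names like
`WeakRefTempFileManager` are made up for this sketch:

```scala
import java.io.File
import java.lang.ref.{ReferenceQueue => JReferenceQueue, WeakReference}
import java.util.Collections
import java.util.concurrent.ConcurrentHashMap

// Simplified stand-in for the real TempFileManager interface.
trait TempFileManager {
  def registerTempFileToClean(file: File): Unit
}

class WeakRefTempFileManager extends TempFileManager {

  // Weak reference that remembers the path so the file can still be deleted
  // after the File object itself has been garbage collected.
  private class ReferenceWithCleanup(file: File, queue: JReferenceQueue[File])
      extends WeakReference[File](file, queue) {
    private val filePath = file.getAbsolutePath
    def cleanUp(): Unit = {
      if (!new File(filePath).delete()) {
        System.err.println(s"Failed to delete temp file $filePath")  // real code would use logDebug
      }
    }
  }

  private val referenceQueue = new JReferenceQueue[File]
  // Keep the weak references themselves strongly reachable until cleanup has run.
  private val referenceBuffer =
    Collections.newSetFromMap(new ConcurrentHashMap[ReferenceWithCleanup, java.lang.Boolean])

  @volatile private var stopped = false

  private val cleaningThread = new Thread("temp-file-cleaner") {
    override def run(): Unit = {
      while (!stopped) {
        try {
          // Blocks until a registered File becomes weakly reachable (or times out).
          val ref = referenceQueue.remove(100).asInstanceOf[ReferenceWithCleanup]
          if (ref != null) {
            referenceBuffer.remove(ref)
            ref.cleanUp()
          }
        } catch {
          case _: InterruptedException => // stop() was called; loop condition exits
        }
      }
    }
  }
  cleaningThread.setDaemon(true)
  cleaningThread.start()

  override def registerTempFileToClean(file: File): Unit = {
    referenceBuffer.add(new ReferenceWithCleanup(file, referenceQueue))
  }

  def stop(): Unit = {
    stopped = true
    cleaningThread.interrupt()
    cleaningThread.join()
  }
}
```

A daemon cleaning thread polling the `ReferenceQueue`, as in the diff above, keeps the
deletion work off the calling path, which is why the cost should be negligible.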
---