Github user jinxing64 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19476#discussion_r144585860
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -1552,4 +1582,65 @@ private[spark] object BlockManager {
         override val metricRegistry = new MetricRegistry
         metricRegistry.registerAll(metricSet)
       }
    +
    +  class RemoteBlockTempFileManager(blockManager: BlockManager)
    +      extends TempFileManager with Logging {
    +
    +    private class ReferenceWithCleanup(file: File, referenceQueue: JReferenceQueue[File])
    +        extends WeakReference[File](file, referenceQueue) {
    +      private val filePath = file.getAbsolutePath
    +
    +      def cleanUp(): Unit = {
    +        logDebug(s"Clean up file $filePath")
    +
    +        if (!new File(filePath).delete()) {
    +          logDebug(s"Fail to delete file $filePath")
    +        }
    +      }
    +    }
    --- End diff --
    
    Yes, it's a good idea to delete the tmp files via `WeakReference`. Is it possible to make this the default behavior of `TempFileManager`, i.e. provide it as `override def registerTempFileToClean(file: File): Unit`?
    In https://github.com/apache/spark/pull/18565, I was struggling with a file leak because `TempFileManager` was not responsible for deleting tmp files.
    Additionally, since files are deleted by the `cleaningThread`, I hope the cost is negligible.
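
    Roughly what I have in mind, as a minimal sketch only (the trait name, the `println` logging, and the thread wiring here are placeholders, not Spark's actual `TempFileManager` interface):

```scala
import java.io.File
import java.lang.ref.{ReferenceQueue => JReferenceQueue, WeakReference}
import java.util.concurrent.ConcurrentHashMap

// Hypothetical trait sketching the proposed default behavior: every file passed
// to registerTempFileToClean is tracked with a WeakReference and deleted by a
// background cleaning thread once the reference is enqueued.
trait WeakReferenceTempFileManager {

  private class ReferenceWithCleanup(file: File, queue: JReferenceQueue[File])
      extends WeakReference[File](file, queue) {
    private val filePath = file.getAbsolutePath

    def cleanUp(): Unit = {
      if (!new File(filePath).delete()) {
        // A real implementation would use Logging; println keeps the sketch self-contained.
        println(s"Failed to delete file $filePath")
      }
    }
  }

  private val referenceQueue = new JReferenceQueue[File]
  // Hold strong references to the wrappers themselves so they survive until enqueued.
  private val referenceBuffer = ConcurrentHashMap.newKeySet[ReferenceWithCleanup]()

  // The proposed default: callers only register the file, cleanup happens automatically.
  def registerTempFileToClean(file: File): Unit = {
    referenceBuffer.add(new ReferenceWithCleanup(file, referenceQueue))
  }

  private val cleaningThread = new Thread(new Runnable {
    override def run(): Unit = {
      try {
        while (true) {
          // remove() blocks until the GC enqueues a cleared reference.
          val ref = referenceQueue.remove().asInstanceOf[ReferenceWithCleanup]
          referenceBuffer.remove(ref)
          ref.cleanUp()
        }
      } catch {
        case _: InterruptedException => // exit on shutdown
      }
    }
  }, "temp-file-cleaning-thread")
  cleaningThread.setDaemon(true)
  cleaningThread.start()
}
```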


---
