HyukjinKwon commented on a change in pull request #25262: [SPARK-28486][CORE][PYTHON] Map PythonBroadcast's data file to a BroadcastBlock to avoid delete by GC
URL: https://github.com/apache/spark/pull/25262#discussion_r309004503
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
 ##########
 @@ -717,33 +720,36 @@ private[spark] class PythonBroadcast(@transient var path: String) extends Serial
   }
 
   /**
-   * Write data into disk, using randomly generated name.
+   * Write data into disk and map it to a broadcast block.
    */
-  private def readObject(in: ObjectInputStream): Unit = Utils.tryOrIOException {
-    val dir = new File(Utils.getLocalDir(SparkEnv.get.conf))
-    val file = File.createTempFile("broadcast", "", dir)
-    path = file.getAbsolutePath
-    val out = new FileOutputStream(file)
-    Utils.tryWithSafeFinally {
-      Utils.copyStream(in, out)
-    } {
-      out.close()
-    }
-  }
-
-  /**
-   * Delete the file once the object is GCed.
-   */
-  override def finalize() {
-    if (!path.isEmpty) {
-      val file = new File(path)
-      if (file.exists()) {
-        if (!file.delete()) {
 
 Review comment:
   Ah, I missed this part. Now I get why it was specific to Python's ..
