yuexing commented on code in PR #51028:
URL: https://github.com/apache/spark/pull/51028#discussion_r2115714051


##########
common/utils/src/main/scala/org/apache/spark/util/ClosureCleaner.scala:
##########
@@ -418,22 +418,26 @@ private[spark] object ClosureCleaner extends Logging {
     logDebug(s" + fields accessed by starting closure: ${accessedFields.size} classes")
     accessedFields.foreach { f => logDebug("     " + f) }
 
+    // Instead of cloning and trying to update the lambda reference, directly modify the original
+    // REPL object to null out unused fields. This works because REPL objects are not hidden
+    // classes and their fields can be modified.
     if (accessedFields(capturingClass).size < capturingClass.getDeclaredFields.length) {
-      // clone and clean the enclosing `this` only when there are fields to null out
-      logDebug(s" + cloning instance of REPL class ${capturingClass.getName}")
-      val clonedOuterThis = cloneAndSetFields(
-        parent = null, outerThis, capturingClass, accessedFields)
-
-      val outerField = func.getClass.getDeclaredField("arg$1")
-      // SPARK-37072: When Java 17 is used and `outerField` is read-only,
-      // the content of `outerField` cannot be set by reflect api directly.
-      // But we can remove the `final` modifier of `outerField` before set value
-      // and reset the modifier after set value.
-      setFieldAndIgnoreModifiers(func, outerField, clonedOuterThis)
+      val fieldsToNull = capturingClass.getDeclaredFields.filterNot { field =>

Review Comment:
   @LuciferYang @srowen @cloud-fan So do I. I tried almost every approach I could find to clone and modify the lambda, but all of them failed on newer Java versions. That's why I started thinking about a save-modify-restore approach on outerThis.
   As for the gap: clean, serialize, and restore happen back-to-back, so the window is small unless multiple threads are involved:
   ```
   SparkClosureCleaner::clean  // simplified version
       if (ClosureCleaner.clean(closure, cleanTransitively, mutable.Map.empty)) {
         try {
           SparkEnv.get.closureSerializer.newInstance().serialize(closure)
         } catch {
           case ex: Exception => throw new SparkException("Task not serializable", ex)
         } finally {
           ClosureCleaner.restoreFieldValues(closure)
         }
       }
   ```
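   To make the save-modify-restore idea concrete, here is a minimal standalone sketch of what the bookkeeping could look like. The helper names (`nullOutAndSave`, `restoreFieldValues`) and the map-based storage are illustrative assumptions, not the actual patch:
   ```scala
   import java.lang.reflect.Field
   import scala.collection.mutable

   // Hypothetical sketch: remember a field's original value before nulling
   // it out, so the value can be put back after serialization completes.
   object FieldRestoreSketch {
     // Keyed by (owner object, field name); a real implementation might
     // prefer identity-based keys.
     private val saved = mutable.Map.empty[(AnyRef, String), AnyRef]

     def nullOutAndSave(obj: AnyRef, field: Field): Unit = {
       field.setAccessible(true)
       saved((obj, field.getName)) = field.get(obj)
       field.set(obj, null)
     }

     def restoreFieldValues(obj: AnyRef): Unit = {
       // Collect matching keys first, then restore and forget each value.
       val keys = saved.keys.filter(_._1 eq obj).toList
       keys.foreach { case key @ (_, name) =>
         val f = obj.getClass.getDeclaredField(name)
         f.setAccessible(true)
         f.set(obj, saved.remove(key).get)
       }
     }
   }
   ```
   The `finally` block in the simplified `clean` above would then guarantee the restore runs whether or not serialization throws.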
   
   any other ideas?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

