This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new f3caba1  [SPARK-34607][SQL][3.1] Add `Utils.isMemberClass` to fix a 
malformed class name error on jdk8u
f3caba1 is described below

commit f3caba1f8d649889713de4824412a3ee2666bc6f
Author: Takeshi Yamamuro <yamam...@apache.org>
AuthorDate: Thu Mar 4 22:40:57 2021 -0800

    [SPARK-34607][SQL][3.1] Add `Utils.isMemberClass` to fix a malformed class 
name error on jdk8u
    
    This PR intends to fix a bug of `objects.NewInstance` if a user runs Spark 
on jdk8u and a given `cls` in `NewInstance` is a deeply-nested inner class, 
e.g.,.
    ```
      object OuterLevelWithVeryVeryVeryLongClassName1 {
        object OuterLevelWithVeryVeryVeryLongClassName2 {
          object OuterLevelWithVeryVeryVeryLongClassName3 {
            object OuterLevelWithVeryVeryVeryLongClassName4 {
              object OuterLevelWithVeryVeryVeryLongClassName5 {
                object OuterLevelWithVeryVeryVeryLongClassName6 {
                  object OuterLevelWithVeryVeryVeryLongClassName7 {
                    object OuterLevelWithVeryVeryVeryLongClassName8 {
                      object OuterLevelWithVeryVeryVeryLongClassName9 {
                        object OuterLevelWithVeryVeryVeryLongClassName10 {
                          object OuterLevelWithVeryVeryVeryLongClassName11 {
                            object OuterLevelWithVeryVeryVeryLongClassName12 {
                              object OuterLevelWithVeryVeryVeryLongClassName13 {
                                object 
OuterLevelWithVeryVeryVeryLongClassName14 {
                                  object 
OuterLevelWithVeryVeryVeryLongClassName15 {
                                    object 
OuterLevelWithVeryVeryVeryLongClassName16 {
                                      object 
OuterLevelWithVeryVeryVeryLongClassName17 {
                                        object 
OuterLevelWithVeryVeryVeryLongClassName18 {
                                          object 
OuterLevelWithVeryVeryVeryLongClassName19 {
                                            object 
OuterLevelWithVeryVeryVeryLongClassName20 {
                                              case class 
MalformedNameExample2(x: Int)
                                            }}}}}}}}}}}}}}}}}}}}
    ```
    
    The root cause that Kris (@rednaxelafx) investigated is as follows (Kudos 
to Kris);
    
    The reason why the test case above is so convoluted is in the way Scala 
generates the class name for nested classes. In general, Scala generates a 
class name for a nested class by inserting the dollar-sign ( `$` ) in between 
each level of class nesting. The problem is that this format can concatenate 
into a very long string that goes beyond certain limits, so Scala will change 
the class name format beyond certain length threshold.
    
    For the example above, we can see that the first two levels of class 
nesting have class names that look like this:
    ```
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$OuterLevelWithVeryVeryVeryLongClassName2$
    ```
    If we leave out the fact that Scala uses a dollar-sign ( `$` ) suffix for 
the class name of the companion object, 
`OuterLevelWithVeryVeryVeryLongClassName1`'s full name is a prefix (substring) 
of `OuterLevelWithVeryVeryVeryLongClassName2`.
    
    But if we keep going deeper into the levels of nesting, you'll find names 
that look like:
    ```
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$2a1321b953c615695d7442b2adb1$$$$ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$2a1321b953c615695d7442b2adb1$$$$ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$OuterLevelWithVeryVeryVeryLongClassName11$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$OuterLevelWithVeryVeryVeryLongClassName13$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$OuterLevelWithVeryVeryVeryLongClassName13$OuterLevelWithVeryVeryVeryLongClassName14$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$OuterLevelWithVeryVeryVeryLongClassName16$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$OuterLevelWithVeryVeryVeryLongClassName16$OuterLevelWithVeryVeryVeryLongClassName17$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$OuterLevelWithVeryVeryVeryLongClassName19$
    
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$OuterLevelWithVeryVeryVeryLongClassName19$OuterLevelWithVeryVeryVeryLongClassName20$
    ```
    with a hash code in the middle and various levels of nesting omitted.
    
    The `java.lang.Class.isMemberClass` method is implemented in JDK8u as:
    
http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/tip/src/share/classes/java/lang/Class.java#l1425
    ```
        /**
         * Returns {@code true} if and only if the underlying class
         * is a member class.
         *
         * @return {@code true} if and only if this class is a member class.
         * @since 1.5
         */
        public boolean isMemberClass() {
            return getSimpleBinaryName() != null && !isLocalOrAnonymousClass();
        }
    
        /**
         * Returns the "simple binary name" of the underlying class, i.e.,
         * the binary name without the leading enclosing class name.
         * Returns {@code null} if the underlying class is a top level
         * class.
         */
        private String getSimpleBinaryName() {
            Class<?> enclosingClass = getEnclosingClass();
            if (enclosingClass == null) // top level class
                return null;
            // Otherwise, strip the enclosing class' name
            try {
                return getName().substring(enclosingClass.getName().length());
            } catch (IndexOutOfBoundsException ex) {
                throw new InternalError("Malformed class name", ex);
            }
        }
    ```
    and the problematic code is 
`getName().substring(enclosingClass.getName().length())` -- if a class's 
enclosing class's full name is *longer* than the nested class's full name, this 
logic would end up going out of bounds.
    
    The bug has been fixed in JDK9 by 
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8057919 , but still exists 
in the latest JDK8u release. So from the Spark side we'd need to do something 
to avoid hitting this problem.
    
    This is the backport of #31733.
    
    Bugfix on jdk8u.
    
    No.
    
    Added tests.
---
 .../main/scala/org/apache/spark/util/Utils.scala   | 28 +++++++++
 .../spark/sql/catalyst/encoders/OuterScopes.scala  |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  2 +-
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 70 ++++++++++++++++++++++
 4 files changed, 100 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index bb645f5..0888cbd 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2861,6 +2861,34 @@ private[spark] object Utils extends Logging {
   }
 
   /**
+   * Returns true if and only if the underlying class is a member class.
+   *
+   * Note: jdk8u throws a "Malformed class name" error if a given class is a 
deeply-nested
+   * inner class (See SPARK-34607 for details). This issue has already been 
fixed in jdk9+, so
+   * we can remove this helper method safely if we drop the support of jdk8u.
+   */
+  def isMemberClass(cls: Class[_]): Boolean = {
+    try {
+      cls.isMemberClass
+    } catch {
+      case _: InternalError =>
+        // We emulate jdk8u `Class.isMemberClass` below:
+        //   public boolean isMemberClass() {
+        //     return getSimpleBinaryName() != null && 
!isLocalOrAnonymousClass();
+        //   }
+        // `getSimpleBinaryName()` returns null if a given class is a 
top-level class,
+        // so we replace it with `cls.getEnclosingClass != null`. The second 
condition checks
+        // if a given class is not a local or an anonymous class, so we 
replace it with
+        // `cls.getEnclosingMethod == null` because `cls.getEnclosingMethod()` 
return a value
+        // only in either case (JVM Spec 4.8.6).
+        //
+        // Note: The newer jdk evaluates `!isLocalOrAnonymousClass()` first,
+        // we reorder the conditions to follow it.
+        cls.getEnclosingMethod == null && cls.getEnclosingClass != null
+    }
+  }
+
+  /**
    * Safer than Class obj's getSimpleName which may throw Malformed class name 
error in scala.
    * This method mimics scalatest's getSimpleNameOfAnObjectsClass.
    */
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
index 665b2cd..2d8f028 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
@@ -48,7 +48,7 @@ object OuterScopes {
    * useful for inner class defined in REPL.
    */
   def getOuterScope(innerCls: Class[_]): () => AnyRef = {
-    assert(innerCls.isMemberClass)
+    assert(Utils.isMemberClass(innerCls))
     val outerClassName = innerCls.getDeclaringClass.getName
     val outer = outerScopes.get(outerClassName)
     if (outer == null) {
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
index 8801c7d..b4e93eb 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
@@ -443,7 +443,7 @@ case class NewInstance(
     // Note that static inner classes (e.g., inner classes within Scala 
objects) don't need
     // outer pointer registration.
     val needOuterPointer =
-      outerPointer.isEmpty && cls.isMemberClass && 
!Modifier.isStatic(cls.getModifiers)
+      outerPointer.isEmpty && Utils.isMemberClass(cls) && 
!Modifier.isStatic(cls.getModifiers)
     childrenResolved && !needOuterPointer
   }
 
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index 2635264..095f6a9 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -217,6 +217,76 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
       "nested Scala class should work")
   }
 
+  object OuterLevelWithVeryVeryVeryLongClassName1 {
+    object OuterLevelWithVeryVeryVeryLongClassName2 {
+      object OuterLevelWithVeryVeryVeryLongClassName3 {
+        object OuterLevelWithVeryVeryVeryLongClassName4 {
+          object OuterLevelWithVeryVeryVeryLongClassName5 {
+            object OuterLevelWithVeryVeryVeryLongClassName6 {
+              object OuterLevelWithVeryVeryVeryLongClassName7 {
+                object OuterLevelWithVeryVeryVeryLongClassName8 {
+                  object OuterLevelWithVeryVeryVeryLongClassName9 {
+                    object OuterLevelWithVeryVeryVeryLongClassName10 {
+                      object OuterLevelWithVeryVeryVeryLongClassName11 {
+                        object OuterLevelWithVeryVeryVeryLongClassName12 {
+                          object OuterLevelWithVeryVeryVeryLongClassName13 {
+                            object OuterLevelWithVeryVeryVeryLongClassName14 {
+                              object OuterLevelWithVeryVeryVeryLongClassName15 
{
+                                object 
OuterLevelWithVeryVeryVeryLongClassName16 {
+                                  object 
OuterLevelWithVeryVeryVeryLongClassName17 {
+                                    object 
OuterLevelWithVeryVeryVeryLongClassName18 {
+                                      object 
OuterLevelWithVeryVeryVeryLongClassName19 {
+                                        object 
OuterLevelWithVeryVeryVeryLongClassName20 {
+                                          case class MalformedNameExample(x: 
Int)
+                                        }}}}}}}}}}}}}}}}}}}}
+
+  {
+    OuterScopes.addOuterScope(
+      OuterLevelWithVeryVeryVeryLongClassName1
+        .OuterLevelWithVeryVeryVeryLongClassName2
+        .OuterLevelWithVeryVeryVeryLongClassName3
+        .OuterLevelWithVeryVeryVeryLongClassName4
+        .OuterLevelWithVeryVeryVeryLongClassName5
+        .OuterLevelWithVeryVeryVeryLongClassName6
+        .OuterLevelWithVeryVeryVeryLongClassName7
+        .OuterLevelWithVeryVeryVeryLongClassName8
+        .OuterLevelWithVeryVeryVeryLongClassName9
+        .OuterLevelWithVeryVeryVeryLongClassName10
+        .OuterLevelWithVeryVeryVeryLongClassName11
+        .OuterLevelWithVeryVeryVeryLongClassName12
+        .OuterLevelWithVeryVeryVeryLongClassName13
+        .OuterLevelWithVeryVeryVeryLongClassName14
+        .OuterLevelWithVeryVeryVeryLongClassName15
+        .OuterLevelWithVeryVeryVeryLongClassName16
+        .OuterLevelWithVeryVeryVeryLongClassName17
+        .OuterLevelWithVeryVeryVeryLongClassName18
+        .OuterLevelWithVeryVeryVeryLongClassName19
+        .OuterLevelWithVeryVeryVeryLongClassName20)
+    encodeDecodeTest(
+      OuterLevelWithVeryVeryVeryLongClassName1
+        .OuterLevelWithVeryVeryVeryLongClassName2
+        .OuterLevelWithVeryVeryVeryLongClassName3
+        .OuterLevelWithVeryVeryVeryLongClassName4
+        .OuterLevelWithVeryVeryVeryLongClassName5
+        .OuterLevelWithVeryVeryVeryLongClassName6
+        .OuterLevelWithVeryVeryVeryLongClassName7
+        .OuterLevelWithVeryVeryVeryLongClassName8
+        .OuterLevelWithVeryVeryVeryLongClassName9
+        .OuterLevelWithVeryVeryVeryLongClassName10
+        .OuterLevelWithVeryVeryVeryLongClassName11
+        .OuterLevelWithVeryVeryVeryLongClassName12
+        .OuterLevelWithVeryVeryVeryLongClassName13
+        .OuterLevelWithVeryVeryVeryLongClassName14
+        .OuterLevelWithVeryVeryVeryLongClassName15
+        .OuterLevelWithVeryVeryVeryLongClassName16
+        .OuterLevelWithVeryVeryVeryLongClassName17
+        .OuterLevelWithVeryVeryVeryLongClassName18
+        .OuterLevelWithVeryVeryVeryLongClassName19
+        .OuterLevelWithVeryVeryVeryLongClassName20
+        .MalformedNameExample(42),
+      "deeply nested Scala class should work")
+  }
+
   productTest(PrimitiveData(1, 1, 1, 1, 1, 1, true))
 
   productTest(


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to