This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 9615c0e  [SPARK-34596][SQL] Use Utils.getSimpleName to avoid hitting 
Malformed class name in NewInstance.doGenCode
9615c0e is described below

commit 9615c0ec1ed3f787756a453a9744f22077ce9251
Author: Kris Mok <kris....@databricks.com>
AuthorDate: Wed Mar 3 12:22:51 2021 +0900

    [SPARK-34596][SQL] Use Utils.getSimpleName to avoid hitting Malformed class 
name in NewInstance.doGenCode
    
    ### What changes were proposed in this pull request?
    
    Use `Utils.getSimpleName` to avoid hitting `Malformed class name` error in 
`NewInstance.doGenCode`.
    
    ### Why are the changes needed?
    
    On older JDK versions (e.g. JDK8u), nested Scala classes may trigger 
`java.lang.Class.getSimpleName` to throw an `java.lang.InternalError: Malformed 
class name` error.
    In this particular case, creating an `ExpressionEncoder` on such a nested 
Scala class would create a `NewInstance` expression under the hood, which will 
trigger the problem during codegen.
    
    Similar to https://github.com/apache/spark/pull/29050, we should use  
Spark's `Utils.getSimpleName` utility function in place of 
`Class.getSimpleName` to avoid hitting the issue.
    
    There are two other occurrences of `java.lang.Class.getSimpleName` in the 
same file, but they're safe because they're only guaranteed to be only used on 
Java classes, which don't have this problem, e.g.:
    ```scala
        // Make a copy of the data if it's unsafe-backed
        def makeCopyIfInstanceOf(clazz: Class[_ <: Any], value: String) =
          s"$value instanceof ${clazz.getSimpleName}? ${value}.copy() : $value"
        val genFunctionValue: String = lambdaFunction.dataType match {
          case StructType(_) => makeCopyIfInstanceOf(classOf[UnsafeRow], 
genFunction.value)
          case ArrayType(_, _) => 
makeCopyIfInstanceOf(classOf[UnsafeArrayData], genFunction.value)
          case MapType(_, _, _) => makeCopyIfInstanceOf(classOf[UnsafeMapData], 
genFunction.value)
          case _ => genFunction.value
        }
    ```
    The Unsafe-* family of types are all Java types, so they're okay.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Fixes a bug that throws an error when using `ExpressionEncoder` on some 
nested Scala types, otherwise no changes.
    
    ### How was this patch tested?
    
    Added a test case to 
`org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite`. It'll fail on 
JDK8u before the fix, and pass after the fix.
    
    Closes #31709 from rednaxelafx/spark-34596-master.
    
    Authored-by: Kris Mok <kris....@databricks.com>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
    (cherry picked from commit ecf4811764f1ef91954c865a864e0bf6691f99a6)
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 .../spark/sql/catalyst/expressions/objects/objects.scala     |  2 +-
 .../spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
index f391b31..8801c7d 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
@@ -489,7 +489,7 @@ case class NewInstance(
       // that might be defined on the companion object.
       case 0 => s"$className$$.MODULE$$.apply($argString)"
       case _ => outer.map { gen =>
-        s"${gen.value}.new ${cls.getSimpleName}($argString)"
+        s"${gen.value}.new ${Utils.getSimpleName(cls)}($argString)"
       }.getOrElse {
         s"new $className($argString)"
       }
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index f2598a9..2635264 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -205,6 +205,18 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
 
   encodeDecodeTest(Array(Option(InnerClass(1))), "array of optional inner 
class")
 
+  // holder class to trigger Class.getSimpleName issue
+  object MalformedClassObject extends Serializable {
+    case class MalformedNameExample(x: Int)
+  }
+
+  {
+    OuterScopes.addOuterScope(MalformedClassObject)
+    encodeDecodeTest(
+      MalformedClassObject.MalformedNameExample(42),
+      "nested Scala class should work")
+  }
+
   productTest(PrimitiveData(1, 1, 1, 1, 1, 1, true))
 
   productTest(


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to