Github user rednaxelafx commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20757#discussion_r173010942
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -1408,11 +1409,37 @@ case class ValidateExternalType(child: Expression, expected: DataType)
     
       override def dataType: DataType = RowEncoder.externalDataTypeForInput(expected)
     
    -  override def eval(input: InternalRow): Any =
    -    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
    -
       private val errMsg = s" is not a valid external type for schema of ${expected.simpleString}"
     
    +  private lazy val checkType = expected match {
    +    case _: DecimalType =>
    +      (value: Any) => {
    +        Seq(classOf[java.math.BigDecimal], classOf[scala.math.BigDecimal], classOf[Decimal])
    +          .exists { x => value.getClass.isAssignableFrom(x) }
    +      }
    +    case _: ArrayType =>
    +      (value: Any) => {
    +        value.getClass.isAssignableFrom(classOf[Seq[_]]) || value.getClass.isArray
    --- End diff --
    
    For those curious:
    
    In HotSpot, the straightforward interpreter/C1 implementation of the `xxx.getClass().isArray()` path is actually something like:
    ```
    // for getClass()
    klazz = xxx._klass; // read the hidden klass pointer field from the object header
    clazz = klazz._java_mirror; // read the java.lang.Class reference from the Klass
    // for clazz.isArray(): go through JNI and call the native JVM_IsArrayClass() inside HotSpot
    klazz1 = clazz->_klass;
    result = klazz1->oop_is_array();
    ```
    So a JNI native method call is involved and that's not really fast. But C2 
will optimize this into something similar to:
    ```
    klazz = xxx._klass;
    result = inlined klazz->oop_is_array();
    ```
    So that's pretty fast. No need to load the `java.lang.Class` (aka "Java Mirror") reference anymore.
    
    In the `xxx.isInstanceOf[Seq[_]]` case, the interpreter version would again go through a JNI native method call, whereas the C1/C2 versions inline the fast-path logic and do a quick comparison against a per-type cache. This fast-path check costs about the same as the C2-compiled `isArray()` check.
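
    For reference, here is a minimal sketch of the two checks discussed above written as one predicate. The object `ArrayTypeCheck` and the helper `isSeqOrArray` are hypothetical names for illustration, not code from this PR, and the null guard is my own assumption about how a caller might want it to behave:

    ```scala
    object ArrayTypeCheck {
      // Hypothetical helper (not part of the PR): true if `value` is a Scala Seq
      // or a JVM array. The null guard is an assumption added for this sketch,
      // since calling getClass on null would throw a NullPointerException.
      def isSeqOrArray(value: Any): Boolean =
        value != null && (value.isInstanceOf[Seq[_]] || value.getClass.isArray)

      def main(args: Array[String]): Unit = {
        println(isSeqOrArray(List(1, 2, 3)))  // true: List is a Seq
        println(isSeqOrArray(Array(1, 2, 3))) // true: backed by a JVM int[]
        println(isSeqOrArray("abc"))          // false: String is neither
      }
    }
    ```

    Both branches are the kind of check the JIT handles with the fast paths described above; the sketch only shows the shape of the test and makes no claim about measured performance.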

