Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/20757#discussion_r173010942

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -1408,11 +1409,37 @@ case class ValidateExternalType(child: Expression, expected: DataType)

   override def dataType: DataType = RowEncoder.externalDataTypeForInput(expected)

-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
-
   private val errMsg = s" is not a valid external type for schema of ${expected.simpleString}"

+  private lazy val checkType = expected match {
+    case _: DecimalType =>
+      (value: Any) => {
+        Seq(classOf[java.math.BigDecimal], classOf[scala.math.BigDecimal], classOf[Decimal])
+          .exists { x => value.getClass.isAssignableFrom(x) }
+      }
+    case _: ArrayType =>
+      (value: Any) => {
+        value.getClass.isAssignableFrom(classOf[Seq[_]]) || value.getClass.isArray
--- End diff --

For those curious:

In HotSpot, the straightforward interpreter/C1 implementation of the `xxx.getClass().isArray()` path is actually something like:
```
// for getClass()
klazz = xxx._klass;          // read the hidden klass pointer field from the object header
clazz = klazz._java_mirror;  // read the java.lang.Class reference from the Klass

// for clazz.isArray(): go through JNI and call the native JVM_IsArrayClass() inside HotSpot
klazz1 = clazz->_klass;
result = klazz1->oop_is_array();
```
So a JNI native method call is involved, and that's not really fast. But C2 will optimize this into something similar to:
```
klazz = xxx._klass;
result = inlined klazz->oop_is_array();
```
So that's pretty fast. There is no need to load the `java.lang.Class` (aka "Java mirror") reference anymore.

In the `xxx.isInstanceOf[Seq[_]]` case, the interpreter version would again go through a JNI native method call, whereas the C1/C2 versions inline fast-path logic that does a quick comparison against a per-type cache. This fast-path check has overhead similar to that of the C2-compiled `isArray()` path.
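
For reference, the two source-level checks discussed above can be sketched in Scala roughly as follows. This is only an illustration: `ExternalTypeCheck` and `isSeqOrArray` are hypothetical names and not part of the patch, and the sketch shows the shape of the checks rather than anything about how HotSpot compiles them.

```scala
object ExternalTypeCheck {
  // Hypothetical helper (not from the Spark patch): accepts a Scala Seq or a JVM array.
  // Assumes `value` is non-null; `getClass` on null would throw a NullPointerException.
  def isSeqOrArray(value: Any): Boolean =
    value.isInstanceOf[Seq[_]] ||   // type check against Seq
      value.getClass.isArray        // Class.isArray on the runtime class

  def main(args: Array[String]): Unit = {
    println(isSeqOrArray(Seq(1, 2, 3)))        // true: a List is a Seq
    println(isSeqOrArray(Array(1, 2, 3)))      // true: int[] is an array
    println(isSeqOrArray("not a collection"))  // false: String is neither
  }
}
```

Per the description above, both branches should be cheap once the method has been JIT-compiled, so the order of the two checks is mostly a matter of readability.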