[GitHub] [spark] cloud-fan commented on a change in pull request #32527: [SPARK-35384][SQL] Improve performance for InvokeLike.invoke
cloud-fan commented on a change in pull request #32527:
URL: https://github.com/apache/spark/pull/32527#discussion_r631974591

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##

@@ -127,13 +128,18 @@ trait InvokeLike extends Expression with NonSQLExpression {
       arguments: Seq[Expression],
       input: InternalRow,
       dataType: DataType): Any = {
-    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
-    if (needNullCheck && args.exists(_ == null)) {
+    var i = 0
+    val len = arguments.length
+    while (i < len) {
+      evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]
+      i += 1
+    }
+    if (needNullCheck && evaluatedArgs.contains(null)) {
       // return null if one of arguments is null
       null
     } else {
       val ret = try {
-        method.invoke(obj, args: _*)
+        method.invoke(obj, evaluatedArgs: _*)
       } catch {

Review comment:
Makes sense. For UDFs, it's just an extra `method.isDefined` check, which is probably not a big issue.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #32527: [SPARK-35384][SQL] Improve performance for InvokeLike.invoke
cloud-fan commented on a change in pull request #32527:
URL: https://github.com/apache/spark/pull/32527#discussion_r631596017

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##

@@ -127,13 +128,18 @@ trait InvokeLike extends Expression with NonSQLExpression {
       arguments: Seq[Expression],
       input: InternalRow,
       dataType: DataType): Any = {
-    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
-    if (needNullCheck && args.exists(_ == null)) {
+    var i = 0
+    val len = arguments.length
+    while (i < len) {
+      evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]
+      i += 1
+    }
+    if (needNullCheck && evaluatedArgs.contains(null)) {
       // return null if one of arguments is null
       null
     } else {
       val ret = try {
-        method.invoke(obj, args: _*)
+        method.invoke(obj, evaluatedArgs: _*)
       } catch {

Review comment:
You are right. Another idea: `obj` values read from an `InternalRow` are always of the same class, so we can avoid this:

```
@transient lazy val method = {
  val cls = targetObject.dataType match {
    case ObjectType(cls) => cls
    case StringType => classOf[UTF8String]
    case _: DecimalType => classOf[Decimal]
    ...
  }
  findMethod(cls, encodedFunctionName, argClasses)
}
```
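The idea in the comment above (resolve the reflective method once from a statically known class, then reuse it for every row) can be illustrated outside Spark. This is only a minimal sketch under stated assumptions: `CachedInvoker` is a hypothetical name, and it uses plain `java.lang.Class.getMethod` rather than the `findMethod` helper mentioned in the comment.

```scala
import java.lang.reflect.Method

// Hypothetical helper, not part of Spark: the target class is known up front,
// so the reflective lookup happens once instead of once per invocation.
final class CachedInvoker(cls: Class[_], methodName: String, argClasses: Class[_]*) {
  // Resolved on first use and reused for every subsequent call.
  lazy val method: Method = cls.getMethod(methodName, argClasses: _*)

  // Invokes the cached method on the given target with the given arguments.
  def invoke(target: AnyRef, args: AnyRef*): AnyRef =
    method.invoke(target, args: _*)
}
```

For example, `new CachedInvoker(classOf[String], "length").invoke("spark")` looks up `String.length` on the first call only; later calls reuse the same `Method` instance.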
[GitHub] [spark] cloud-fan commented on a change in pull request #32527: [SPARK-35384][SQL] Improve performance for InvokeLike.invoke
cloud-fan commented on a change in pull request #32527:
URL: https://github.com/apache/spark/pull/32527#discussion_r631561074

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##

@@ -127,13 +128,18 @@ trait InvokeLike extends Expression with NonSQLExpression {
       arguments: Seq[Expression],
       input: InternalRow,
       dataType: DataType): Any = {
-    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
-    if (needNullCheck && args.exists(_ == null)) {
+    var i = 0
+    val len = arguments.length
+    while (i < len) {
+      evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]
+      i += 1
+    }
+    if (needNullCheck && evaluatedArgs.contains(null)) {
       // return null if one of arguments is null
       null
     } else {
       val ret = try {
-        method.invoke(obj, args: _*)
+        method.invoke(obj, evaluatedArgs: _*)
       } catch {

Review comment:
We can do a similar thing in `Invoke.eval`.
[GitHub] [spark] cloud-fan commented on a change in pull request #32527: [SPARK-35384][SQL] Improve performance for InvokeLike.invoke
cloud-fan commented on a change in pull request #32527:
URL: https://github.com/apache/spark/pull/32527#discussion_r631560800

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ##

@@ -127,13 +128,18 @@ trait InvokeLike extends Expression with NonSQLExpression {
       arguments: Seq[Expression],
       input: InternalRow,
       dataType: DataType): Any = {
-    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
-    if (needNullCheck && args.exists(_ == null)) {
+    var i = 0
+    val len = arguments.length
+    while (i < len) {
+      evaluatedArgs(i) = arguments(i).eval(input).asInstanceOf[Object]
+      i += 1
+    }
+    if (needNullCheck && evaluatedArgs.contains(null)) {
       // return null if one of arguments is null
       null
     } else {
       val ret = try {
-        method.invoke(obj, args: _*)
+        method.invoke(obj, evaluatedArgs: _*)
       } catch {

Review comment:
Can we also improve the last piece?

```
val boxedClass = ScalaReflection.typeBoxedJavaMapping.get(dataType)
if (boxedClass.isDefined) {
  boxedClass.get.cast(ret)
} else {
  ret
}
```

We can create a function for it, so the map lookup happens once rather than on every call:

```
private lazy val boxing: Any => Any =
  ScalaReflection.typeBoxedJavaMapping
    .get(dataType)
    .map(cls => (v: Any) => cls.cast(v))
    .getOrElse(identity)
```
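To make the suggestion above concrete, here is a minimal standalone sketch of precomputing a boxing function. It assumes a hypothetical `boxedClasses` map keyed by a type name as a stand-in for `ScalaReflection.typeBoxedJavaMapping` (which is keyed by `DataType`); the names `BoxingSketch` and `boxingFor` are illustrative only.

```scala
object BoxingSketch {
  // Hypothetical stand-in for ScalaReflection.typeBoxedJavaMapping.
  val boxedClasses: Map[String, Class[_]] = Map(
    "int" -> classOf[java.lang.Integer],
    "long" -> classOf[java.lang.Long]
  )

  // Performs the map lookup once and returns a function that either casts
  // to the boxed class or passes the value through unchanged.
  def boxingFor(typeName: String): Any => Any =
    boxedClasses.get(typeName) match {
      case Some(cls) => (v: Any) => cls.cast(v.asInstanceOf[AnyRef])
      case None      => (v: Any) => v
    }
}
```

A caller would hold the result in a `lazy val` (for example `lazy val boxing = BoxingSketch.boxingFor("int")`) and apply it to each return value, so the per-call cost is a single function application instead of a map lookup plus an `Option` check.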