Github user gvramana commented on a diff in the pull request:
https://github.com/apache/spark/pull/2802#discussion_r21071969
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
    @@ -172,6 +177,8 @@ private[hive] case class HiveGenericUdf(functionClassName: String, children: Seq
       override def eval(input: Row): Any = {
         returnInspector // Make sure initialized.
    +    if(foldable) return constantReturnValue
--- End diff --

The constant check and early return are required for two reasons:
1. Some UDFs return a constant inspector when initializeAndFoldConstants is
called with constant inspectors, because they execute the function once at
that point.
If the same UDFs are then called through "function.evaluate", they do not
return the same constant value type. The datatype expected by
constantReturnInspector then differs from the datatype returned by
function.evaluate (e.g. org.apache.hadoop.io.Text vs. String), and unwrap
fails.
So if the return inspector is constant, we don't need to call
"function.evaluate": the expression has already been evaluated and the return
value is already present in the constant inspector.
I uncovered this when I made CreateArrayExpression foldable and a test
failed, so I made this change as part of the current defect fix.
2. Even if the Literal is identified during optimization, the expression is
evaluated twice: once during return inspector creation and again in the eval
function.
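
The two reasons above can be illustrated with a minimal, self-contained sketch. It does not use the real Hive or Spark classes: `Text`, `ConstantInspector`, `ConcatUdf`, `UdfExpr` and the `shortCircuit` flag are hypothetical stand-ins, chosen only to show the type mismatch in unwrap and the duplicate evaluation.

```scala
object FoldableUdfSketch {
  // Hypothetical stand-in for org.apache.hadoop.io.Text (not the Hadoop class).
  final case class Text(value: String)

  // Hypothetical stand-in for a constant return inspector: it holds the
  // already-folded value and only knows how to unwrap Text.
  final case class ConstantInspector(constant: Text) {
    def unwrap(data: Any): String = data match {
      case Text(v) => v
      case other =>
        throw new ClassCastException(s"expected Text, got ${other.getClass.getName}")
    }
  }

  // Hypothetical UDF whose evaluate returns a plain String, not Text.
  final class ConcatUdf {
    var evaluateCalls: Int = 0
    def evaluate(args: Seq[String]): Any = {
      evaluateCalls += 1
      args.mkString
    }
  }

  final class UdfExpr(udf: ConcatUdf, args: Seq[String], shortCircuit: Boolean) {
    // Creating the inspector evaluates the UDF once and stores the result as Text.
    private lazy val returnInspector: ConstantInspector =
      ConstantInspector(Text(udf.evaluate(args).toString))

    private lazy val constantReturnValue: String =
      returnInspector.unwrap(returnInspector.constant)

    def eval(): String = {
      val inspector = returnInspector // make sure it is initialized
      if (shortCircuit) constantReturnValue   // reuse the folded value
      else inspector.unwrap(udf.evaluate(args)) // second evaluation, String vs Text
    }
  }
}
```

In this sketch, `eval()` with `shortCircuit = true` calls `evaluate` only once (during inspector creation) and returns the folded value. With `shortCircuit = false`, `evaluate` runs a second time and unwrap throws, because the inspector expects `Text` but receives a `String`.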