Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/10391#discussion_r48512443
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala
---
@@ -439,6 +439,13 @@ case class MapObjects(
s"boolean ${loopVar.isNull} = ${genInputData.isNull} || ${loopVar.value} == null;"
}
+ // If lambdaFunction is WrapOption, we will not determine null or not based on the
+ // value of loopVar.isNull, because WrapOption will return None for null.
+ val isWrapOption = lambdaFunction match {
--- End diff --
Let me explain.
When we pass in an array containing None, the None is encoded as null internally.
When we decode the array, WrapOption is called to reconstruct it.
MapObjects assigns null to an output element whenever the corresponding input
element is null, so a null element never reaches WrapOption and is never turned
back into None. To get None back, we need to call lambdaFunction even when the
element is null.
But we can't simply ignore loopVar.isNull and call every kind of
lambdaFunction. I tried that before, and some lambdaFunctions produce
problematic results for a null input value.
In the end, the only option I found is to check whether lambdaFunction is
WrapOption and decide based on that. Do you have any suggestion other than a
hack like this? Thanks.
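To make the behaviour concrete, here is a minimal plain-Scala sketch of the null handling described above. It is only a model, not the Catalyst code: `MapObjectsSketch`, `Lambda`, `ToUpper` and this `mapObjects` are hypothetical names made up for illustration, and the `WrapOption` here is a stand-in object rather than the real expression. The real MapObjects generates Java code instead of mapping over a Seq.

```scala
// Plain-Scala model of the null handling discussed in this comment.
// All names are illustrative assumptions, not Spark APIs.
object MapObjectsSketch {
  // Hypothetical stand-in for a lambdaFunction applied to each element.
  sealed trait Lambda { def apply(in: Any): Any }

  // Stand-in for WrapOption: it handles null itself, turning it into None.
  case object WrapOption extends Lambda {
    def apply(in: Any): Any = Option(in)
  }

  // Example of a function that is not null-safe: it throws on a null input.
  case object ToUpper extends Lambda {
    def apply(in: Any): Any = in.asInstanceOf[String].toUpperCase
  }

  def mapObjects(f: Lambda, input: Seq[Any]): Seq[Any] = {
    // The special case from the diff: only WrapOption is called for nulls.
    val isWrapOption = f match {
      case WrapOption => true
      case _          => false
    }
    input.map { elem =>
      // Default rule: a null input element yields a null output element
      // without calling f. WrapOption is the one exception.
      if (elem == null && !isWrapOption) null else f(elem)
    }
  }
}
```

In this model, mapping `WrapOption` over `Seq("a", null)` gives `Seq(Some("a"), None)`, while mapping `ToUpper` over the same input gives `Seq("A", null)` because the null element is skipped rather than passed to the function.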