GitHub user EntilZha commented on the pull request:
https://github.com/apache/spark/pull/7580#issuecomment-123521163
Consolidating all my questions here, then deleting prior comments:
1. What is wrong with my codegen, which currently returns an error?
Secondary to that, does it make sense to define the code block as I have, or
should I use a helper method?
2. I have defined `checkInputDataTypes`. Since the first argument must be of
type `Array[T]` and the value to check must be of type `T`, it doesn't look
like `ExpectsInputTypes` is sufficient. Is this correct?
3. Does it make sense to define `eval`? The default behavior is to return
null if any input is null. Since that doesn't match what Hive does, it seems
to make sense to define a custom `eval` which takes care of the null checks on
`left` and `right`.
4. Lastly, I am having trouble testing that `array_contains` returns false
when given a null argument. This is because `checkInputDataTypes` raises an
error on a type mismatch, in this case because `null` is not an `Integer`.
What is the correct way to test this, if it makes sense to do so?
5. Do I need to check whether the types are comparable? If so, is this a
Scala/Java thing, or does Spark have its own notion of comparable?
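To make questions 2 and 3 concrete, here is a minimal standalone sketch of the behavior I am aiming for. It is plain Scala, not the code in the PR: `DType`, `ArrayT`, `checkTypes` and `containsOrFalse` are hypothetical stand-ins I made up for illustration, not Spark classes or methods.

```scala
// Standalone model only: hypothetical stand-ins, not Spark's DataType classes.
sealed trait DType
case object IntT extends DType
case object StringT extends DType
case object NullT extends DType
final case class ArrayT(element: DType) extends DType

object ArrayContainsSketch {
  // Question 2: the expected type of the second argument depends on the
  // element type of the first argument.
  def checkTypes(left: DType, right: DType): Either[String, Unit] = left match {
    case ArrayT(element) if element == right => Right(())
    case ArrayT(element) =>
      Left(s"type of second input must be $element, not $right")
    case other =>
      Left(s"type of first input must be an array, not $other")
  }

  // Question 3: return false (rather than null) when either input is null.
  def containsOrFalse[T](array: Seq[T], value: Any): Boolean =
    if (array == null || value == null) false
    else array.contains(value)
}
```

In this model a null literal in the first position has type `NullT`, so `checkTypes` rejects it before `containsOrFalse` could ever return false. That is the same conflict I describe in question 4.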
Tests:
```
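test("array contains function") {
  val df = Seq(
    (Array[Int](1, 2), "x"),
    (Array[Int](), "y"),
    (null, "z")
  ).toDF("a", "b")

  // Null value to look up: expect false for every row.
  checkAnswer(
    df.select(array_contains("a", null)),
    Seq(Row(false), Row(false), Row(false))
  )
  // Null array literal: expect false for every row.
  checkAnswer(
    df.selectExpr("array_contains(null, 1)"),
    Seq(Row(false), Row(false), Row(false))
  )
  // Same check as the first one, through the SQL expression API.
  // df has three rows, so three results are expected.
  checkAnswer(
    df.selectExpr("array_contains(a, null)"),
    Seq(Row(false), Row(false), Row(false))
  )
}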
```
This triggers an exception like the following from my `checkInputDataTypes`:
`org.apache.spark.sql.AnalysisException: cannot resolve
'array_contains(null,1)' due to data type mismatch: type of first input must be
an array, not null;`