Github user EntilZha commented on the pull request:

    https://github.com/apache/spark/pull/7580#issuecomment-123521163
  
    Consolidating all my questions here, then deleting prior comments:
    1. What is wrong with my codegen, which currently returns an error? 
Secondary to that, does it make sense to define the code block as I have, or 
should I use a helper method?
    2. I have defined `checkInputDataTypes`. Based on the requirement to check 
that argument one is of type `Array[T]` and that the value to check is of type 
`T`, it doesn't look like `ExpectsInputTypes` is sufficient. Is this correct?
    3. Does it make sense to define `eval`? The default behavior is to return 
null if any input is null. Since that doesn't match what Hive does, it seems to 
make sense to define a custom `eval` which takes care of the null checks on 
`left` and `right`.
    4. Lastly, I am having trouble testing that, given a null argument, 
`array_contains` returns false. This is due to `checkInputDataTypes` raising 
an analysis error on a type mismatch, in this case because `null` is not 
an `Integer`. What is the correct way to test this, if it makes sense to do so?
    5. Do I need to check whether the types are comparable? If so, is this a 
Scala/Java thing, or does Spark have its own notion of comparable?
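
    To make questions 2 and 3 concrete, here is a minimal sketch in plain 
Scala (no Spark dependency) of the behavior I am after. `SketchType`, 
`ArrType`, and `ArrayContainsSketch` are hypothetical stand-ins I made up for 
illustration. They are not Catalyst classes and not the code in this PR.

```scala
// Hypothetical stand-ins for Catalyst data types; NOT Spark's actual classes.
sealed trait SketchType
case object IntType extends SketchType
case object StrType extends SketchType
case object NullT extends SketchType
final case class ArrType(elementType: SketchType) extends SketchType

object ArrayContainsSketch {
  // Question 2: the first argument must be an array, and the second argument's
  // type must equal the array's element type.
  def checkTypes(left: SketchType, right: SketchType): Either[String, Unit] =
    left match {
      case ArrType(elem) if elem == right => Right(())
      case ArrType(elem) =>
        Left(s"type of value must match array type $elem, not $right")
      case other =>
        Left(s"type of first input must be an array, not $other")
    }

  // Question 3: a null array or a null value yields false instead of null.
  def eval(array: Seq[Any], value: Any): Boolean =
    array != null && value != null && array.contains(value)
}
```

    With this shape, `eval(Seq(1, 2), null)` and `eval(null, 1)` both return 
false, while `checkTypes(NullT, IntType)` is rejected, which is the conflict 
described in question 4.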
    
    Tests:
    ```
      test("array contains function") {
        val df = Seq(
          (Array[Int](1, 2), "x"),
          (Array[Int](), "y"),
          (null, "z")
        ).toDF("a", "b")
        checkAnswer(
          df.select(array_contains("a", null)),
          Seq(Row(false), Row(false), Row(false))
        )
        checkAnswer(
          df.selectExpr("array_contains(null, 1)"),
          Seq(Row(false), Row(false), Row(false))
        )
        checkAnswer(
          df.selectExpr("array_contains(a, null)"),
          Seq(Row(false), Row(false), Row(false))
        )
      }
    ```
    
    This triggers an exception like the following from my 
`checkInputDataTypes`: 
`org.apache.spark.sql.AnalysisException: cannot resolve 
'array_contains(null,1)' due to data type mismatch: type of first input must be 
an array, not null;`
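
    One option I could imagine, purely as an assumption on my part and not 
something the reviewers have confirmed, is to let a null-typed literal pass the 
type check so that the null handling in `eval` gets to run. A plain-Scala 
sketch of that idea follows. `Ty`, `ArrayTy`, `NullTy`, and `LenientCheck` are 
hypothetical names, not Catalyst classes.

```scala
// Hypothetical stand-ins for Catalyst data types; NOT Spark's actual classes.
sealed trait Ty
case object IntTy extends Ty
case object NullTy extends Ty
final case class ArrayTy(element: Ty) extends Ty

object LenientCheck {
  // Assumed relaxation: a null-typed argument is accepted in either position,
  // deferring the "null means false" decision to evaluation time.
  def check(left: Ty, right: Ty): Either[String, Unit] = (left, right) match {
    case (NullTy, _)               => Right(())
    case (ArrayTy(_), NullTy)      => Right(())
    case (ArrayTy(e), r) if e == r => Right(())
    case (ArrayTy(e), r) =>
      Left(s"type of value must match array type $e, not $r")
    case (other, _) =>
      Left(s"type of first input must be an array, not $other")
  }
}
```

    Under this sketch `check(NullTy, IntTy)` succeeds, so a query like 
`array_contains(null, 1)` would reach evaluation instead of failing analysis. 
Whether that is the right trade-off is exactly what I am asking.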

