[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876304#comment-16876304 ]

Alex Kushnir commented on SPARK-28186:
--------------------------------------

Because the array ["a","b",null,"c"] clearly does not contain "d", I would expect array_contains to return false, not null. Why are you saying that this is correct behavior?

> array_contains returns null instead of false when one of the items in the array is null
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-28186
>                 URL: https://issues.apache.org/jira/browse/SPARK-28186
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Alex Kushnir
>            Priority: Major
>
> If an array contains a null item, array_contains returns true when the
> item is found, but returns null instead of false when the item is not found.
>
> Seq(
>   (1, Seq("a", "b", "c")),
>   (2, Seq("a", "b", null, "c"))
> ).toDF("id", "vals").createOrReplaceTempView("tbl")
> spark.sql("select id, vals, array_contains(vals, 'a') as has_a, array_contains(vals, 'd') as has_d from tbl").show
>
> +---+----------+-----+-----+
> | id|      vals|has_a|has_d|
> +---+----------+-----+-----+
> |  1| [a, b, c]| true|false|
> |  2|[a, b,, c]| true| null|
> +---+----------+-----+-----+
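
For illustration only, the behavior reported in this issue can be modelled in plain Scala without a Spark session. This is a rough sketch, not Spark's implementation: `None` stands in for SQL NULL, and the names `ArrayContainsSketch`, `containsNullAware` and `containsOrFalse` are hypothetical helpers introduced here. It assumes the null result comes from treating a comparison against a null element as "unknown".

```scala
// Plain-Scala model of the reported behavior; NOT Spark's implementation.
// None stands in for SQL NULL, both for array elements and for the result.
object ArrayContainsSketch {
  // Three-valued lookup: Some(true) if the item is found, None ("unknown")
  // if it is not found but a null element is present, Some(false) otherwise.
  def containsNullAware(vals: Seq[Option[String]], item: String): Option[Boolean] =
    if (vals.contains(Some(item))) Some(true)
    else if (vals.contains(None)) None
    else Some(false)

  // Hypothetical helper: collapse "unknown" to false, which is the result
  // the reporter expected for the second row.
  def containsOrFalse(vals: Seq[Option[String]], item: String): Boolean =
    containsNullAware(vals, item).getOrElse(false)

  def main(args: Array[String]): Unit = {
    val row1 = Seq(Some("a"), Some("b"), Some("c"))
    val row2 = Seq(Some("a"), Some("b"), None, Some("c"))
    println(containsNullAware(row1, "d")) // Some(false)
    println(containsNullAware(row2, "d")) // None
    println(containsOrFalse(row2, "d"))   // false
  }
}
```

In Spark SQL itself, a comparable collapse to false could presumably be written as coalesce(array_contains(vals, 'd'), false); whether that is appropriate depends on how nulls should be interpreted in the data.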