AngersZhuuuu commented on a change in pull request #30243:
URL: https://github.com/apache/spark/pull/30243#discussion_r532311172
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3957,3 +3957,198 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
override def prettyName: String = "array_except"
}
+
+/**
+ * Checks if the array (left) has the array (right)
+ */
+@ExpressionDescription(
+ usage = "_FUNC_(array1, array2) - Returns true if the array1 contains the array2.",
Review comment:
> This `array_contains_array` is not the same as others, for example Presto's [prestosql/presto#5593](https://github.com/prestosql/presto/pull/5593).
>
> Looking at the implementation here, this `array_contains_array` treats arrays as sets and checks whether the intersection of the two sets is the same as the first set.
>
> I'm not sure there is a good definition of "array contains array". For example, how do we treat element order? Duplicates?
>
> It is not good if we need to fix these issues in the future for this function and add legacy configs for the behavior change. It is better to define the function clearly when we add an expression.
Presto originally did the same thing as I did in this PR, but later changed to its current behavior.
So I hope we can discuss what the final behavior in Spark should be, and I will
change the code accordingly.
also cc @maropu @kiszk @HyukjinKwon
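
To make the open questions about element order and duplicates concrete, here is a minimal sketch in plain Scala of three candidate definitions of "array contains array". The object and method names are hypothetical and are not taken from this PR, from Spark, or from Presto; null handling is deliberately left out.

```scala
// Hypothetical helpers for discussion only; not part of Spark or of this PR.
object ArrayContainment {

  // Set semantics: order and duplicates are ignored.
  // True when every distinct element of `right` appears somewhere in `left`.
  def containsAsSet[T](left: Seq[T], right: Seq[T]): Boolean =
    right.toSet.subsetOf(left.toSet)

  // Multiset semantics: order is ignored, duplicates count.
  // True when each element occurs in `left` at least as often as in `right`.
  def containsAsMultiset[T](left: Seq[T], right: Seq[T]): Boolean = {
    val leftCounts = left.groupBy(identity).map { case (k, v) => k -> v.size }
    right.groupBy(identity).forall { case (k, v) =>
      leftCounts.getOrElse(k, 0) >= v.size
    }
  }

  // Slice semantics: order and duplicates both matter.
  // True when `right` appears in `left` as a contiguous run.
  def containsAsSlice[T](left: Seq[T], right: Seq[T]): Boolean =
    left.containsSlice(right)
}
```

With `left = Seq(1, 2, 3)` and `right = Seq(3, 1, 1)`, `containsAsSet` returns true while `containsAsMultiset` and `containsAsSlice` both return false, so the choice of definition changes user-visible results.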