Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21061#discussion_r181473537
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -287,3 +288,80 @@ case class ArrayContains(left: Expression, right: Expression)
override def prettyName: String = "array_contains"
}
+
+/**
+ * Returns an array of the elements in the union of x and y, without duplicates
+ */
+@ExpressionDescription(
+ usage = """
+ _FUNC_(array1, array2) - Returns an array of the elements in the union of array1 and array2,
+ without duplicates.
+ """,
+ examples = """
+ Examples:
+ > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5));
+ array(1, 2, 3, 5)
+ """,
+ since = "2.4.0")
+case class ArrayUnion(left: Expression, right: Expression)
+ extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
--- End diff --
Could you explain why we should implement a codegen version? From a
performance standpoint, codegen may not offer an advantage, since the
`union` method itself takes a long time.
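
To illustrate the concern, here is a minimal sketch, assuming plain Scala collections stand in for Spark's array data. `ArrayUnionSketch` and `unionDistinct` are hypothetical names, not code from this PR. The work sits in the concatenation and de-duplication, which a generated-code version would presumably still have to perform:

```scala
// Hypothetical stand-in for the interpreted evaluation path; not the PR's code.
object ArrayUnionSketch {
  // Concatenate both inputs, then drop duplicates while keeping
  // first-occurrence order.
  def unionDistinct[T](a: Seq[T], b: Seq[T]): Seq[T] = (a ++ b).distinct

  def main(args: Array[String]): Unit = {
    // Mirrors the SQL example in the diff: array(1, 2, 3) and array(1, 3, 5).
    println(unionDistinct(Seq(1, 2, 3), Seq(1, 3, 5))) // List(1, 2, 3, 5)
  }
}
```

If this is roughly where the time goes, wrapping the same call in generated code would mostly remove per-row dispatch overhead rather than the cost of the union itself.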
---