Github user DylanGuedes commented on a diff in the pull request:
https://github.com/apache/spark/pull/21045#discussion_r193759057
--- Diff: python/pyspark/sql/functions.py ---
@@ -2394,6 +2394,23 @@ def array_repeat(col, count):
    return Column(sc._jvm.functions.array_repeat(_to_java_column(col), count))
+@since(2.4)
+def zip(*cols):
+    """
+    Collection function: Merge two columns into one, such that the M-th element of the N-th
+    argument will be the N-th field of the M-th output element.
+
+    :param cols: columns in input
+
+    >>> from pyspark.sql.functions import zip as spark_zip
--- End diff --
I think we should stick with a name related to zip (such as `zip_arrays` or
`zip_lists`) to stay consistent with the naming in other APIs/languages
(`Enum.zip` in Elixir and `zip` in Scala, for instance).
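
For reference, here is a minimal sketch of the behaviour the docstring describes, written against `arrays_zip` (the name this function ultimately shipped under in Spark 2.4); the DataFrame contents and column names are illustrative only:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import arrays_zip

spark = SparkSession.builder.master("local[1]").appName("zip-sketch").getOrCreate()

# Two array columns of equal length.
df = spark.createDataFrame([([1, 2, 3], ["a", "b", "c"])], ["nums", "chars"])

# Element M of the output array is a struct whose N-th field is the M-th
# element of the N-th input column, i.e. it pairs nums[M] with chars[M].
zipped = df.select(arrays_zip("nums", "chars").alias("zipped")).first()["zipped"]
# zipped[0] pairs nums[0] with chars[0], zipped[1] pairs nums[1] with chars[1], ...

spark.stop()
```

Whatever the final name, the semantics stay the same; only the identifier users import changes, which is why aligning it with the `zip` family in other languages matters.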
---