Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r183600236
--- Diff: python/pyspark/sql/functions.py ---
@@ -2186,6 +2186,29 @@ def map_values(col):
return Column(sc._jvm.functions.map_values(_to_java_column(col)))
+@since(2.4)
+def map_concat(*cols):
+    """Returns the union of all the given maps. If a key is found in multiple given maps,
+    that key's value in the resulting map comes from the last one of those maps.
+
+    :param cols: list of column names (string) or list of :class:`Column` expressions
+
+    >>> from pyspark.sql.functions import map_concat
+    >>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, 'c', 1, 'd') as map2")
+    >>> df.select(map_concat("map1", "map2").alias("map3")).show(truncate=False)
+    +------------------------+
+    |map3                    |
+    +------------------------+
+    |[1 -> d, 2 -> b, 3 -> c]|
+    +------------------------+
+    """
+    sc = SparkContext._active_spark_context
+    if len(cols) == 1 and isinstance(cols[0], (list, set)):
+        cols = cols[0]
--- End diff ---
> what's this for?
Excellent question. I don't know for certain, except that callers apparently sometimes pass a single list of columns as the first argument. I used other functions in this module as a template.
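For what it's worth, here is a minimal standalone sketch of what that unwrapping idiom buys you (plain Python, no Spark; `map_concat` here is just a stub returning its argument list, not the real function):

```python
# Sketch of the varargs-or-list idiom copied from other helpers in
# pyspark.sql.functions; the body is a stub for illustration only.
def map_concat(*cols):
    # If the caller passed a single list/set of columns rather than
    # varargs, unwrap it so both call styles see the same sequence.
    if len(cols) == 1 and isinstance(cols[0], (list, set)):
        cols = cols[0]
    return list(cols)

# Both call styles yield the same argument sequence:
print(map_concat("map1", "map2"))    # ['map1', 'map2']
print(map_concat(["map1", "map2"]))  # ['map1', 'map2']
```

So the check just lets `map_concat("map1", "map2")` and `map_concat(["map1", "map2"])` behave identically, matching the convention of the other column functions used as a template.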
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]