zhengruifeng commented on code in PR #43145:
URL: https://github.com/apache/spark/pull/43145#discussion_r1338529056
##########
python/pyspark/sql/functions.py:
##########
@@ -12052,24 +12132,68 @@ def array_join(
Parameters
----------
col : :class:`~pyspark.sql.Column` or str
- target column to work on.
+ The input column containing the arrays to be joined.
delimiter : str
- delimiter used to concatenate elements
+ The string to be used as the delimiter when joining the array elements.
null_replacement : str, optional
- if set then null values will be replaced by this value
+ The string to replace null values within the array. If not set, null values are ignored.
Returns
-------
:class:`~pyspark.sql.Column`
- a column of string type. Concatenated values.
+ A new column of string type, where each value is the result of joining the corresponding array from the input column.
Examples
--------
- >>> df = spark.createDataFrame([(["a", "b", "c"],), (["a", None],)], ['data'])
- >>> df.select(array_join(df.data, ",").alias("joined")).collect()
- [Row(joined='a,b,c'), Row(joined='a')]
- >>> df.select(array_join(df.data, ",", "NULL").alias("joined")).collect()
- [Row(joined='a,b,c'), Row(joined='a,NULL')]
+ Example 1: Basic usage of array_join function.
+
+ >>> from pyspark.sql import functions as sf
+ >>> df = spark.createDataFrame([(["a", "b", "c"],), (["a", "b"],)], ['data'])
+ >>> df.select(sf.array_join(df.data, ",").alias("joined")).show()
Review Comment:
let's remove all the alias in this PR, if possible
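For context, the semantics the docstring describes can be sketched in plain Python without a SparkSession. `array_join_py` below is a hypothetical helper used only for illustration, not part of PySpark; it mirrors the documented behavior: nulls are dropped unless `null_replacement` is given.

```python
from typing import List, Optional

def array_join_py(arr: List[Optional[str]], delimiter: str,
                  null_replacement: Optional[str] = None) -> str:
    # Mirror array_join: if no replacement is set, null (None) values
    # are ignored; otherwise they are substituted before joining.
    if null_replacement is None:
        items = [x for x in arr if x is not None]
    else:
        items = [x if x is not None else null_replacement for x in arr]
    return delimiter.join(items)

print(array_join_py(["a", "b", "c"], ","))      # a,b,c
print(array_join_py(["a", None], ","))          # a
print(array_join_py(["a", None], ",", "NULL"))  # a,NULL
```

The three calls reproduce the expected doctest results shown in the diff above.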
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]