HyukjinKwon commented on code in PR #42596:
URL: https://github.com/apache/spark/pull/42596#discussion_r1300942561


##########
python/pyspark/sql/functions.py:
##########
@@ -3669,38 +3669,83 @@ def approxCountDistinct(col: "ColumnOrName", rsd: Optional[float] = None) -> Col
 
 @try_remote_functions
 def approx_count_distinct(col: "ColumnOrName", rsd: Optional[float] = None) -> Column:
-    """Aggregate function: returns a new :class:`~pyspark.sql.Column` for approximate distinct count
-    of column `col`.
+    """
+    Applies an aggregate function to return an approximate distinct count of the specified column.
 
-    .. versionadded:: 2.1.0
+    This function returns a new :class:`~pyspark.sql.Column` that estimates the number of distinct
+    elements in a column or a group of columns.
 
-    .. versionchanged:: 3.4.0
-        Supports Spark Connect.
+    .. versionadded:: 2.1.0
 
     .. versionchanged:: 3.4.0
         Supports Spark Connect.
 
     Parameters
     ----------
     col : :class:`~pyspark.sql.Column` or str
+        The label of the column to count distinct values in.
     rsd : float, optional
-        maximum relative standard deviation allowed (default = 0.05).
-        For rsd < 0.01, it is more efficient to use :func:`count_distinct`
+        The maximum allowed relative standard deviation (default = 0.05).
+        If rsd < 0.01, it would be more efficient to use :func:`count_distinct`.
 
     Returns
     -------
     :class:`~pyspark.sql.Column`
-        the column of computed results.
+        A new Column object representing the approximate unique count.
+
+    See Also
+    ----------
+    :meth:`pyspark.sql.functions.count_distinct`
 
     Examples
     --------
+    Example 1: Counting distinct values in a single column DataFrame representing integers
+
+    >>> from pyspark.sql.functions import approx_count_distinct

Review Comment:
   The second reason is that `from pyspark.sql.functions import approx_count_distinct` is perfectly fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

