devin-petersohn commented on code in PR #54044:
URL: https://github.com/apache/spark/pull/54044#discussion_r2824234109
##########
python/pyspark/pandas/frame.py:
##########
@@ -4906,49 +4906,49 @@ def nunique(
         rsd: float = 0.05,
     ) -> "Series":
         """
-        Return number of unique elements in the object.
-
-        Excludes NA values by default.
-
-        Parameters
-        ----------
-        axis : int, default 0 or 'index'
-            Can only be set to 0 now.
-        dropna : bool, default True
-            Don’t include NaN in the count.
-        approx: bool, default False
-            If False, will use the exact algorithm and return the exact number of unique.
-            If True, it uses the HyperLogLog approximate algorithm, which is significantly faster
-            for large amounts of data.
-            Note: This parameter is specific to pandas-on-Spark and is not found in pandas.
-        rsd: float, default 0.05
-            Maximum estimation error allowed in the HyperLogLog algorithm.
-            Note: Just like ``approx`` this parameter is specific to pandas-on-Spark.
-
-        Returns
-        -------
-        The number of unique values per column as a pandas-on-Spark Series.
-
-        Examples
-        --------
-        >>> df = ps.DataFrame({'A': [1, 2, 3], 'B': [np.nan, 3, np.nan]})
-        >>> df.nunique()
-        A    3
-        B    1
-        dtype: int64
-
-        >>> df.nunique(dropna=False)
-        A    3
-        B    2
-        dtype: int64
-
-        On big data, we recommend using the approximate algorithm to speed up this function.
-        The result will be very close to the exact unique count.
-
-        >>> df.nunique(approx=True)
-        A    3
-        B    1
-        dtype: int64
Review Comment:
Yes, done!
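
For anyone skimming the thread, here is a minimal sketch of the behavior the docstring above describes. This is not code from the PR; it only exercises the documented parameters (`dropna`, `approx`, `rsd`) and assumes pyspark is installed with a Spark session available:

```python
# Minimal sketch, assuming pyspark is installed and a Spark session can start.
# Mirrors the docstring's examples: exact counting vs. the HyperLogLog-based
# approximate path enabled by the pandas-on-Spark-specific `approx`/`rsd` args.
import numpy as np
import pyspark.pandas as ps

df = ps.DataFrame({'A': [1, 2, 3], 'B': [np.nan, 3, np.nan]})

print(df.nunique())              # exact; NaN excluded: A -> 3, B -> 1
print(df.nunique(dropna=False))  # exact; NaN counted:  A -> 3, B -> 2

# Approximate path: faster on large data, with estimation error bounded by
# `rsd` (default 0.05). On a tiny frame like this it matches the exact result.
print(df.nunique(approx=True, rsd=0.01))
```

My understanding (an assumption, not shown in this diff) is that the approximate path is backed by Spark's `approx_count_distinct`, which is where `rsd` is forwarded.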
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]