Re: [PR] [SPARK-46165][PS] Add support for pandas.DataFrame.all axis=1 [spark]

via GitHub Mon, 12 Jan 2026 13:12:59 -0800


holdenk commented on code in PR #53507:
URL: https://github.com/apache/spark/pull/53507#discussion_r2683927712



##########
python/pyspark/pandas/frame.py:
##########
@@ -11118,28 +11120,58 @@ def all(
         dtype: bool
         """
         axis = validate_axis(axis)
-        if axis != 0:
-            raise NotImplementedError('axis should be either 0 or "index" currently.')
-
         column_labels = self._internal.column_labels
         if bool_only:
             column_labels = self._bool_column_labels(column_labels)
         if len(column_labels) == 0:
             return ps.Series([], dtype=bool)
+        if axis == 0:
+            applied: List[PySparkColumn] = []
+            for label in column_labels:
+                scol = self._internal.spark_column_for(label)
 
-        applied: List[PySparkColumn] = []
-        for label in column_labels:
-            scol = self._internal.spark_column_for(label)
+                if isinstance(self._internal.spark_type_for(label), NumericType) or skipna:
+                    # np.nan takes no effect to the result; None takes no effect if `skipna`
+                    all_col = F.min(F.coalesce(scol.cast("boolean"), F.lit(True)))
+                else:
+                    # Take None as False when not `skipna`
+                    all_col = F.min(
+                        F.when(scol.isNull(), F.lit(False)).otherwise(scol.cast("boolean"))
+                    )
+                applied.append(F.when(all_col.isNull(), True).otherwise(all_col))
 
-            if isinstance(self._internal.spark_type_for(label), NumericType) or skipna:
-                # np.nan takes no effect to the result; None takes no effect if `skipna`
-                all_col = F.min(F.coalesce(scol.cast("boolean"), F.lit(True)))
-            else:
-                # Take None as False when not `skipna`
-                all_col = F.min(F.when(scol.isNull(), F.lit(False)).otherwise(scol.cast("boolean")))
-            applied.append(F.when(all_col.isNull(), True).otherwise(all_col))
+            return self._result_aggregated(column_labels, applied)
+        elif axis == 1:
+            from pyspark.pandas.series import first_series
 
-        return self._result_aggregated(column_labels, applied)
+            sdf = self._internal.spark_frame.select(
+                *self._internal_frame.index_spark_columns,
+                F.least(
+                    *[
+                        F.coalesce(
+                            self._internal.spark_column_for(label).cast("boolean"),
+                            # pandas treats all NA values as True in `all()`
+                            F.lit(True),

Review Comment:
   Shouldn't this depend on the value of `skipna`? The test passes without 
it, though, so I'm probably missing something.
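
For context on the question above, here is a minimal plain-Python sketch of what the row-wise expression in the diff appears to compute. It is a model only, not PySpark: the helpers `coalesce_true` and `row_all` are hypothetical names standing in for `F.coalesce(..., F.lit(True))` and `F.least(...)`, and nulls are represented as `None`. Under that assumption, a null cell always becomes True regardless of `skipna`. That would match the pandas documentation, which says that with `skipna=False` NA values are treated as True because they are not equal to zero, and it may be why the test passes.

```python
from typing import Optional, Sequence


def coalesce_true(value: Optional[bool]) -> bool:
    """Hypothetical model of F.coalesce(col.cast("boolean"), F.lit(True)).

    A null (None) cell is replaced by True; anything else keeps its truth value.
    """
    return True if value is None else bool(value)


def row_all(row: Sequence[Optional[bool]]) -> bool:
    """Hypothetical model of F.least(...) applied to boolean cells.

    False sorts before True, so the minimum over a row is its logical AND.
    """
    return min(coalesce_true(v) for v in row)


# Each tuple is one row of a two-column frame; None stands for a null cell.
rows = [
    (True, True),
    (True, None),
    (False, None),
    (None, None),
]
results = [row_all(r) for r in rows]
print(results)  # [True, True, False, True]
```

Note that `skipna` does not appear anywhere in this model, which mirrors the `axis == 1` branch as quoted. Whether that is the intended behavior for non-numeric columns containing `None` is the open question in this thread.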



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


