itholic commented on a change in pull request #33882:
URL: https://github.com/apache/spark/pull/33882#discussion_r701532586



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -2814,9 +2821,26 @@ def to_numeric(arg):
     1.0
     """
     if isinstance(arg, Series):
-        return arg._with_new_scol(arg.spark.column.cast("float"))
+        if errors == "coerce":
+            return arg._with_new_scol(arg.spark.column.cast("float"))
+        elif errors == "raise":
+            scol = arg.spark.column
+            scol_casted = scol.cast("float")
+            cond = scol.isNotNull() & scol_casted.isNull()
+            # Keep rows where the original value is non-null but the cast produced null.
+            sdf = arg._internal.spark_frame.select(scol).filter(cond)
+            head_sdf = sdf.head(1)

Review comment:
      I don't think implementing this check with `assert` is supported, since it involves non-boolean objects.
   
   Because the Spark Column returned by `col.isNotNull()` is itself a non-boolean object, `assert` cannot evaluate it and raises instead:
   
   ```python
   >>> assert(scol.isNotNull())
   Traceback (most recent call last):
   ...
   ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.
   ```
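
   The traceback above happens because `pyspark.sql.Column` deliberately raises from its truth-value conversion rather than returning `True`/`False`. A minimal pure-Python sketch of that mechanism (using a hypothetical `FakeColumn` stand-in, not the real PySpark class):

   ```python
   # FakeColumn is a hypothetical stand-in for pyspark.sql.Column,
   # used only to illustrate why `assert` fails on column expressions.
   class FakeColumn:
       def isNotNull(self):
           # Returns another column expression, not a Python bool.
           return FakeColumn()

       def __bool__(self):
           # Mirrors Column's behavior: truth-value conversion is refused.
           raise ValueError(
               "Cannot convert column into bool: please use '&' for 'and', "
               "'|' for 'or', '~' for 'not' when building DataFrame "
               "boolean expressions."
           )

   scol = FakeColumn()
   try:
       # `assert` implicitly calls bool() on the expression, which raises.
       assert scol.isNotNull()
   except ValueError as e:
       print(type(e).__name__)
   ```

   This is why the PR instead materializes the failing rows with `filter(cond)` and checks `head(1)`: the check is evaluated on the driver as real data, not as an unevaluated column expression.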





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


