BryanCutler commented on a change in pull request #29951:
URL: https://github.com/apache/spark/pull/29951#discussion_r500050618
##########
File path: python/pyspark/sql/pandas/serializers.py
##########
@@ -153,13 +153,16 @@ def create_array(s, t):
                 s = s.astype(s.dtypes.categories.dtype)
             try:
                 array = pa.Array.from_pandas(s, mask=mask, type=t, safe=self._safecheck)
-            except pa.ArrowException as e:
-                error_msg = "Exception thrown when converting pandas.Series (%s) to Arrow " + \
-                            "Array (%s). It can be caused by overflows or other unsafe " + \
-                            "conversions warned by Arrow. Arrow safe type check can be " + \
-                            "disabled by using SQL config " + \
-                            "`spark.sql.execution.pandas.convertToArrowArraySafely`."
-                raise RuntimeError(error_msg % (s.dtype, t), e)
+            except ValueError as e:
Review comment:
Errors during safe conversion will be `ArrowInvalid`, which subclasses `ValueError`.
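
A minimal sketch of the point above, not the code from this PR: it checks the class relationship and shows an unsafe cast being caught by a plain `except ValueError`. The helper name `catches_unsafe_cast` is hypothetical, and it uses `Array.cast` rather than `pa.Array.from_pandas` only to avoid a pandas dependency. The import is guarded so the snippet still loads where pyarrow is not installed.

```python
try:
    import pyarrow as pa
except ImportError:  # pyarrow is optional for this sketch
    pa = None


def catches_unsafe_cast():
    """Return the name of the exception type caught by `except ValueError`,
    or None if the cast did not raise. Hypothetical helper for illustration."""
    try:
        # A safe cast of 1.5 to int32 would truncate the value, so Arrow refuses it.
        pa.array([1.5]).cast(pa.int32(), safe=True)
    except ValueError as e:
        return type(e).__name__
    return None


if pa is not None:
    # ArrowInvalid derives from both ValueError and ArrowException.
    print(issubclass(pa.ArrowInvalid, ValueError))
    print(catches_unsafe_cast())
```

Catching `ValueError` instead of `pa.ArrowException` therefore still covers safe-conversion failures.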
##########
File path: python/pyspark/sql/pandas/serializers.py
##########
@@ -153,13 +153,16 @@ def create_array(s, t):
                 s = s.astype(s.dtypes.categories.dtype)
             try:
                 array = pa.Array.from_pandas(s, mask=mask, type=t, safe=self._safecheck)
-            except pa.ArrowException as e:
-                error_msg = "Exception thrown when converting pandas.Series (%s) to Arrow " + \
-                            "Array (%s). It can be caused by overflows or other unsafe " + \
-                            "conversions warned by Arrow. Arrow safe type check can be " + \
-                            "disabled by using SQL config " + \
-                            "`spark.sql.execution.pandas.convertToArrowArraySafely`."
-                raise RuntimeError(error_msg % (s.dtype, t), e)
+            except ValueError as e:
+                if self._safecheck:
+                    error_msg = "Exception thrown when converting pandas.Series (%s) to " + \
+                                "Arrow Array (%s). It can be caused by overflows or other " + \
+                                "unsafe conversions warned by Arrow. Arrow safe type check " + \
+                                "can be disabled by using SQL config " + \
+                                "`spark.sql.execution.pandas.convertToArrowArraySafely`."
+                    raise ValueError(error_msg % (s.dtype, t)) from e
Review comment:
Now that we have dropped Python 2, `raise ... from e` seems more appropriate.
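
For illustration only, here is a stdlib-only sketch of the chaining pattern in this hunk. `convert` and `convert_with_context` are hypothetical stand-ins for the Arrow conversion, and the bare `raise` in the non-safecheck path is an assumption, since the hunk shown does not include that branch.

```python
def convert(value):
    """Hypothetical stand-in for a conversion that fails with ValueError."""
    return int(value)


def convert_with_context(value, safecheck=True):
    """Re-raise conversion failures with a more helpful message,
    keeping the original exception attached as __cause__."""
    try:
        return convert(value)
    except ValueError as e:
        if safecheck:
            msg = "Exception thrown when converting %r. It can be caused by an unsafe conversion."
            # `from e` sets __cause__ explicitly, so the traceback shows both errors.
            raise ValueError(msg % (value,)) from e
        # Assumption: without the safety check, propagate the original error unchanged.
        raise


try:
    convert_with_context("abc")
except ValueError as err:
    print("raised:", err)
    print("caused by:", repr(err.__cause__))
```

Compared with passing `e` as an extra argument to the new exception, `from e` keeps the original error as a real exception object on `__cause__` rather than as an entry in the new exception's `args`.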