aimtsou commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444549904
Hi @srowen,
Thank you for your very prompt reply.
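I don't think that is right about the error. `np.bool` was deprecated in NumPy 1.20.0, and since the alias was removed in NumPy 1.24 accessing it raises an AttributeError rather than a warning: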
```
    if attr in __former_attrs__:
>       raise AttributeError(__former_attrs__[attr])
E   AttributeError: module 'numpy' has no attribute 'bool'.
E   `np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
E   The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E       https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

/usr/local/lib/python3.9/site-packages/numpy/__init__.py:305: AttributeError
```
That is the tail of the traceback. It is raised when my tests call `toPandas()`:
```
/usr/local/lib/python3.9/site-packages/<my-pkg>/unit/test_case_runner.py:26: in run_test
    self.assert_df_are_equal(expected_df, actual)
/usr/local/lib/python3.9/site-packages/<my-pkg>/unit/test_case_runner.py:58: in assert_df_are_equal
    self.handler.compare_df(result, expected, config=self.compare_config)
/usr/local/lib/python3.9/site-packages/<my-pkg>/spark_test_handler.py:38: in compare_df
    actual_pd = actual.toPandas().sort_values(by=sort_columns, ignore_index=True)
/usr/local/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py:216: in toPandas
    pandas_type = PandasConversionMixin._to_corrected_pandas_type(field.dataType)
/usr/local/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py:298: in _to_corrected_pandas_type
    return np.bool  # type: ignore[attr-defined]
```
Note that the failing `np.bool` reference is not in my code or in the system NumPy itself. It is inside PySpark, in `pyspark/sql/pandas/conversion.py`.
I agree with your comments about Databricks, but as shown above this fails on Spark 3.3.1 regardless of whether you want to stay compatible with Databricks.
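
For reference, here is a minimal sketch of the replacement that the NumPy message itself suggests (`bool` or `np.bool_` instead of `np.bool`). The helper name `bool_dtype` is hypothetical and only for illustration; it is not part of PySpark or NumPy, and this is not necessarily how the fix in this PR is written.

```python
# Hypothetical helper for illustration only; not a PySpark or NumPy API.
try:
    import numpy as np
except ImportError:  # lets the sketch run even where NumPy is absent
    np = None


def bool_dtype():
    """Return a boolean type without touching the `np.bool` alias."""
    # `np.bool_` is the NumPy scalar type named in the error message;
    # the builtin `bool` is used as a fallback when NumPy is missing.
    return bool if np is None else np.bool_


# Usage: the returned type converts values like the builtin `bool` does.
flag_type = bool_dtype()
print(flag_type(1), flag_type(0))
```

Either spelling avoids the attribute lookup that fails in the traceback above, so the same code runs whether or not the installed NumPy still has the alias.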