Yikun commented on PR #36980: URL: https://github.com/apache/spark/pull/36980#issuecomment-1166283208
Information sync: per the latest build result (https://github.com/Yikun/spark/runs/7051222363?check_suite_focus=true#step:7:127), the cache works. Currently, CI fails for the following reasons:

<details><summary>1. ModuleNotFoundError: No module named '_pickle'</summary>

```
Starting test(pypy3): pyspark.sql.tests.test_arrow (temp output: /tmp/pypy3__pyspark.sql.tests.test_arrow__jx96qdzs.log)
Traceback (most recent call last):
  File "/usr/lib/pypy3.8/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/pypy3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/__w/spark/spark/python/pyspark/__init__.py", line 59, in <module>
    from pyspark.rdd import RDD, RDDBarrier
  File "/__w/spark/spark/python/pyspark/rdd.py", line 54, in <module>
    from pyspark.java_gateway import local_connect_and_auth
  File "/__w/spark/spark/python/pyspark/java_gateway.py", line 32, in <module>
    from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
  File "/__w/spark/spark/python/pyspark/serializers.py", line 68, in <module>
    from pyspark import cloudpickle
  File "/__w/spark/spark/python/pyspark/cloudpickle/__init__.py", line 4, in <module>
    from pyspark.cloudpickle.cloudpickle import *  # noqa
  File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 57, in <module>
    from .compat import pickle
  File "/__w/spark/spark/python/pyspark/cloudpickle/compat.py", line 13, in <module>
    from _pickle import Pickler  # noqa: F401
ModuleNotFoundError: No module named '_pickle'
Had test failures in pyspark.sql.tests.test_arrow with pypy3; see logs.
```

</details>

The latest Dockerfile build upgrades pypy3 to 3.8 (it was 3.7), and cloudpickle seems to hit a bug on it. This may be related: https://github.com/cloudpipe/cloudpickle/commit/8bbea3e140767f51dd935a3c8f21c9a8e8702b7c, but I tried applying it and the test still failed. This needs a deeper look. **If you know the reason for this, please let me know.**

<details><summary>2. fatal: unsafe repository</summary>

```
fatal: unsafe repository ('/__w/spark/spark' is owned by someone else)
To add an exception for this directory, call:

	git config --global --add safe.directory /__w/spark/spark
fatal: unsafe repository ('/__w/spark/spark' is owned by someone else)
To add an exception for this directory, call:

	git config --global --add safe.directory /__w/spark/spark
Error: Process completed with exit code 128.
```

</details>

See https://github.blog/2022-04-12-git-security-vulnerability-announced/ and https://github.com/actions/checkout/issues/760. I applied a quick fix; a separate PR is needed to address it properly:

```yaml
- name: Github Actions permissions workaround
  run: |
    git config --global --add safe.directory ${GITHUB_WORKSPACE}
```

<details><summary>3. lint python: mypy annotation error</summary>

```
starting mypy annotations test...
annotations failed mypy checks:
python/pyspark/pandas/frame.py:9970: error: Need type annotation for "raveled_column_labels"  [var-annotated]
Found 1 error in 1 file (checked 337 source files)
```

</details>

Due to the `numpy` upgrade; we could pin `numpy<=1.22.2` first.

<details><summary>4. R lint error</summary>

```
Loading required namespace: SparkR
Loading required namespace: lintr
Failed with error: ‘there is no package called ‘lintr’’
Installing package into ‘/usr/lib/R/site-library’
(as ‘lib’ is unspecified)
Error in contrib.url(repos, type) :
  trying to use CRAN without setting a mirror
Calls: install.packages -> startsWith -> contrib.url
Execution halted
```

</details>

No idea about this one: https://github.com/Yikun/spark/runs/7052215049?check_suite_focus=true

<details><summary>5. sparkr</summary>

```
Loading required namespace: SparkR
Loading required namespace: lintr
Failed with error: ‘there is no package called ‘lintr’’
Installing package into ‘/usr/lib/R/site-library’
(as ‘lib’ is unspecified)
Error in contrib.url(repos, type) :
  trying to use CRAN without setting a mirror
Calls: install.packages -> startsWith -> contrib.url
Execution halted
```

</details>

No idea about this one either: https://github.com/Yikun/spark/runs/7052215214?check_suite_focus=true#step:9:10200

6. A SparkR Arrow-related test case failed: https://github.com/Yikun/spark/runs/7043826939?check_suite_focus=true#step:9:10904. No idea here either.

<details><summary>7. NotImplementedError: pandas-on-Spark objects currently do not support <ufunc 'divide'></summary>

```
======================================================================
ERROR [2.102s]: test_arithmetic_op_exceptions (pyspark.pandas.tests.test_series_datetime.SeriesDateTimeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_series_datetime.py", line 99, in test_arithmetic_op_exceptions
    self.assertRaisesRegex(TypeError, expected_err_msg, lambda: other / psser)
  File "/usr/lib/python3.9/unittest/case.py", line 1276, in assertRaisesRegex
    return context.handle('assertRaisesRegex', args, kwargs)
  File "/usr/lib/python3.9/unittest/case.py", line 201, in handle
    callable_obj(*args, **kwargs)
  File "/__w/spark/spark/python/pyspark/pandas/tests/test_series_datetime.py", line 99, in <lambda>
    self.assertRaisesRegex(TypeError, expected_err_msg, lambda: other / psser)
  File "/__w/spark/spark/python/pyspark/pandas/base.py", line 465, in __array_ufunc__
    raise NotImplementedError(
NotImplementedError: pandas-on-Spark objects currently do not support <ufunc 'divide'>.
----------------------------------------------------------------------
```

</details>

Due to the `numpy` upgrade; we could pin `numpy<=1.22.2` first.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
