raulcd opened a new issue, #48825: URL: https://github.com/apache/arrow/issues/48825
### Describe the bug, including details regarding any error messages, version, and platform.

Our nightly job for integration with Spark, [test-conda-python-3.11-spark-master](https://github.com/ursacomputing/crossbow/actions/runs/20869818141/job/59969096553), has been failing over the last few days with:

```
======================================================================
ERROR: setUpClass (pyspark.testing.sqlutils.ReusedSQLTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/spark/python/pyspark/testing/sqlutils.py", line 211, in setUpClass
    super().setUpClass()
  File "/spark/python/pyspark/testing/utils.py", line 311, in setUpClass
    cls.sc = SparkContext(cls.master(), cls.__name__, conf=cls.conf())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/spark/python/pyspark/core/context.py", line 208, in __init__
    self._do_init(
  File "/spark/python/pyspark/core/context.py", line 301, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/spark/python/pyspark/core/context.py", line 448, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/spark/python/lib/py4j-0.10.9.9-src.zip/py4j/java_gateway.py", line 1627, in __call__
    return_value = get_return_value(
                   ^^^^^^^^^^^^^^^^^
  File "/spark/python/lib/py4j-0.10.9.9-src.zip/py4j/protocol.py", line 327, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: org/eclipse/jetty/session/SessionManager
	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:123)
	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:110)
	at org.apache.spark.metrics.sink.MetricsServlet.getHandlers(MetricsServlet.scala:50)
	at org.apache.spark.metrics.MetricsSystem.$anonfun$getServletHandlers$2(MetricsSystem.scala:91)
	at scala.Option.map(Option.scala:242)
	at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:702)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:184)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:108)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.ClassNotFoundException: org.eclipse.jetty.session.SessionManager
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
	... 21 more
----------------------------------------------------------------------
Ran 33 tests in 7.919s

FAILED (errors=1, skipped=33)
```
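For context, the `NoClassDefFoundError` is raised while the JVM-side `JavaSparkContext` is being constructed, so it should be reproducible without any Arrow code involved. The following is a minimal sketch of such a check, assuming the nightly Spark build's `pyspark` is importable as it is in the CI job; the local master and app name are arbitrary choices for illustration, not taken from the job:

```python
# Minimal reproduction sketch: starting a bare SparkContext is enough to hit the
# JVM-side NoClassDefFoundError during UI/metrics servlet setup, before any
# Arrow code runs. Assumes the nightly Spark build's pyspark is on PYTHONPATH.
from py4j.protocol import Py4JJavaError
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local[1]").setAppName("jetty-repro")
try:
    sc = SparkContext(conf=conf)
except Py4JJavaError as exc:
    # Expected on the broken builds:
    # java.lang.NoClassDefFoundError: org/eclipse/jetty/session/SessionManager
    print(exc.java_exception)
else:
    print("SparkContext started fine; the Jetty classpath issue did not reproduce")
    sc.stop()
```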
The job log also contains a number of messages about skipped tests; we should investigate those as well (a quick environment check is sketched after the component list below):

```
test_with_key_left_group_empty (__main__.CogroupedApplyInPandasTests.test_with_key_left_group_empty) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_with_key_right (__main__.CogroupedApplyInPandasTests.test_with_key_right) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_with_key_right_group_empty (__main__.CogroupedApplyInPandasTests.test_with_key_right_group_empty) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_with_local_data (__main__.CogroupedApplyInPandasTests.test_with_local_data) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_with_window_function (__main__.CogroupedApplyInPandasTests.test_with_window_function) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_wrong_args (__main__.CogroupedApplyInPandasTests.test_wrong_args) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
test_wrong_return_type (__main__.CogroupedApplyInPandasTests.test_wrong_return_type) ... skipped '[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0 must be installed; however, it was not found.'
```

### Component(s)

Continuous Integration, Python
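Since every skip cites `[PACKAGE_NOT_INSTALLED] Pandas >= 2.2.0`, a first step for investigating them could be confirming what the job's conda environment actually provides. A purely illustrative check, not part of the CI job itself:

```python
# Quick check for the skipped tests: PySpark reports pandas as "not found",
# which can mean it is missing from the conda environment or that importing it
# fails (for example, an incompatible numpy). Hypothetical standalone check.
try:
    import pandas
except ImportError as exc:
    print(f"importing pandas failed: {exc}")
else:
    print(f"pandas {pandas.__version__} imported fine; "
          "PySpark's pandas-based tests here require >= 2.2.0")
```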
