HyukjinKwon opened a new pull request, #49500: URL: https://github.com/apache/spark/pull/49500
### What changes were proposed in this pull request?

This PR proposes to check for optional Python packages with `importlib.util.find_spec` instead of actually loading/importing them.

### Why are the changes needed?

https://github.com/apache/spark/commit/a40919912f5ce7f63fff2907b30e473dd4155227 changed the main code to import optional dependencies.

After that, https://github.com/apache/spark/commit/f223b8da9e23e4e028e145e0d4dd74eeae5d2d52 technically broke the Spark Core tests, but the tests did not run for that change (now `pyspark.testing` is imported, and it imports optional dependencies). Importing `deepspeed` can write to stdout through its logger (https://github.com/microsoft/DeepSpeed/blob/master/accelerator/real_accelerator.py#L182). This broke the test in `pyspark.conf`.

The real test failure was only found later, when a core change triggered the tests at https://github.com/apache/spark/commit/6f3b778e1a12901726c2a35072904f36f46f7888. In that PR, the build passed because it ran before https://github.com/apache/spark/commit/f223b8da9e23e4e028e145e0d4dd74eeae5d2d52 was merged.

### Does this PR introduce _any_ user-facing change?

Technically yes. Importing optional dependencies might have side effects, and this PR avoids them.

### How was this patch tested?

CI in this PR should verify it.

### Was this patch authored or co-authored using generative AI tooling?

No.
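As a rough illustration of the approach described under "What changes were proposed", the sketch below checks whether a package can be located without executing it. The helper name `is_package_available` is hypothetical and is not taken from the PR. The standard-library module `this` stands in for an optional dependency here only because it prints text when imported, similar to the `deepspeed` logger output mentioned above.

```python
import importlib.util
import sys


def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be located, without importing it.

    Hypothetical helper for illustration; not the code from the PR.
    """
    # find_spec asks the import system's finders for the module's spec and
    # returns None when no top-level module of that name is found. For a
    # top-level name it does not execute the module's code.
    return importlib.util.find_spec(name) is not None


# `this` prints text to stdout when it is actually imported.
# Locating it with find_spec prints nothing and does not load it.
available = is_package_available("this")
loaded = "this" in sys.modules
print(f"available={available}, loaded={loaded}")

# A made-up package name (assumed not to be installed anywhere).
print(is_package_available("surely_not_installed_pkg_xyz"))
```

One caveat with this pattern: for a dotted name such as `pkg.sub`, `find_spec` imports the parent package in order to locate the submodule, so the no-side-effect property only holds for top-level names.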
