nchammas opened a new pull request #27376: [SPARK-30665][DOCS][BUILD][PYTHON] Eliminate pypandoc dependency
URL: https://github.com/apache/spark/pull/27376

### What changes were proposed in this pull request?

This PR removes any dependencies on pypandoc. It also makes related tweaks to the docs README to clarify the dependency on pandoc (not pypandoc).

### Why are the changes needed?

We are using pypandoc to convert the Spark README from Markdown to ReST for PyPI. PyPI now natively supports Markdown, so we don't need pypandoc anymore. The dependency on pypandoc also sometimes causes issues when installing Python packages that depend on PySpark, as described in #18981.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually:

```sh
python -m venv venv
source venv/bin/activate
pip install -U pip

cd python/
python setup.py sdist
pip install dist/pyspark-3.0.0.dev0.tar.gz

pyspark --version
```

I also built the PySpark and R API docs with `jekyll` and reviewed them locally.

It would be good if a maintainer could also test this by creating a PySpark distribution and uploading it to [Test PyPI](https://test.pypi.org) to confirm the README looks as it should.
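
For context on the "PyPI now natively supports Markdown" point: a minimal sketch of how a `setup.py` can pass a Markdown README straight through without a conversion step. The helper name `readme_metadata` and the package name in the usage comment are hypothetical and not taken from this PR; the `long_description` and `long_description_content_type` keywords are the standard setuptools ones.

```python
from pathlib import Path


def readme_metadata(readme_path):
    """Build the setup() keyword arguments that carry a Markdown README.

    Hypothetical helper for illustration; not the code in this PR.
    """
    text = Path(readme_path).read_text(encoding="utf-8")
    return {
        # The README is passed through unchanged, with no ReST conversion.
        "long_description": text,
        # Declares the description format so PyPI renders it as Markdown.
        "long_description_content_type": "text/markdown",
    }


# Usage inside a setup.py (placeholder package name and version):
#   from setuptools import setup
#   setup(name="example-package", version="0.0.1",
#         **readme_metadata("README.md"))
```

With the content type declared, the README needs no pandoc or pypandoc step at build or install time.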
