[ https://issues.apache.org/jira/browse/SPARK-31341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-31341.
----------------------------------
    Resolution: Cannot Reproduce

It's fixed in 3.0.

> Spark documentation incorrectly claims 3.8 compatibility
> --------------------------------------------------------
>
>                 Key: SPARK-31341
>                 URL: https://issues.apache.org/jira/browse/SPARK-31341
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5
>            Reporter: Daniel King
>            Priority: Major
>
> The Spark documentation ([https://spark.apache.org/docs/latest/]) has this text:
> {quote}Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).
> {quote}
> This suggests that Spark is compatible with Python 3.8, which is not true. For example, in the latest ubuntu:18.04 Docker image:
>
> {code:bash}
> apt-get update
> apt-get install python3.8 python3-pip
> pip3 install pyspark
> python3.8 -m pip install pyspark
> python3.8 -c 'import pyspark'
> {code}
> outputs:
> {code:python}
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/__init__.py", line 51, in <module>
>     from pyspark.context import SparkContext
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/context.py", line 31, in <module>
>     from pyspark import accumulators
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/accumulators.py", line 97, in <module>
>     from pyspark.serializers import read_int, PickleSerializer
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/serializers.py", line 72, in <module>
>     from pyspark import cloudpickle
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 145, in <module>
>     _cell_set_template_code = _make_cell_set_template_code()
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
>     return types.CodeType(
> TypeError: an integer is required (got type bytes)
> {code}
> I propose that the documentation be updated to say "Python 3.4 to 3.7". I also propose that the `setup.py` file for pyspark include:
> {code:python}
> python_requires=">=3.6,<3.8",
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
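Editorial note on the traceback above (not part of the original report): Python 3.8 inserted a new positional-only-argument count into the `types.CodeType` constructor, so the long positional call in the cloudpickle copy bundled with PySpark 2.4.x no longer lines up, producing the `TypeError` shown. A minimal sketch of a version-tolerant way to rebuild a code object, using only the standard library; `clone_code` is a hypothetical helper for illustration, not PySpark or cloudpickle API:

```python
import types

def clone_code(code, new_name):
    """Copy a code object under a different co_name, working both
    before and after the Python 3.8 types.CodeType signature change."""
    if hasattr(code, "replace"):
        # Python 3.8+: CodeType.replace() sidesteps the constructor
        # signature entirely (3.8 inserted a positional-only-argument
        # count, which is what breaks the old positional call).
        return code.replace(co_name=new_name)
    # Python <= 3.7: spell out the old 15-argument constructor.
    return types.CodeType(
        code.co_argcount, code.co_kwonlyargcount, code.co_nlocals,
        code.co_stacksize, code.co_flags, code.co_code, code.co_consts,
        code.co_names, code.co_varnames, code.co_filename, new_name,
        code.co_firstlineno, code.co_lnotab, code.co_freevars,
        code.co_cellvars,
    )

def add_one(x):
    return x + 1

# Rebuild add_one's code object under a new name and wrap it in a function.
renamed = types.FunctionType(clone_code(add_one.__code__, "succ"), {})
print(renamed.__code__.co_name)  # succ
print(renamed(41))               # 42
```

Newer cloudpickle releases (and hence Spark 3.0, per the resolution above) take essentially this approach, which is why the issue was closed as fixed in 3.0.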