Hyukjin Kwon created SPARK-47068:
------------------------------------
Summary: Recover -1 and 0 case for spark.sql.execution.arrow.maxRecordsPerBatch
Key: SPARK-47068
URL: https://issues.apache.org/jira/browse/SPARK-47068
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.5.0, 3.4.1, 4.0.0
Reporter: Hyukjin Kwon

Per the conf's documentation, setting spark.sql.execution.arrow.maxRecordsPerBatch to zero or a negative value should mean there is no per-batch record limit. With Arrow enabled, however, createDataFrame(...).toPandas() raises a ValueError for 0 and silently returns an empty DataFrame for -1:
{code}
import pandas as pd
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
spark.conf.set("spark.sql.execution.arrow.maxRecordsPerBatch", 0)
spark.conf.set("spark.sql.execution.arrow.pyspark.fallback.enabled", False)
spark.createDataFrame(pd.DataFrame({'a': [123]})).toPandas()
spark.conf.set("spark.sql.execution.arrow.maxRecordsPerBatch", -1)
spark.createDataFrame(pd.DataFrame({'a': [123]})).toPandas()
{code}
With 0, createDataFrame fails with a ValueError because the slicing step is zero:
{code}
/.../spark/python/pyspark/sql/pandas/conversion.py:371: UserWarning: createDataFrame attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true, but has reached the error below and will not continue because automatic fallback with 'spark.sql.execution.arrow.pyspark.fallback.enabled' has been set to false.
  range() arg 3 must not be zero
  warn(msg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/session.py", line 1483, in createDataFrame
    return super(SparkSession, self).createDataFrame(  # type: ignore[call-overload]
  File "/.../spark/python/pyspark/sql/pandas/conversion.py", line 351, in createDataFrame
    return self._create_from_pandas_with_arrow(data, schema, timezone)
  File "/.../spark/python/pyspark/sql/pandas/conversion.py", line 633, in _create_from_pandas_with_arrow
    pdf_slices = (pdf.iloc[start : start + step] for start in range(0, len(pdf), step))
ValueError: range() arg 3 must not be zero
{code}
With -1, toPandas() silently returns an empty DataFrame instead of the expected single-row result, because range() with a negative step yields no slice starts:
{code}
Empty DataFrame
Columns: [a]
Index: []
{code}
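For reference, the failing behavior comes down to how the pandas DataFrame is sliced into Arrow batches. The sketch below is a hypothetical standalone helper (slice_pdf is not part of Spark) that mirrors the range(0, len(pdf), step) call from _create_from_pandas_with_arrow and shows one possible guard: treating non-positive values as "no limit", matching the conf's documented semantics.

{code}
import pandas as pd

def slice_pdf(pdf, max_records_per_batch):
    # Hypothetical guard: 0 or negative means "no limit", i.e. one
    # slice containing the whole frame. Without it, step=0 makes
    # range() raise ValueError and step=-1 yields an empty range.
    step = max_records_per_batch if max_records_per_batch > 0 else len(pdf)
    step = max(step, 1)  # also guard against an empty DataFrame
    return [pdf.iloc[start : start + step] for start in range(0, len(pdf), step)]

pdf = pd.DataFrame({"a": [123]})
assert len(slice_pdf(pdf, 0)) == 1   # previously: ValueError
assert len(slice_pdf(pdf, -1)) == 1  # previously: empty result
{code}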
--
This message was sent by Atlassian Jira
(v8.20.10#820010)