viirya commented on a change in pull request #24700: [SPARK-27834][SQL][R][PYTHON] Make separate PySpark/SparkR vectorization configurations

URL: https://github.com/apache/spark/pull/24700#discussion_r287820143
########## File path: docs/sql-pyspark-pandas-with-arrow.md ##########

```diff
@@ -44,9 +44,9 @@ You can install using pip or conda from the conda-forge channel. See PyArrow
 Arrow is available as an optimization when converting a Spark DataFrame to a Pandas DataFrame
 using the call `toPandas()` and when creating a Spark DataFrame from a Pandas DataFrame with
 `createDataFrame(pandas_df)`. To use Arrow when executing these calls, users need to first set
-the Spark configuration 'spark.sql.execution.arrow.enabled' to 'true'. This is disabled by default.
+the Spark configuration 'spark.sql.pyspark.execution.arrow.enabled' to 'true'. This is disabled by default.

-In addition, optimizations enabled by 'spark.sql.execution.arrow.enabled' could fallback automatically
+In addition, optimizations enabled by 'spark.sql.pyspark.execution.arrow.enabled' could fallback automatically
 to non-Arrow optimization implementation if an error occurs before the actual computation within Spark.
 This can be controlled by 'spark.sql.execution.arrow.fallback.enabled'.
```

Review comment:

`spark.sql.execution.arrow.fallback.enabled` -> `spark.sql.pyspark.execution.arrow.fallback.enabled`?
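For context, the documentation change above describes toggling Arrow-based conversion via Spark configuration. A minimal sketch of how a user might set these flags in `spark-defaults.conf`, assuming the renamed keys proposed in this PR's diff (including the reviewer's suggested `pyspark`-scoped fallback key; the names actually merged into Spark may differ):

```properties
# Enable Arrow-based columnar transfer for toPandas() / createDataFrame(pandas_df).
# Key name follows the rename proposed in this PR's diff (hypothetical until merged).
spark.sql.pyspark.execution.arrow.enabled           true

# Fall back to the non-Arrow conversion path if an error occurs before computation.
# Key name follows the reviewer's suggestion in this thread (hypothetical).
spark.sql.pyspark.execution.arrow.fallback.enabled  true
```

The same settings could equivalently be applied per-session with `spark.conf.set(key, value)` from PySpark, which is often more convenient when experimenting with the Arrow path.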
