HyukjinKwon commented on a change in pull request #29283:
URL: https://github.com/apache/spark/pull/29283#discussion_r462692911
##########
File path: docs/sparkr.md
##########
@@ -681,12 +681,12 @@ The current supported minimum version is 1.0.0; however, this might change betwe
 Arrow optimization is available when converting a Spark DataFrame to an R DataFrame using the call `collect(spark_df)`,
 when creating a Spark DataFrame from an R DataFrame with `createDataFrame(r_df)`, when applying an R native function to each partition
 via `dapply(...)` and when applying an R native function to grouped data via `gapply(...)`.
-To use Arrow when executing these calls, users need to first set the Spark configuration ‘spark.sql.execution.arrow.sparkr.enabled’
-to ‘true’. This is disabled by default.
+To use Arrow when executing these, users need to set the Spark configuration ‘spark.sql.execution.arrow.sparkr.enabled’
+to ‘true’ first. This is disabled by default.
-In addition, optimizations enabled by ‘spark.sql.execution.arrow.sparkr.enabled’ could fallback automatically to non-Arrow optimization
-implementation if an error occurs before the actual computation within Spark during converting a Spark DataFrame to/from an R
-DataFrame.
+Whether the optimization is enabled or not, SparkR produces the same results. This is also because the conversion
Review comment:
Let me try to elaborate a bit more.
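For context, enabling the Arrow optimization that this doc change describes looks roughly like the sketch below (assuming a local Spark 3.x installation with SparkR and the `arrow` R package available; the session settings are illustrative, not part of this PR):

```r
library(SparkR)

# Set spark.sql.execution.arrow.sparkr.enabled at session start;
# it is disabled by default.
sparkR.session(
  master = "local[*]",
  sparkConfig = list("spark.sql.execution.arrow.sparkr.enabled" = "true")
)

# The conversions covered by the optimization: R data.frame -> Spark
# DataFrame and back. If an Arrow error occurs before the actual
# computation starts, SparkR falls back to the non-Arrow path, so the
# results are the same either way -- which is the point of the reworded
# paragraph in the diff.
spark_df <- createDataFrame(mtcars)   # R data.frame -> Spark DataFrame
r_df     <- collect(spark_df)         # Spark DataFrame -> R data.frame
```

`dapply(...)` and `gapply(...)` are covered by the same configuration flag, so no per-call option is needed.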
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]