HyukjinKwon commented on a change in pull request #29283:
URL: https://github.com/apache/spark/pull/29283#discussion_r462692911
##########
File path: docs/sparkr.md
##########
@@ -681,12 +681,12 @@ The current supported minimum version is 1.0.0; however, this might change betwe
 Arrow optimization is available when converting a Spark DataFrame to an R DataFrame using the call `collect(spark_df)`,
 when creating a Spark DataFrame from an R DataFrame with `createDataFrame(r_df)`, when applying an R native function to each partition
 via `dapply(...)` and when applying an R native function to grouped data via `gapply(...)`.
-To use Arrow when executing these calls, users need to first set the Spark configuration ‘spark.sql.execution.arrow.sparkr.enabled’
-to ‘true’. This is disabled by default.
+To use Arrow when executing these, users need to set the Spark configuration ‘spark.sql.execution.arrow.sparkr.enabled’
+to ‘true’ first. This is disabled by default.
-In addition, optimizations enabled by ‘spark.sql.execution.arrow.sparkr.enabled’ could fallback automatically to non-Arrow optimization
-implementation if an error occurs before the actual computation within Spark during converting a Spark DataFrame to/from an R
-DataFrame.
+Whether the optimization is enabled or not, SparkR produces the same results. This is also because the conversion
Review comment:
Let me try to elaborate a bit more.
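For context, enabling the Arrow optimization that this doc change describes looks roughly like the sketch below (assuming a local Spark 3.x installation with SparkR and the `arrow` R package available; the session settings are illustrative, not part of this PR):

```r
library(SparkR)

# Set spark.sql.execution.arrow.sparkr.enabled at session start;
# it is disabled by default.
sparkR.session(
  master = "local[*]",
  sparkConfig = list("spark.sql.execution.arrow.sparkr.enabled" = "true")
)

# The conversions covered by the optimization: R data.frame -> Spark
# DataFrame and back. If an Arrow error occurs before the actual
# computation starts, SparkR falls back to the non-Arrow path, so the
# results are the same either way -- which is the point of the reworded
# paragraph in the diff.
spark_df <- createDataFrame(mtcars)   # R data.frame -> Spark DataFrame
r_df     <- collect(spark_df)         # Spark DataFrame -> R data.frame
```

`dapply(...)` and `gapply(...)` are covered by the same configuration flag, so no per-call option is needed.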
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]