[
https://issues.apache.org/jira/browse/SPARK-33795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251515#comment-17251515
]
Hyukjin Kwon commented on SPARK-33795:
--------------------------------------
[~n8shdw] can you check whether this also happens in open-source Apache Spark,
and not only in Databricks Runtime?
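One way to run that check is sketched below. It assumes a local open-source Spark installation with the SparkR package available. `group_means` and `run_repro` are hypothetical names introduced here for illustration. The grouping function and schema are copied from the report, and `collect()` stands in for `display()`, which is only available in Databricks notebooks.

```r
# Grouping function copied from the report. `key` is the list of grouping
# values and `data` is the local data.frame holding the rows of that group.
group_means <- function(key, data) {
  dt <- data[, c("mpg", "qsec")]
  res <- apply(dt, 2, mean)
  data.frame(firstGroupKey = key[1],
             secondGroupKey = key[2],
             mean_mpg = res[1],
             mean_cyl = res[2])
}

# Hypothetical helper: run the gapply once with the Arrow setting given as
# the string "true" or "false", then stop the session so that the next call
# starts with a fresh configuration.
run_repro <- function(arrow_enabled) {
  SparkR::sparkR.session(
    master = "local[*]",  # assumption: local open-source Spark, not Databricks
    sparkConfig = list(spark.sql.execution.arrow.sparkr.enabled = arrow_enabled))
  on.exit(SparkR::sparkR.session.stop())

  schema <- SparkR::structType(
    SparkR::structField("cyl", "double"),
    SparkR::structField("vs", "double"),
    SparkR::structField("mpg_mean", "double"),
    SparkR::structField("qsec_mean", "double"))

  sdf <- SparkR::as.DataFrame(mtcars)
  out <- SparkR::gapply(sdf, c("cyl", "vs"), group_means, schema)
  SparkR::collect(out)  # collect() replaces the Databricks-only display()
}

# Usage (needs SparkR and a Spark installation, so it is left commented out):
# run_repro("false")
# run_repro("true")
```

Comparing the two calls on the same machine would show whether the failure depends on the Arrow setting outside Databricks Runtime as well.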
> gapply fails execution with rbind error
> ---------------------------------------
>
> Key: SPARK-33795
> URL: https://issues.apache.org/jira/browse/SPARK-33795
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 3.0.0
> Environment: Databricks runtime 7.3 LTS ML
> Reporter: MvR
> Priority: Major
> Attachments: Rerror.log
>
>
> Executing the following code on Databricks Runtime 7.3 LTS ML fails with an
> rbind error, whereas it runs successfully when Arrow is not enabled in the
> Spark session. The full error message is attached.
>
> ```
> library(dplyr)
> library(SparkR)
> SparkR::sparkR.session(sparkConfig =
>   list(spark.sql.execution.arrow.sparkr.enabled = "true"))
> mtcars %>%
>   SparkR::as.DataFrame() %>%
>   SparkR::gapply(x = .,
>                  cols = c("cyl", "vs"),
>
>                  func = function(key,
>                                  data){
>
>                    dt <- data[,c("mpg", "qsec")]
>                    res <- apply(dt, 2, mean)
>                    df <- data.frame(firstGroupKey = key[1],
>                                     secondGroupKey = key[2],
>                                     mean_mpg = res[1],
>                                     mean_cyl = res[2])
>                    return(df)
>
>                  },
>                  schema = structType(structField("cyl", "double"),
>                                      structField("vs", "double"),
>                                      structField("mpg_mean", "double"),
>                                      structField("qsec_mean", "double"))
>   ) %>%
>   display()
> ```