MvR created ARROW-10916:
---------------------------

             Summary: gapply fails executing with rbind error
                 Key: ARROW-10916
                 URL: https://issues.apache.org/jira/browse/ARROW-10916
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 2.0.0
         Environment: Databricks runtime 7.3 LTS ML
            Reporter: MvR
         Attachments: Rerror.log

Executing the following code on Databricks Runtime 7.3 LTS ML fails with an 
rbind error, whereas the same code runs successfully when Arrow is not enabled 
in the Spark session. The full error message is attached (Rerror.log).

 

```

library(dplyr)
library(SparkR)

SparkR::sparkR.session(
  sparkConfig = list(spark.sql.execution.arrow.sparkr.enabled = "true")
)

mtcars %>%
  SparkR::as.DataFrame() %>%
  SparkR::gapply(
    x = .,
    cols = c("cyl", "vs"),
    func = function(key, data) {
      dt <- data[, c("mpg", "qsec")]
      res <- apply(dt, 2, mean)
      df <- data.frame(firstGroupKey = key[1],
                       secondGroupKey = key[2],
                       mean_mpg = res[1],
                       mean_cyl = res[2])
      return(df)
    },
    schema = structType(structField("cyl", "double"),
                        structField("vs", "double"),
                        structField("mpg_mean", "double"),
                        structField("qsec_mean", "double"))
  ) %>%
  display()

```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
