GitHub user neilalex opened a pull request:
https://github.com/apache/spark/pull/20352
[SPARK-21727][R] Allow multi-element atomic vector as column type in SparkR DataFrame
## What changes were proposed in this pull request?
A fix to https://issues.apache.org/jira/browse/SPARK-21727, "Operating on
an ArrayType in a SparkR DataFrame throws error"
## How was this patch tested?
- Ran the tests at R/pkg/tests/run-all.R (see attached results below)
- Tested the following lines in SparkR, which now seem to execute without
error:
```r
indices <- 1:4
myDf <- data.frame(indices)
myDf$data <- list(rep(0, 20))
mySparkDf <- as.DataFrame(myDf)
collect(mySparkDf)
```
[2018-01-22 SPARK-21727 Test Results.txt](https://github.com/apache/spark/files/1653535/2018-01-22.SPARK-21727.Test.Results.txt)
@felixcheung @yanboliang @sun-rui @shivaram
_The contribution is my original work and I license the work to the project
under the project's open source license_
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/neilalex/spark neilalex-sparkr-arraytype
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20352.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20352
commit 6bdf687511d19e26c9c5a23b32e1ac6b3ea763e4
Author: neilalex
Date: 2018-01-05T03:53:30Z
Check if an atomic R type is actually a vector of length > 1, and treat it
as an array if so.
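The commit description above can be sketched in R. This is a hedged illustration of the idea, not the actual patch: the function name `infer_column_kind` and the return labels are hypothetical, chosen only to show how `is.atomic` plus a length check distinguishes a multi-element vector (mapped to Spark's ArrayType) from a scalar.

```r
# Hypothetical sketch of the type check described in the commit message.
# A multi-element atomic vector is treated as an array; a length-1 value
# is treated as a scalar. Names here are illustrative, not from the patch.
infer_column_kind <- function(object) {
  if (is.atomic(object) && length(object) > 1) {
    "array"    # e.g. rep(0, 20) inside a list column -> Spark ArrayType
  } else {
    "scalar"   # e.g. a single numeric or integer value
  }
}
```

For example, `infer_column_kind(rep(0, 20))` would classify the column data from the test snippet above as an array, while `infer_column_kind(1L)` would not.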
commit f8ae69853b4529ad1c9b32fa705ab27426c52b66
Author: neilalex
Date: 2018-01-22T03:15:02Z
Use is.atomic(object) to check type
---