GitHub user mn-mikke opened a pull request:
https://github.com/apache/spark/pull/21434
[SPARK-24331][SparkR][SQL] Adding arrays_overlap, array_repeat, map_entries
to SparkR
## What changes were proposed in this pull request?
This PR adds the functions `arrays_overlap`, `array_repeat`, and `map_entries`
to SparkR.
## How was this patch tested?
Tests were added to `R/pkg/tests/fulltests/test_sparkSQL.R`.
## Examples
### arrays_overlap
```
df <- createDataFrame(list(list(list(1L, 2L), list(3L, 1L)),
                           list(list(1L, 2L), list(3L, 4L)),
                           list(list(1L, NA), list(3L, 4L))))
collect(select(df, arrays_overlap(df[[1]], df[[2]])))
```
```
  arrays_overlap(_1, _2)
1                   TRUE
2                  FALSE
3                     NA
```
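The `NA` in the third row follows Spark SQL's null handling for `arrays_overlap`: the arrays share no non-null element, but one of them contains a null, so the result is unknown rather than `FALSE`. A minimal base R sketch of that rule is shown below. The helper name `arrays_overlap_sketch` is hypothetical and is not part of this patch or of SparkR. It only mirrors the semantics on plain R lists.

```r
# Hypothetical helper (not part of SparkR): mimics the three-valued
# result of arrays_overlap on two plain R lists of atomic values.
arrays_overlap_sketch <- function(a, b) {
  a <- unlist(a)
  b <- unlist(b)
  # TRUE as soon as the arrays share at least one non-NA element.
  if (any(a[!is.na(a)] %in% b[!is.na(b)])) {
    return(TRUE)
  }
  # No common element, both arrays non-empty, and an NA is present:
  # the answer is unknown.
  if (length(a) > 0 && length(b) > 0 && (anyNA(a) || anyNA(b))) {
    return(NA)
  }
  FALSE
}

arrays_overlap_sketch(list(1L, 2L), list(3L, 1L))  # TRUE
arrays_overlap_sketch(list(1L, 2L), list(3L, 4L))  # FALSE
arrays_overlap_sketch(list(1L, NA), list(3L, 4L))  # NA
```

The three calls reproduce the three rows of the example above.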
### array_repeat
```
df <- createDataFrame(list(list("a", 3L), list("b", 2L)))
collect(select(df, array_repeat(df[[1]], df[[2]])))
```
```
  array_repeat(_1, _2)
1              a, a, a
2                 b, b
```
### map_entries
```
df <- createDataFrame(list(list(map = as.environment(list(x = 1, y = 2)))))
collect(select(df, map_entries(df$map)))
```
```
  map_entries(map)
1       x, 1, y, 2
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mn-mikke/spark SPARK-24331
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21434.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21434
----
commit 5d80ad669db4a89089378716fdf5d8258987bd97
Author: Marek Novotny <mn.mikke@...>
Date: 2018-05-25T16:30:50Z
[SPARK-24331][SparkR][SQL] Adding functions arrays_overlap, array_repeat,
map_entries to SparkR
----