GitHub user saurfang opened a pull request:
https://github.com/apache/spark/pull/10481
[SPARK-12526][SPARKR]`ifelse`, `when`, `otherwise` unable to take Column as
value
`ifelse`, `when`, and `otherwise` are unable to take `Column`-typed S4 objects as
values.
For example:
```r
ifelse(lit(1) == lit(1), lit(2), lit(3))
ifelse(df$mpg > 0, df$mpg, 0)
```
Both calls fail with:
```r
attempt to replicate an object of type 'environment'
```
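The same kind of failure can be sketched in base R without a Spark session. This is a minimal reproduction under stated assumptions: the `handle` object and its class name are hypothetical stand-ins for an environment-backed reference, not SparkR objects.

```r
# Hypothetical stand-in for an environment-backed reference.
# It holds one binding and carries a class attribute.
handle <- structure(new.env(), class = "handle")
assign("id", "1", envir = handle)

# Base ifelse() builds its result with rep(yes, length.out = ...),
# and rep() cannot replicate an environment, so the call signals an error.
result <- tryCatch(
  ifelse(TRUE, handle, 0),
  error = function(e) e
)

# Report whether an error was caught and, if so, its message.
print(inherits(result, "error"))
if (inherits(result, "error")) cat(conditionMessage(result), "\n")
```

The condition here is a single `TRUE`, so no real vectorization is needed, yet `ifelse` still goes through `rep()` to build the result.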
The PR replaces the `ifelse` calls inside the function implementations with
`if ... else ...` to avoid the attempt to vectorize (i.e. the `rep()` call). It
remains to be discussed whether we should instead support vectorization in
these functions for consistency, because `ifelse` in base R is vectorized, but I
cannot foresee any scenario in which these functions would need to be
vectorized in SparkR.
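As a rough illustration of the `if ... else ...` pattern, here is a sketch in base R. The `FakeColumn` class, its `ref` slot, and the `unwrap_value` helper are hypothetical names introduced for this example; they are not the SparkR implementation.

```r
library(methods)

# Hypothetical S4 wrapper around an environment-backed reference.
setClass("FakeColumn", representation(ref = "environment"))

# Hypothetical helper: return the wrapped reference for a FakeColumn,
# otherwise return the value unchanged. if/else evaluates only the
# selected branch and never calls rep() on either one.
unwrap_value <- function(value) {
  if (is(value, "FakeColumn")) value@ref else value
}

ref <- new.env()
col <- new("FakeColumn", ref = ref)

wrapped_ok <- identical(unwrap_value(col), ref)  # wrapped reference comes back as-is
literal_ok <- identical(unwrap_value(0), 0)      # plain values pass through
print(c(wrapped_ok, literal_ok))
```

Note that `if` expects a single logical condition, so this form is deliberately not vectorized, which is the trade-off raised above.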
For reference, the added test cases trigger failures like the following:
```r
. Error: when(), otherwise() and ifelse() with column on a DataFrame
----------
error in evaluating the argument 'x' in selecting a method for function
'collect':
error in evaluating the argument 'col' in selecting a method for function
'select':
attempt to replicate an object of type 'environment'
Calls: when -> when -> ifelse -> ifelse
1: withCallingHandlers(eval(code, new_test_environment), error =
capture_calls, message = function(c) invokeRestart("muffleMessage"))
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: expect_equal(collect(select(df, when(df$a > 1 & df$b > 2, lit(1))))[,
1], c(NA, 1)) at test_sparkSQL.R:1126
5: expect_that(object, equals(expected, label = expected.label, ...), info
= info, label = label)
6: condition(object)
7: compare(actual, expected, ...)
8: collect(select(df, when(df$a > 1 & df$b > 2, lit(1))))
Error: Test failures
Execution halted
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/saurfang/spark spark-12526
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10481.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10481
----
commit 449b0f6074d08309ad1e3fa6a6611b2fb6a33a5e
Author: Forest Fang <[email protected]>
Date: 2015-12-26T05:21:18Z
replace ifelse with if...else... to avoid vectorization
----