[
https://issues.apache.org/jira/browse/SPARK-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529930#comment-14529930
]
Sun Rui commented on SPARK-6812:
--------------------------------
Interestingly, we have a unit test case for filter() and the test passes. In R,
if multiple attached packages export the same name, the definition from the
package that comes first on the search path masks the others.
If you use bin/sparkR to start a SparkR shell, the search path (the output of
search()) is as follows:
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:datasets" "package:SparkR"
[7] "package:utils" "package:methods" "Autoloads"
[10] "package:base"
You can see that "package:stats" comes before "package:SparkR", so its filter()
function masks the one in SparkR.
In the test procedure, however, the search path is different:
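The masking described above can be reproduced without SparkR. The following is
a minimal sketch in plain R. The environments "demo:first" and "demo:second" and
the function greet() are made up for illustration; they only stand in for two
packages that export the same name.

```r
# Two hypothetical "packages", simulated as environments that each define greet().
first <- new.env()
assign("greet", function() "from first", envir = first)
second <- new.env()
assign("greet", function() "from second", envir = second)

# attach() inserts at position 2 of the search path by default, so the
# entry attached last ends up in front of the one attached earlier.
attach(first, name = "demo:first", warn.conflicts = FALSE)
attach(second, name = "demo:second", warn.conflicts = FALSE)

path <- search()
front <- match("demo:second", path)
behind <- match("demo:first", path)

# A bare call resolves to the first definition found along the search path.
result <- greet()
print(result)

# Clean up the search path again.
detach("demo:second", character.only = TRUE)
detach("demo:first", character.only = TRUE)
```

Here greet() resolves to the definition in "demo:second" because that entry sits
earlier on the search path, which mirrors how stats::filter() wins over the
SparkR one in the shell.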
.GlobalEnv package:plyr package:SparkR package:testthat package:methods
package:stats package:graphics package:grDevices package:utils package:datasets
Autoloads package:base
You can see that package:SparkR comes before package:stats. That is why filter()
in SparkR passes the test.
I don't know why the package loading order is different now.
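To check which definition a bare filter() call would pick up in a given
session, something like the following sketch may help. It uses only find() from
the utils package and assumes a session where the default packages are
attached; the result depends on what else is loaded.

```r
# List every attached entry that defines a function named "filter",
# in search-path order. The first element is the one a bare call uses.
hits <- find("filter", mode = "function")
print(hits)

winner <- hits[1]
print(winner)
```

If the winner is not package:SparkR, qualifying the call with the namespace,
as in SparkR::filter(df, df$age > 21), should sidestep the masking, assuming
SparkR exports filter() as the passing unit test suggests.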
> filter() on DataFrame does not work as expected
> -----------------------------------------------
>
> Key: SPARK-6812
> URL: https://issues.apache.org/jira/browse/SPARK-6812
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Reporter: Davies Liu
> Assignee: Sun Rui
> Priority: Blocker
>
> {code}
> > filter(df, df$age > 21)
> Error in filter(df, df$age > 21) :
> no method for coercing this S4 class to a vector
> {code}