[ https://issues.apache.org/jira/browse/SPARK-10894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035090#comment-15035090 ]
Felix Cheung commented on SPARK-10894:
--------------------------------------
These seem like orthogonal things.
I don't know whether making `df["age"]` return a DataFrame is any less confusing.
It should be very straightforward to do, though.
`df1 <- select(df, age > 10)` is nice, but it seems to have more to do with the
"select" method and is unrelated to the previous point. (SPARK-7499)
`df$age` and `iris$Petal.Width` are a Column and an atomic vector respectively;
they are not a DataFrame/data.frame.
Being able to call collect/head on `df$age`, or otherwise manipulate it, is also nice. (SPARK-9325)
Which is it that we want then?
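
For reference, the base R side of the distinction above can be sketched as follows. This is only an illustration using the built-in `iris` data set and plain data.frame semantics; it does not use SparkR, and the variable names are made up for the example.

```r
# Base R only; `iris` ships with R, so nothing needs to be loaded.
col_bracket <- iris["Petal.Width"]                  # list-style `[`: stays a data.frame
col_dollar  <- iris$Petal.Width                     # `$`: extracts the atomic vector
col_drop    <- iris[, "Petal.Width"]                # matrix-style `[`, drop = TRUE by default
col_keep    <- iris[, "Petal.Width", drop = FALSE]  # drop = FALSE: stays a data.frame

print(class(col_bracket))
print(class(col_dollar))
print(class(col_drop))
print(class(col_keep))
```

Which of these behaviors the SparkR `[`, `$` and `select` should mirror is exactly the open question here.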
> Add 'drop' support for DataFrame's subset function
> --------------------------------------------------
>
> Key: SPARK-10894
> URL: https://issues.apache.org/jira/browse/SPARK-10894
> Project: Spark
> Issue Type: Improvement
> Components: SparkR
> Reporter: Weiqiang Zhuang
>
> A SparkR DataFrame can be subset to get one or more columns of the dataset. The
> current '[' implementation does not support 'drop' when just one column is
> requested. This is not consistent with the R syntax:
> x[i, j, ... , drop = TRUE]
> # in R, when drop is FALSE, the result remains a data.frame
> > class(iris[, "Sepal.Width", drop=F])
> [1] "data.frame"
> # when drop is TRUE (the default), the result is dropped to a vector
> > class(iris[, "Sepal.Width", drop=T])
> [1] "numeric"
> > class(iris[,"Sepal.Width"])
> [1] "numeric"
> > df <- createDataFrame(sqlContext, iris)
> # in SparkR, the 'drop' argument has no effect
> > class(df[,"Sepal_Width", drop=F])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> # should have been dropped to a Column instead
> > class(df[,"Sepal_Width", drop=T])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> > class(df[,"Sepal_Width"])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> We should add 'drop' support.