[
https://issues.apache.org/jira/browse/SPARK-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370020#comment-15370020
]
Dongjoon Hyun commented on SPARK-16466:
---------------------------------------
You can reference the column as sdfCar$"carb-count", like this. The following is the result on Spark 2.0.
{code}
> sdfCar <- createDataFrame(mtcars)
> names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb-count")
> sdfCar4 <- filter(sdfCar, sdfCar$"carb-count"==4)
> head(sdfCar4)
   mpg cyl  disp  hp drat    wt  qsec vs am gear carb-count
1 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4          4
2 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4          4
3 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3          4
4 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4          4
5 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4          4
6 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3          4
{code}
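As a further sketch, not part of the original comment: assuming a Spark 2.0 SparkR session, the hyphenated column can also be quoted with backticks inside a SQL-style condition string, or renamed so that no quoting is needed. The names sdfCarQuoted, sdfCarRenamed and carb_count below are hypothetical, chosen only for illustration.

```r
library(SparkR)
sparkR.session()  # assumes Spark 2.0+, where no sqlContext argument is needed

sdfCar <- createDataFrame(mtcars)
names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs",
                   "am", "gear", "carb-count")

# Option 1 (assumption): backticks quote the identifier in a SQL condition
# string, so the "-" is not read as a minus sign.
sdfCarQuoted <- filter(sdfCar, "`carb-count` = 4")
head(sdfCarQuoted)

# Option 2 (assumption): rename the column to avoid the hyphen entirely.
sdfCarRenamed <- withColumnRenamed(sdfCar, "carb-count", "carb_count")
head(filter(sdfCarRenamed, sdfCarRenamed$carb_count == 4))
```

Both options are intended to select the same rows as the sdfCar$"carb-count" form shown above; which one reads best depends on whether the hyphenated name has to be kept.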
> names() function allows creation of column name containing "-". filter()
> function subsequently fails
> -----------------------------------------------------------------------------------------------------
>
> Key: SPARK-16466
> URL: https://issues.apache.org/jira/browse/SPARK-16466
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 1.6.1
> Environment: Databricks.com
> Reporter: Neil Dewar
> Priority: Minor
>
> If I assign names to a DataFrame using the names() function, it allows the
> introduction of "-" characters that cause the filter() function to
> subsequently fail. I am unsure whether other special characters cause similar
> problems.
> Example:
> sdfCar <- createDataFrame(sqlContext, mtcars)
> names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs",
> "am", "gear", "carb-count") # note: carb renamed to carb-count
> sdfCar3 <- filter(sdfCar, carb-count==4)
> The above fails with the error: failure: identifier expected carb-count==4.
> This logic appears to assume that the "-" in the column name is a minus sign.
> I am unsure whether the problem is that "-" is illegal in a column name, or
> that the filter function should be able to handle "-" in a column name, but
> one or the other must be wrong.