[ 
https://issues.apache.org/jira/browse/SPARK-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370020#comment-15370020
 ] 

Dongjoon Hyun commented on SPARK-16466:
---------------------------------------

You can reference the column as shown below. The following is the output from Spark 2.0.
{code}
> sdfCar <- createDataFrame(mtcars)
> names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs",
+                    "am", "gear", "carb-count")
> sdfCar4 <- filter(sdfCar, sdfCar$"carb-count"==4)
> head(sdfCar4)
   mpg cyl  disp  hp drat    wt  qsec vs am gear carb-count
1 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4          4
2 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4          4
3 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3          4
4 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4          4
5 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4          4
6 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3          4
{code}
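
For reference, here are two other ways to address such a column, as a minimal sketch. It assumes the same Spark 2.0 SparkR session and the sdfCar DataFrame from the example above, and that a string condition passed to filter() is parsed as a SQL expression, so backticks quote the identifier. The names sdfCar4b, sdfCar4c, sdfCarRenamed and carb_count are only illustrative, and I have not checked this against 1.6.1.
{code}
# Assumption: a string condition is parsed as SQL, so the backticks keep
# "carb-count" from being read as "carb" minus "count".
sdfCar4b <- filter(sdfCar, "`carb-count` = 4")

# Alternatively, rename the column to a name that needs no quoting.
sdfCarRenamed <- withColumnRenamed(sdfCar, "carb-count", "carb_count")
sdfCar4c <- filter(sdfCarRenamed, "carb_count = 4")
{code}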

> names() function allows creation of column name containing "-".  filter() 
> function subsequently fails
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16466
>                 URL: https://issues.apache.org/jira/browse/SPARK-16466
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.6.1
>         Environment: Databricks.com
>            Reporter: Neil Dewar
>            Priority: Minor
>
> If I assign names to a DataFrame using the names() function, it allows the 
> introduction of "-" characters that cause the filter() function to 
> subsequently fail.  I am unclear whether other special characters cause 
> similar problems.
> Example:
> sdfCar <- createDataFrame(sqlContext, mtcars)
> names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", 
> "am", "gear", "carb-count") # note: carb renamed to carb-count
> sdfCar3 <- filter(sdfCar, carb-count==4)
> Above fails with error: failure: identifier expected carb-count==4.  This 
> logic appears to be assuming that the "-" in the column name is a minus sign.
> I am unsure if the problem here is that "-" is illegal in a column name, or 
> if the filter function should be able to handle "-" in a column name, but one 
> or the other must be wrong.


