Neil Dewar created SPARK-16466:
----------------------------------
Summary: names() function allows creation of column name
containing "-". filter() function subsequently fails
Key: SPARK-16466
URL: https://issues.apache.org/jira/browse/SPARK-16466
Project: Spark
Issue Type: Bug
Components: SparkR
Affects Versions: 1.6.1
Environment: Databricks.com
Reporter: Neil Dewar
Priority: Minor
If I assign names to a DataFrame using the names() function, it allows column
names containing "-" characters, which then cause the filter() function to
fail. I am unsure whether other special characters cause similar problems.
Example:
sdfCar <- createDataFrame(sqlContext, mtcars)
names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs",
"am", "gear", "carb-count") # note: carb renamed to carb-count
sdfCar3 <- filter(sdfCar, carb-count==4)
The above fails with the error "failure: identifier expected carb-count==4".
The expression parser appears to treat the "-" in the column name as a minus
sign.
I am unsure whether the problem is that "-" should be illegal in a column name,
or that the filter function should be able to handle "-" in a column name, but
one or the other must be wrong.
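Possible workarounds, sketched below and not verified against 1.6.1: this
assumes an existing sqlContext, and assumes that the SQL expression parser
behind filter() accepts a backtick-quoted name as a single identifier. The
names sdfCar3a, sdfCar3b, sdfCar3c and sdfCarRenamed are made up for this
illustration.

```r
# Sketch only: assumes a SparkR 1.6.x session where sqlContext already exists.
library(SparkR)

sdfCar <- createDataFrame(sqlContext, mtcars)
names(sdfCar) <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs",
                   "am", "gear", "carb-count")

# Option 1: backtick-quote the name inside the SQL expression string
# (assumption: the parser reads `carb-count` as one identifier, not a subtraction).
sdfCar3a <- filter(sdfCar, "`carb-count` = 4")

# Option 2: build a Column expression in R, so no SQL string has to be parsed.
sdfCar3b <- filter(sdfCar, sdfCar$`carb-count` == 4)

# Option 3: rename the column to a name without "-" before filtering.
sdfCarRenamed <- withColumnRenamed(sdfCar, "carb-count", "carb_count")
sdfCar3c <- filter(sdfCarRenamed, "carb_count = 4")
```

If any of these work, it would suggest that "-" is tolerated in column names
and that the failure is limited to unquoted names in string conditions.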