Felix Cheung created SPARK-10346:
------------------------------------

             Summary: SparkR mutate and transform should replace column with same name to match R data.frame behavior
                 Key: SPARK-10346
                 URL: https://issues.apache.org/jira/browse/SPARK-10346
             Project: Spark
          Issue Type: Bug
          Components: R
    Affects Versions: 1.5.0
            Reporter: Felix Cheung

SparkR doesn't seem to replace an existing column of the same name in mutate (i.e. mutate(df, age = df$age + 2) returns a DataFrame with two columns both named 'age'), so transform does not do that for now either.

However, the R documentation clearly states that a column with a matching name should be replaced:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
"The tags are matched against names(_data), and for those that match, the value replace the corresponding variable in _data, and the others are appended to _data."

Also, the resulting DataFrame might be hard to work with if one is to use select with column names and so on, since two columns share the same name.
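
For reference, a minimal sketch of the base R behavior described in the transform documentation quoted above. This is plain R on a local data.frame, not SparkR, and the data frame and its values are made up purely for illustration:

```r
# Hypothetical sample data, used only to illustrate the documented behavior.
df <- data.frame(name = c("a", "b"), age = c(30, 40))

# Tag 'age' matches an existing column name, so the column is replaced in place.
replaced <- transform(df, age = age + 2)
names(replaced)   # "name" "age"
replaced$age      # 32 42

# Tag 'age2' matches no existing name, so a new column is appended.
appended <- transform(df, age2 = age + 2)
names(appended)   # "name" "age" "age2"
```

The expectation in this issue is that SparkR mutate and transform would yield the same column names as the first case, rather than a second 'age' column.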