Felix Cheung created SPARK-10346:
------------------------------------

             Summary: SparkR mutate and transform should replace column with 
same name to match R data.frame behavior
                 Key: SPARK-10346
                 URL: https://issues.apache.org/jira/browse/SPARK-10346
             Project: Spark
          Issue Type: Bug
          Components: R
    Affects Versions: 1.5.0
            Reporter: Felix Cheung


SparkR's mutate doesn't seem to replace an existing column that has the same 
name (e.g. mutate(df, age = df$age + 2) returns a DataFrame with 2 columns 
named 'age'), so transform does not do that for now either.

However, the R documentation for transform clearly states that it should 
replace a column with a matching name:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html

"The tags are matched against names(_data), and for those that match, the value 
replace the corresponding variable in _data, and the others are appended to 
_data."
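
For reference, the base R behavior quoted above can be checked with a plain 
data.frame. This is only a minimal sketch in base R (no Spark involved); the 
data and the column name age_months are made up for illustration.

```r
# Base R only; no Spark session is needed for this check.
df <- data.frame(name = c("a", "b"), age = c(30, 40))

# "age" matches an existing name, so its values are replaced in place.
# "age_months" matches nothing, so it is appended as a new column.
# Both expressions are evaluated against the original df.
out <- transform(df, age = age + 2, age_months = age * 12)

print(names(out))  # a single "age" column, plus the appended column
print(out)
```

The expectation in this issue is that SparkR mutate and transform would behave 
the same way and return a single 'age' column.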

Also, the resulting DataFrame might be hard to work with, for example when 
using select with column names, since two columns share the same name.
