[
https://issues.apache.org/jira/browse/SPARK-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Felix Cheung updated SPARK-10346:
---------------------------------
Component/s: SparkR
> SparkR mutate and transform should replace column with same name to match R
> data.frame behavior
> -----------------------------------------------------------------------------------------------
>
> Key: SPARK-10346
> URL: https://issues.apache.org/jira/browse/SPARK-10346
> Project: Spark
> Issue Type: Bug
> Components: R, SparkR
> Affects Versions: 1.5.0
> Reporter: Felix Cheung
>
> SparkR's mutate does not seem to replace an existing column when the new
> column has the same name. For example, mutate(df, age = df$age + 2) returns a
> DataFrame with two columns both named 'age'. For that reason, transform does
> not replace same-named columns for now either.
> However, the R documentation for transform clearly states that a column with
> a matching name should be replaced:
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
> "The tags are matched against names(_data), and for those that match, the
> value replace the corresponding variable in _data, and the others are
> appended to _data."
> Also, the resulting DataFrame can be hard to work with, for example when
> using select with column names or registering the table for SQL, because two
> columns then share the same name.
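
For reference, the replace-or-append behavior quoted above can be checked against a plain base R data.frame. This is a minimal sketch, not taken from the issue: the data frame contents and the names `replaced`, `appended`, and `age2` are made up for illustration, and it runs in base R only, without SparkR or a Spark session.

```r
# Base R only; illustrative data, not from the issue.
df <- data.frame(name = c("a", "b"), age = c(30, 25))

# The tag 'age' matches an existing name, so that column is replaced.
replaced <- transform(df, age = age + 2)
print(names(replaced))

# The tag 'age2' matches no existing name, so it is appended as a new column.
appended <- transform(df, age2 = age + 2)
print(names(appended))
```

With matching semantics in SparkR, mutate(df, age = df$age + 2) would likewise be expected to return a DataFrame with a single 'age' column.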