Repository: spark
Updated Branches:
  refs/heads/master 6847e93cf -> 0fcde87aa

[SPARK-21658][SQL][PYSPARK] Add default None for value in na.replace in PySpark

## What changes were proposed in this pull request?

JIRA issue: https://issues.apache.org/jira/browse/SPARK-21658

Add default None for value in `na.replace`, since `DataFrame.replace` and `DataFrameNaFunctions.replace` are aliases.

The default values are the same now.
```
>>> df = sqlContext.createDataFrame([('Alice', 10, 80.0)])
>>> df.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
>>> df.na.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
```

## How was this patch tested?

Existing tests.

cc viirya

Author: byakuinss <grace.chinha...@gmail.com>

Closes #18895 from byakuinss/SPARK-21658.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0fcde87a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0fcde87a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0fcde87a

Branch: refs/heads/master
Commit: 0fcde87aadc9a92e138f11583119465ca4b5c518
Parents: 6847e93
Author: byakuinss <grace.chinha...@gmail.com>
Authored: Tue Aug 15 00:41:01 2017 +0900
Committer: hyukjinkwon <gurwls...@gmail.com>
Committed: Tue Aug 15 00:41:01 2017 +0900

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/0fcde87a/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index edc7ca6..5cd208b 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1403,6 +1403,16 @@ class DataFrame(object):
         |null|  null|null|
         +----+------+----+
 
+        >>> df4.na.replace('Alice').show()
+        +----+------+----+
+        | age|height|name|
+        +----+------+----+
+        |  10|    80|null|
+        |   5|  null| Bob|
+        |null|  null| Tom|
+        |null|  null|null|
+        +----+------+----+
+
         >>> df4.na.replace(['Alice', 'Bob'], ['A', 'B'], 'name').show()
         +----+------+----+
         | age|height|name|
@@ -1837,7 +1847,7 @@ class DataFrameNaFunctions(object):
 
     fill.__doc__ = DataFrame.fillna.__doc__
 
-    def replace(self, to_replace, value, subset=None):
+    def replace(self, to_replace, value=None, subset=None):
         return self.df.replace(to_replace, value, subset)
 
     replace.__doc__ = DataFrame.replace.__doc__


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
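
Illustrative note: the change above amounts to giving a delegating alias the same defaults as the method it forwards to. The sketch below shows that pattern without a Spark installation. `Frame` and `NaFunctions` are hypothetical stand-ins invented for this example, not the real PySpark classes, and the replacement logic is deliberately simplified (it ignores `subset`).

```python
# Hypothetical stand-ins for illustration only; these are NOT the PySpark classes.
class Frame(object):
    def __init__(self, rows):
        self.rows = rows

    def replace(self, to_replace, value=None, subset=None):
        # A dict carries its own replacement values, so `value` can be omitted.
        # `subset` is accepted but ignored in this simplified sketch.
        if isinstance(to_replace, dict):
            mapping = dict(to_replace)
        else:
            mapping = {to_replace: value}
        return Frame([tuple(mapping.get(v, v) for v in row) for row in self.rows])

    @property
    def na(self):
        return NaFunctions(self)


class NaFunctions(object):
    def __init__(self, df):
        self.df = df

    # Same defaults as Frame.replace, so the alias accepts the same call shapes.
    def replace(self, to_replace, value=None, subset=None):
        return self.df.replace(to_replace, value, subset)


df = Frame([("Alice", 10, 80.0)])
direct = df.replace({"Alice": "a"}).rows
via_na = df.na.replace({"Alice": "a"}).rows
```

If the alias lacked the `value=None` default, the `df.na.replace({...})` call in this sketch would raise a `TypeError` for the missing argument, while the direct call would still work.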