Repository: spark
Updated Branches:
  refs/heads/master 6847e93cf -> 0fcde87aa


[SPARK-21658][SQL][PYSPARK] Add default None for value in na.replace in PySpark

## What changes were proposed in this pull request?
JIRA issue: https://issues.apache.org/jira/browse/SPARK-21658

Add default None for value in `na.replace` since `Dataframe.replace` and 
`DataframeNaFunctions.replace` are alias.

The default values are the same now.
```
>>> df = sqlContext.createDataFrame([('Alice', 10, 80.0)])
>>> df.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
>>> df.na.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
```

## How was this patch tested?
Existing tests.

cc viirya

Author: byakuinss <grace.chinha...@gmail.com>

Closes #18895 from byakuinss/SPARK-21658.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0fcde87a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0fcde87a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0fcde87a

Branch: refs/heads/master
Commit: 0fcde87aadc9a92e138f11583119465ca4b5c518
Parents: 6847e93
Author: byakuinss <grace.chinha...@gmail.com>
Authored: Tue Aug 15 00:41:01 2017 +0900
Committer: hyukjinkwon <gurwls...@gmail.com>
Committed: Tue Aug 15 00:41:01 2017 +0900

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/0fcde87a/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index edc7ca6..5cd208b 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1403,6 +1403,16 @@ class DataFrame(object):
         |null|  null|null|
         +----+------+----+
 
+        >>> df4.na.replace('Alice').show()
+        +----+------+----+
+        | age|height|name|
+        +----+------+----+
+        |  10|    80|null|
+        |   5|  null| Bob|
+        |null|  null| Tom|
+        |null|  null|null|
+        +----+------+----+
+
         >>> df4.na.replace(['Alice', 'Bob'], ['A', 'B'], 'name').show()
         +----+------+----+
         | age|height|name|
@@ -1837,7 +1847,7 @@ class DataFrameNaFunctions(object):
 
     fill.__doc__ = DataFrame.fillna.__doc__
 
-    def replace(self, to_replace, value, subset=None):
+    def replace(self, to_replace, value=None, subset=None):
         return self.df.replace(to_replace, value, subset)
 
     replace.__doc__ = DataFrame.replace.__doc__


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to