Repository: spark
Updated Branches:
  refs/heads/master 7d399c9da -> c1ad373f2

[SPARK-10782] [PYTHON] Update dropDuplicates documentation

Documentation for dropDuplicates() and drop_duplicates() is one and the same. Resolved the error in the example for drop_duplicates using the same approach used for groupby and groupBy, by indicating that dropDuplicates and drop_duplicates are aliases.

Author: asokadiggs <[email protected]>

Closes #8930 from asokadiggs/jira-10782.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c1ad373f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c1ad373f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c1ad373f

Branch: refs/heads/master
Commit: c1ad373f26053e1906fce7681c03d130a642bf33
Parents: 7d399c9
Author: asokadiggs <[email protected]>
Authored: Tue Sep 29 17:45:18 2015 -0400
Committer: Sean Owen <[email protected]>
Committed: Tue Sep 29 17:45:18 2015 -0400

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/c1ad373f/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index b09422a..033b319 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -931,6 +931,8 @@ class DataFrame(object):
         """Return a new :class:`DataFrame` with duplicate rows removed,
         optionally only considering certain columns.
 
+        :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
+
         >>> from pyspark.sql import Row
         >>> df = sc.parallelize([ \
             Row(name='Alice', age=5, height=80), \
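
For readers unfamiliar with why one docstring serves both names, here is a minimal sketch in plain Python. It assumes the alias is a second class-level name bound to the same function, which the commit message implies but the diff above does not show. TinyFrame, its dict-based rows, and the sample data are hypothetical stand-ins for illustration only; none of this is PySpark code.

```python
class TinyFrame(object):
    """Hypothetical stand-in for a DataFrame; not part of PySpark."""

    def __init__(self, rows):
        # Each row is a plain dict mapping column name -> hashable value.
        self.rows = list(rows)

    def dropDuplicates(self, subset=None):
        """Return a new TinyFrame with duplicate rows removed,
        optionally only considering certain columns.

        :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
        """
        seen = set()
        kept = []
        for row in self.rows:
            if subset is None:
                # Compare on every column.
                key = tuple(sorted(row.items()))
            else:
                # Compare only on the requested columns.
                key = tuple(row[col] for col in subset)
            if key not in seen:
                seen.add(key)
                kept.append(row)  # keep the first row seen for each key
        return TinyFrame(kept)

    # Assumed aliasing style: a second name for the same function object,
    # so both names necessarily share one docstring.
    drop_duplicates = dropDuplicates


# Illustrative data, made up for this sketch.
frame = TinyFrame([
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 10, "height": 80},
])
all_columns = frame.dropDuplicates()
some_columns = frame.drop_duplicates(["name", "height"])
```

Because both names resolve to one function, any example in the shared docstring can only call one of them. Stating the alias in the docstring, as this commit does, tells readers of either entry that the example applies to both.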
