chris snow created SPARK-11658:
----------------------------------
Summary: simplify documentation for PySpark combineByKey
Key: SPARK-11658
URL: https://issues.apache.org/jira/browse/SPARK-11658
Project: Spark
Issue Type: Improvement
Components: Documentation, PySpark
Affects Versions: 1.5.1
Reporter: chris snow
Priority: Minor
The current documentation for combineByKey looks like this:
{code}
>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> def f(x): return x
>>> def add(a, b): return a + str(b)
>>> sorted(x.combineByKey(str, add, add).collect())
[('a', '11'), ('b', '1')]
"""
{code}
I think it could be simplified to:
{code}
>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> def add(a, b): return a + str(b)
>>> x.combineByKey(str, add, add).collect()
[('a', '11'), ('b', '1')]
"""
{code}
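For reference, a minimal standalone sketch of the proposed example (assuming a local PySpark install; the SparkContext is created explicitly here, whereas the doctest relies on a pre-existing {{sc}}) could look like this:
{code}
# Minimal standalone sketch of the simplified combineByKey doc example.
# Assumes a local PySpark installation; in the doctest, `sc` already exists.
from pyspark import SparkContext

sc = SparkContext("local", "combineByKey-doc-example")

x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])

def add(a, b):
    # Used as both mergeValue and mergeCombiners: the accumulator `a` is a
    # string because createCombiner (str) turns the first value into a string.
    return a + str(b)

print(sorted(x.combineByKey(str, add, add).collect()))
# [('a', '11'), ('b', '1')]

sc.stop()
{code}
The same function can serve as both the merge-value and merge-combiners arguments here only because the combiner type is a plain string; sorted() keeps the doctest output deterministic regardless of partitioning.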
I'll add a patch for this shortly.