GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/15889#discussion_r88679304
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala ---
@@ -57,15 +57,18 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
* :: Experimental ::
* Generic function to combine the elements for each key using a custom set of aggregation
* functions. Turns an RDD[(K, V)] into a result of type RDD[(K, C)], for a "combined type" C
- * Note that V and C can be different -- for example, one might group an RDD of type
- * (Int, Int) into an RDD of type (Int, Seq[Int]). Users provide three functions:
+ *
+ * Users provide three functions:
*
* - `createCombiner`, which turns a V into a C (e.g., creates a one-element list)
* - `mergeValue`, to merge a V into a C (e.g., adds it to the end of a list)
* - `mergeCombiners`, to combine two C's into a single one.
*
* In addition, users can control the partitioning of the output RDD, and whether to perform
* map-side aggregation (if a mapper can produce multiple items with the same key).
+ *
+ * @note V and C can be different -- for example, one might group a RDD of type
--- End diff --
Ah... I see. Sure. I thought "a RDD" was correct, but I just realised it is not
after googling it. Will fix and replace. I see. It should be an