spark git commit: [SPARK-12638][API DOC] Parameter explanation not very accurate for rdd function "aggregate"

srowen Tue, 12 Jan 2016 05:21:13 -0800

Repository: spark
Updated Branches:
  refs/heads/branch-1.6 a6c9c68d8 -> 46fc7a12a



[SPARK-12638][API DOC] Parameter explanation not very accurate for rdd function "aggregate"

Currently, the parameters of the RDD function aggregate are not explained well, 
especially the parameter "zeroValue".
It is helpful to let junior Scala users know that "zeroValue" takes part in both 
the "seqOp" and "combOp" phases.
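
The point about "zeroValue" can be illustrated without a cluster. The sketch below is a local simulation in plain Scala, not Spark code: the object name AggregateZeroValueDemo and the helper simulateAggregate are hypothetical, and the partitions are passed in explicitly as nested sequences. It assumes partition results are merged in order, which Spark does not guarantee but which does not matter for addition.

```scala
// Local simulation of the aggregate semantics described above; no Spark dependency.
object AggregateZeroValueDemo {

  // Hypothetical helper: folds each partition with seqOp, then merges the
  // per-partition results with combOp. zeroValue seeds both phases.
  def simulateAggregate[T, U](partitions: Seq[Seq[T]])(zeroValue: U)(
      seqOp: (U, T) => U,
      combOp: (U, U) => U): U = {
    // Phase 1: every partition starts accumulating from zeroValue.
    val perPartition = partitions.map(_.foldLeft(zeroValue)(seqOp))
    // Phase 2: merging the partition results also starts from zeroValue.
    perPartition.foldLeft(zeroValue)(combOp)
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(1, 2), Seq(3, 4))
    val add = (acc: Int, x: Int) => acc + x

    // Neutral element: the result is the plain sum.
    println(simulateAggregate(partitions)(0)(add, add))  // prints 10

    // Non-neutral element: 10 is added once per partition and once more
    // in the combine phase, so (10+1+2) + (10+3+4) + 10.
    println(simulateAggregate(partitions)(10)(add, add)) // prints 40
  }
}
```

With two partitions, a non-neutral zeroValue is applied three times (once per partition plus once in the combine step), which is why the neutral element is the usual choice.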

Author: Tommy YU <tumm...@163.com>

Closes #10587 from Wenpei/rdd_aggregate_doc.

(cherry picked from commit 9f0995bb0d0bbe5d9b15a1ca9fa18e246ff90d66)
Signed-off-by: Sean Owen <so...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/46fc7a12
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/46fc7a12
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/46fc7a12

Branch: refs/heads/branch-1.6
Commit: 46fc7a12a30b82cf1bcaab0e987a98b4dace37fe
Parents: a6c9c68
Author: Tommy YU <tumm...@163.com>
Authored: Tue Jan 12 13:20:04 2016 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue Jan 12 13:20:29 2016 +0000

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/46fc7a12/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 9fe9d83..2fb8047 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -1071,6 +1071,13 @@ abstract class RDD[T: ClassTag](
    * apply the fold to each element sequentially in some defined ordering. For functions
    * that are not commutative, the result may differ from that of a fold applied to a
    * non-distributed collection.
+   *
+   * @param zeroValue the initial value for the accumulated result of each partition for the `op`
+   *                  operator, and also the initial value for the combine results from different
+   *                  partitions for the `op` operator - this will typically be the neutral
+   *                  element (e.g. `Nil` for list concatenation or `0` for summation)
+   * @param op an operator used to both accumulate results within a partition and combine results
+   *                  from different partitions
    */
   def fold(zeroValue: T)(op: (T, T) => T): T = withScope {
     // Clone the zero value since we will also be serializing it as part of tasks
@@ -1089,6 +1096,13 @@ abstract class RDD[T: ClassTag](
    * and one operation for merging two U's, as in scala.TraversableOnce. Both of these functions are
    * allowed to modify and return their first argument instead of creating a new U to avoid memory
    * allocation.
+   *
+   * @param zeroValue the initial value for the accumulated result of each partition for the
+   *                  `seqOp` operator, and also the initial value for the combine results from
+   *                  different partitions for the `combOp` operator - this will typically be the
+   *                  neutral element (e.g. `Nil` for list concatenation or `0` for summation)
+   * @param seqOp an operator used to accumulate results within a partition
+   * @param combOp an associative operator used to combine results from different partitions
    */
   def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): U = withScope {
     // Clone the zero value since we will also be serializing it as part of tasks


