Github user ceedubs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21971#discussion_r207246356

    --- Diff: core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala ---
    @@ -61,6 +62,36 @@ class AsyncRDDActions[T: ClassTag](self: RDD[T]) extends Serializable with Loggi
             (index, data) => results(index) = data, results.flatten.toSeq)
         }
    +
    +  /**
    +   * Returns a future of an aggregation across the RDD.
    +   *
    +   * @see [[RDD.aggregate]] which is the synchronous version of this method.
    +   */
    +  def aggregateAsync[U](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): FutureAction[U] =
    +    self.withScope {
    --- End diff --

    In the synchronous version of `aggregate`, the `zeroValue` is cloned, which requires adding an implicit `ClassTag[U]` argument. I didn't really understand the motivation for that, so I didn't do it here, but I was hoping that someone who understood the cloning could let me know here.
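
For illustration only, here is a minimal sketch of one hazard that cloning a zero value can avoid. It is written in plain Scala without Spark, and it is an assumption about the motivation, not a confirmed explanation of why `RDD.aggregate` clones. The documentation for `aggregate` allows `seqOp` and `combOp` to modify and return their first argument. If the driver-side running result were the same object as `zeroValue`, merging task results would mutate the caller's zero value. `ZeroValueAliasing` and `mergeResults` are hypothetical names standing in for the driver-side merge; neither is a Spark API.

```scala
import scala.collection.mutable.ArrayBuffer

object ZeroValueAliasing {
  // Hypothetical stand-in for a driver-side merge: folds per-partition results
  // into a running result that starts out as `initial`.
  def mergeResults[U](initial: U, partitionResults: Seq[U])(combOp: (U, U) => U): U =
    partitionResults.foldLeft(initial)(combOp)

  def main(args: Array[String]): Unit = {
    // A combOp that mutates and returns its first argument.
    val combOp = (a: ArrayBuffer[Int], b: ArrayBuffer[Int]) => a ++= b

    val partitionResults = Seq(ArrayBuffer(1, 2), ArrayBuffer(3))

    // No copy: the running result aliases the zero value, so the zero is mutated.
    val sharedZero = ArrayBuffer.empty[Int]
    mergeResults(sharedZero, partitionResults)(combOp)
    assert(sharedZero == ArrayBuffer(1, 2, 3))

    // With a copy: the caller's zero value stays empty.
    val zero = ArrayBuffer.empty[Int]
    val merged = mergeResults(zero.clone(), partitionResults)(combOp)
    assert(zero.isEmpty)
    assert(merged == ArrayBuffer(1, 2, 3))
  }
}
```

The sketch copies with `ArrayBuffer.clone()`, which only works because the element type is known. A generic `U` has no such method, so a generic implementation would presumably have to copy by round-tripping the value through a serializer, which would be consistent with the `ClassTag[U]` requirement mentioned above.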