My other question is: why does Spark provide aggregate but not foldLeft: *def foldLeft[U](zeroValue: U)(op: (U, T) => U): U*? The *def fold(zeroValue: T)(op: (T, T) => T): T* in Spark is not deterministic either.
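A small sketch of why a distributed foldLeft cannot exist with only that signature: each partition can fold its own elements with op: (U, T) => U, but the per-partition results are all of type U, and merging them requires a second function (U, U) => U, which is exactly what aggregate's combop supplies. The partition split below is made up by hand for illustration:

```scala
// Hypothetical sketch: why foldLeft alone cannot be distributed.
object FoldLeftSketch extends App {
  // Assume the data has been split into two partitions (hand-made here).
  val partitions = List(List(1, 2), List(3, 4))

  // Within each partition, op: (U, T) => U works fine (U = String, T = Int).
  val partials: List[String] = partitions.map(_.foldLeft("")(_ + _.toString))

  // But merging the partials needs (String, String) => String,
  // i.e. a combop in aggregate's sense -- foldLeft has no such parameter.
  val combined: String = partials.reduce(_ + _)
  println(combined) // 1234
}
```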
On Thu, Oct 30, 2014 at 3:50 PM, Jason Zaugg <jza...@gmail.com> wrote:

> On Thu, Oct 30, 2014 at 5:39 PM, Xuefeng Wu <ben...@gmail.com> wrote:
>
>> scala> import scala.collection.GenSeq
>> scala> val seq = GenSeq("This", "is", "an", "example")
>>
>> scala> seq.aggregate("0")(_ + _, _ + _)
>> res0: String = 0Thisisanexample
>>
>> scala> seq.par.aggregate("0")(_ + _, _ + _)
>> res1: String = 0This0is0an0example
>
> /** Aggregates the results of applying an operator to subsequent elements.
>  *
>  *  This is a more general form of `fold` and `reduce`. It has similar
>  *  semantics, but does not require the result to be a supertype of the
>  *  element type. It traverses the elements in different partitions
>  *  sequentially, using `seqop` to update the result, and then applies
>  *  `combop` to results from different partitions. The implementation of
>  *  this operation may operate on an arbitrary number of collection
>  *  partitions, so `combop` may be invoked an arbitrary number of times.
>  ...
>  *  @tparam B      the type of accumulated results
>  *  @param z       the initial value for the accumulated result of the partition - this
>  *                 will typically be the neutral element for the `seqop` operator (e.g.
>  *                 `Nil` for list concatenation or `0` for summation) and may be evaluated
>  *                 more than once
>  *  @param seqop   an operator used to accumulate results within a partition
>  *  @param combop  an associative operator used to combine results from different partitions
>  */
> def aggregate[B](z: =>B)(seqop: (B, A) => B, combop: (B, B) => B): B
>
> The contract of aggregate allows for this; if you need deterministic
> results, you need to choose a z that is the neutral element for combop. In
> your example, this would be the empty string.
>
> -jason

-- 
~Yours, Xuefeng Wu/吴雪峰 敬上
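A sketch of why the zero value must be the neutral element of combop. It simulates aggregate over two hand-made partitions; a real parallel collection may split the data differently (and into a different number of partitions), which is exactly where the non-determinism in res1 comes from:

```scala
// Simulated aggregate over a fixed, hand-made partitioning.
object AggregateSketch extends App {
  val partitions = List(List("This", "is"), List("an", "example"))

  def aggregate(z: String)(seqop: (String, String) => String,
                           combop: (String, String) => String): String =
    partitions.map(_.foldLeft(z)(seqop)).reduce(combop)

  // A non-neutral z ("0") gets folded into EVERY partition,
  // so the result depends on how many partitions there are:
  println(aggregate("0")(_ + _, _ + _)) // 0Thisis0anexample

  // The neutral element of string concatenation ("") is safe:
  println(aggregate("")(_ + _, _ + _))  // Thisisanexample
}
```

With z = "", the result is the same no matter how the collection is partitioned, which is the deterministic behaviour the thread is after.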