Oh I see, I think you're trying to do something like this (in SQL): SELECT order, AVG(price) FROM orders GROUP BY order
In this case, I'm not aware of a way to use the DoubleRDDFunctions directly. stats is only available on an RDD of doubles (or other numeric values), whereas after groupByKey you have a single RDD of pairs where each pair is of type (KeyType, Iterable[Double]). Calling .values on that gives an RDD[Iterable[Double]], which is why the compiler looks for a Numeric[Iterable[Double]] and cannot find one.

It seems to me that you want to write a function:

    def stats(numList: Iterable[Double]): org.apache.spark.util.StatCounter

and then use:

    pairRdd.mapValues(value => stats(value))

On Fri, Sep 12, 2014 at 5:05 PM, rzykov <rzy...@gmail.com> wrote:
> Tried this:
>
> ordersRDD.join(ordersRDD).map{case((partnerid, itemid),((matchedida,
> pricea), (matchedidb, priceb))) => ((matchedida, matchedidb), (if(priceb >
> 0) (pricea/priceb).toDouble else 0.toDouble))}
> .groupByKey
> .values.stats
> .first
>
> Error:
> <console>:37: error: could not find implicit value for parameter num:
> Numeric[Iterable[Double]]
> .values.stats
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p14065.html
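
For illustration, the stats / mapValues suggestion above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the String keys, the sample values, the local master setting, and the StatsByKey object name are all made up for the example and stand in for the (key, ratio) pairs produced by the join in the quoted code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits on older Spark 1.x; not needed on later versions
import org.apache.spark.rdd.RDD
import org.apache.spark.util.StatCounter

object StatsByKey {
  // Summarise all values of one key (count, mean, stdev, ...) in a StatCounter.
  def stats(numList: Iterable[Double]): StatCounter = StatCounter(numList)

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just to make the sketch runnable on its own.
    val conf = new SparkConf().setAppName("StatsByKey").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data in place of the joined (key, ratio) pairs.
      val ratios: RDD[(String, Double)] =
        sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 2.0)))

      // groupByKey yields (key, Iterable[Double]); mapValues keeps the key
      // and replaces each group with its summary statistics.
      val statsByKey: RDD[(String, StatCounter)] =
        ratios.groupByKey().mapValues(value => stats(value))

      statsByKey.collect().foreach { case (key, s) =>
        println(s"$key: count=${s.count} mean=${s.mean} stdev=${s.stdev}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Note that groupByKey materialises every value of a key before stats runs. If that becomes a problem for large groups, one possible alternative is aggregateByKey with a StatCounter as the accumulator, merging values as they arrive.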