Re: Re: how to use DoubleRDDFunctions on mllib Vector?

2015-07-09 Thread 诺铁
OK, got it, thanks. On Thu, Jul 9, 2015 at 12:02 PM, prosp4300 wrote: > > > Seems what Feynman mentioned is the source code instead of documentation, > vectorMean is private, see > > https://github.com/apache/spark/blob/v1.3.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtu

Re: Re: how to use DoubleRDDFunctions on mllib Vector?

2015-07-08 Thread prosp4300
It seems that what Feynman mentioned is in the source code rather than the documentation; vectorMean is private. See https://github.com/apache/spark/blob/v1.3.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala At 2015-07-09 10:10:58, "诺铁" wrote: thanks, I understand now. but I c

Re: how to use DoubleRDDFunctions on mllib Vector?

2015-07-08 Thread 诺铁
Thanks, I understand now. But I can't find mllib.clustering.GaussianMixture#vectorMean. What version of Spark do you use? On Thu, Jul 9, 2015 at 1:16 AM, Feynman Liang wrote: > A RDD[Double] is an abstraction for a large collection of doubles, > possibly distributed across multiple nodes. The

Re: how to use DoubleRDDFunctions on mllib Vector?

2015-07-08 Thread Feynman Liang
An RDD[Double] is an abstraction for a large collection of doubles, possibly distributed across multiple nodes. The DoubleRDDFunctions are there for performing mean and variance calculations across this distributed dataset. In contrast, a Vector is not distributed and fits on your local machine. Yo
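
To illustrate the distinction drawn above, here is a minimal sketch, not taken from the thread. It assumes Spark 1.3 or later with spark-mllib on the classpath and a local master; the object name `VectorStatsSketch`, the app name, and the sample values are all made up for illustration. It computes mean and variance once on an RDD[Double] through DoubleRDDFunctions, and once on a local Vector with plain Scala collections.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical example object; the name and the sample data are assumptions.
object VectorStatsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("vector-stats-sketch") // assumed app name
      .setMaster("local[2]")             // assumed local master for a quick test
    val sc = new SparkContext(conf)
    try {
      // Distributed case: an RDD[Double] gains mean() and variance() through the
      // implicit conversion to DoubleRDDFunctions. On Spark versions before 1.3
      // this also needs `import org.apache.spark.SparkContext._`.
      val doubles = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
      println(s"RDD mean = ${doubles.mean()}, variance = ${doubles.variance()}")

      // Local case: a Vector is not distributed, so convert it to an array and
      // use ordinary Scala collection operations.
      val v: Vector = Vectors.dense(1.0, 2.0, 3.0, 4.0)
      val values = v.toArray
      val mean = values.sum / values.length
      // Population variance (divides by n), matching DoubleRDDFunctions.variance().
      val variance = values.map(x => (x - mean) * (x - mean)).sum / values.length
      println(s"Vector mean = $mean, variance = $variance")
    } finally {
      sc.stop()
    }
  }
}
```

If the Vector's values really do need to go through DoubleRDDFunctions, `sc.parallelize(v.toArray)` turns them into an RDD[Double], though for data that already fits on one machine the local computation is likely the simpler choice.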