The Julia code computes the SVD of the Gram matrix (1/n)*X'*X of the uncentered data. PCA should be applied to the covariance matrix, so subtract the column means from X before forming Sigma.

-Xiangrui
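To see why the two matrices give different components, note that (1/n)*X'*X equals the 1/n-scaled covariance plus mu*mu', where mu is the vector of column means. The following is a minimal pure-Python sketch on made-up data (not the Iris measurements). All helper names are hypothetical, and power iteration stands in for a full SVD only to keep the example dependency-free. It checks that identity and compares the leading direction of each matrix:

```python
# Toy data, made up for illustration (not the Iris measurements).
# Column 0 is constant, so all of the variance lies along column 1.
X = [
    [10.0, 1.0],
    [10.0, 2.0],
    [10.0, 3.0],
    [10.0, 4.0],
    [10.0, 5.0],
]


def column_means(rows):
    """Mean of each column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scaled_gram(rows):
    """(1/n) * X' * X, the matrix called Sigma in the original post."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def covariance(rows):
    """Covariance with 1/n scaling: center the columns, then (1/n) * Xc' * Xc."""
    mu = column_means(rows)
    centered = [[v - m for v, m in zip(r, mu)] for r in rows]
    return scaled_gram(centered)


def leading_direction(matrix, iterations=500):
    """Leading eigenvector of a small symmetric PSD matrix by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


mu = column_means(X)
G = scaled_gram(X)   # what the posted snippet decomposes
C = covariance(X)    # what PCA should decompose

# Identity: G = C + mu * mu'
identity_holds = all(
    abs(G[i][j] - (C[i][j] + mu[i] * mu[j])) < 1e-12
    for i in range(2) for j in range(2)
)

pc_gram = leading_direction(G)
pc_cov = leading_direction(C)

print("identity holds:", identity_holds)
print("leading direction, Gram matrix:      ", pc_gram)
print("leading direction, covariance matrix:", pc_cov)
```

On this toy data the covariance matrix points along column 1, where the variance is, while the uncentered Gram matrix is pulled toward the mean vector (10, 3). Any dataset whose column means are far from zero should show a similar shift when the centering step is skipped.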
On Thu, Jan 8, 2015 at 8:27 AM, Upul Bandara <upulband...@gmail.com> wrote:
> Hi All,
>
> I tried to do PCA for the Iris dataset
> [https://archive.ics.uci.edu/ml/datasets/Iris] using MLLib
> [http://spark.apache.org/docs/1.1.1/mllib-dimensionality-reduction.html].
> Also, PCA was calculated in Julia using following method:
>
> Sigma = (1/numRow(X))*X'*X ;
> [U, S, V] = svd(Sigma);
> Ureduced = U(:, 1:k);
> Z = X*Ureduced;
>
> However, I'm seeing a little difference between values given by MLLib and
> the method shown above.
>
> Does anyone have any idea about this difference?
>
> Additionally, I have attached two visualizations, related to two approaches.
>
> Thanks,
> Upul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org