Hi Srinath,

I asked the Spark user group, and the answer was:

"The Julia code is computing the SVD of the Gram matrix. PCA should be
applied to the covariance matrix. -Xiangrui"

"You need to subtract mean values to obtain the covariance matrix
(http://en.wikipedia.org/wiki/Covariance_matrix) -Xiangrui"

So there was a small bug in how my Julia code calculated the covariance
matrix, and I have fixed it. Now the PCA results from R and Julia are
identical, but I can still see a small difference between the PCA values
given by Spark and those from the other two.

I will investigate this further and update you.

R Code:
-------------
data <- read.csv('/home/upul/Desktop/iris.csv');
X <- data[,1:4]
pca <- prcomp(X, center = TRUE, scale=FALSE)
transformed <- predict(pca, newdata = X)

Julia Code (Fixed)
--------------
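data = readcsv("/home/upul/temp/iris.csv");
X = data[:,1:end-1];          # drop the last (label) column
meanX = mean(X,1);            # column means, a 1 x n row vector
m,n = size(X);
X = X - repmat(meanX, m,1);   # center every column on its mean
u,s,v = svd(X);               # SVD of the centered data
transformed =  X*v;           # project onto the principal directions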
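
To make the Gram matrix vs. covariance matrix point concrete, here is a
minimal sketch in plain Python (it is not the Julia, R, or Spark code). It
assumes a made-up 3 x 2 data matrix, and the helper names (gram, covariance,
and so on) are hypothetical, chosen only for illustration.

```python
# Toy data (made up): 3 observations (rows) x 2 features (columns).
# The second feature is constant, so all the variance lies along feature 1.
X = [[1.0, 4.0],
     [2.0, 4.0],
     [3.0, 4.0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def center(A):
    # Subtract each column's mean from that column.
    m = len(A)
    means = [sum(col) / m for col in zip(*A)]
    return [[v - mu for v, mu in zip(row, means)] for row in A]

def gram(A):
    # Gram matrix of the columns: A^T * A, with no centering.
    return matmul(transpose(A), A)

def covariance(A):
    # Sample covariance: Gram matrix of the centered data divided by (m - 1).
    m = len(A)
    return [[v / (m - 1) for v in row] for row in gram(center(A))]

G = gram(X)        # numerically [[14, 24], [24, 48]]
C = covariance(X)  # numerically [[1, 0], [0, 0]]
print("Gram matrix:      ", G)
print("Covariance matrix:", C)
```

In this toy case the covariance matrix puts all the variance on the first
feature, while the uncentered Gram matrix is dominated by the column means
(2 and 4), so its leading direction is different. Centering the data before
the SVD is what makes the two computations agree.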


Thanks,
Upul

On Thu, Jan 8, 2015 at 7:04 PM, Srinath Perera <[email protected]> wrote:

> And very good job paying attention to detail and double checking Upul!
>
> On Thu, Jan 8, 2015 at 7:04 PM, Srinath Perera <[email protected]> wrote:
>
>> Please send these mails to the mailing list (dev@).
>>
>> Write to the Spark list and ask why.
>>
>> --Srinath
>>
>> On Wed, Jan 7, 2015 at 3:09 PM, Upul Bandara <[email protected]> wrote:
>>
>>> Hi Srinath,
>>>
>>> Attached, please find the scatter plots of PCA
>>> (using the Spark API and Julia) for the iris dataset.
>>>
>>> Thanks,
>>> Upul
>>>
>>> --
>>> Upul Bandara,
>>> Mob: +94 715 468 345.
>>>
>>
>>
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>> Site: http://people.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>>
>



-- 
Upul Bandara,
Associate Technical Lead, WSO2, Inc.,
Mob: +94 715 468 345.