Hello Everyone,

I assumed that performing PCA on X is equivalent to performing an SVD on
a mean-centered X.

For sklearn.decomposition.PCA, the input matrix has shape (n_samples, n_features).

However, when I perform an SVD on a matrix X shaped (n_features, n_samples),
some of the singular vectors don't match pca.components_.
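
For context, the base assumption does hold in the samples-as-rows layout; here is a quick standalone check (my own sketch, up to the usual per-component sign ambiguity):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(10, 4)                      # (n_samples, n_features)

pca = PCA(n_components=4).fit(X)
u, s, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)

# Each row of pca.components_ equals a row of vt, up to a sign flip.
signs = np.sign(np.sum(pca.components_ * vt, axis=1))
print(np.allclose(pca.components_, signs[:, None] * vt))
```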


import numpy as np
from sklearn.decomposition import PCA


def sklearn_pca(X):
    # X has shape (n_samples, n_features)
    pca = PCA(n_components=4)
    pca.fit(X)
    return pca.components_


def svd_pca(X):
    # every sample is a row: X has shape (n_samples, n_features)
    X = X - np.mean(X, axis=0)
    u, s, v = np.linalg.svd(X, full_matrices=False)
    return u, s, v

def svd_t(X):
    # X has shape (n_features, n_samples): every sample is a column
    mean = np.mean(X, axis=1)
    mean = mean[:, np.newaxis]
    X = X - mean
    u, s, v = np.linalg.svd(X, full_matrices=False)
    return u, s, v

def test_svd_pca():
    rng = np.random.RandomState(1)
    X = rng.randn(4, 10)      # 4 samples, 10 features
    # no need to copy X: none of the functions above mutate it in place
    pca_comp = sklearn_pca(X)
    u, s, v = svd_pca(X)
    ut, st, vt = svd_t(X.T)


I expected pca.components_ to be equivalent to both v and vt.T. While it
matches v, I get mismatches with vt.T.
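
A quick standalone check of the transpose identity may help locate the mismatch (my own sketch, not part of the functions above): if Xc = u @ diag(s) @ v, then Xc.T = v.T @ diag(s) @ u.T, so transposing the data moves the feature-space axes from v into u rather than into vt.T. I use a (10, 4) matrix here so the centered data has full rank and the singular vectors are well defined:

```python
import numpy as np

rng = np.random.RandomState(1)
X = rng.randn(10, 4)                      # 10 samples as rows, 4 features
Xc = X - X.mean(axis=0)

u, s, v = np.linalg.svd(Xc, full_matrices=False)       # samples as rows
ut, st, vt = np.linalg.svd(Xc.T, full_matrices=False)  # samples as columns

print(np.allclose(s, st))                    # singular values agree
print(np.allclose(np.abs(ut), np.abs(v.T)))  # u of Xc.T matches v.T of Xc, up to sign
```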

Is it right to expect vt.T to be the same as pca.components_? If not,
please help me clear up my misunderstanding.

-- 
With Regards,
Deepak

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
