Hello Everyone,
I assumed that doing a PCA on X is equivalent to performing an SVD on
a mean-centered X.
For sklearn's PCA the input matrix has shape (n_samples, n_features).
When I perform an SVD on a matrix X of shape (n_features, n_samples),
some of the eigenvectors don't match pca.components_:
import numpy as np
from sklearn.decomposition import PCA

def sklearn_pca(X):
    # X.shape = (n_samples, n_features)
    pca = PCA(n_components=4)
    pca.fit(X)
    return pca.components_

def svd_pca(X):
    # X.shape = (n_samples, n_features): every sample is a row
    X = X - np.mean(X, axis=0)
    u, s, v = np.linalg.svd(X, full_matrices=False)
    return u, s, v

def svd_t(X):
    # X.shape = (n_features, n_samples): every sample is a column
    mean = np.mean(X, axis=1)
    mean = mean[:, np.newaxis]
    X = X - mean
    u, s, v = np.linalg.svd(X, full_matrices=False)
    return u, s, v
def test_svd_pca():
    rng = np.random.RandomState(1)
    X = rng.randn(4, 10)
    XT = np.copy(X)
    PX = np.copy(X)
    pca_comp = sklearn_pca(PX)
    u, s, v = svd_pca(X)
    ut, st, vt = svd_t(X.T)
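For reference, here is a self-contained version of the comparison I am making, with per-component sign flips normalized (the helper name `same_up_to_sign` is mine, and the u/v swap under transposition is my reading of the algebra, so treat it as a sketch, not a definitive answer):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(1)
X = rng.randn(4, 10)                 # (n_samples, n_features)

pca = PCA(n_components=4)
pca.fit(X)

# SVD of the mean-centered X, samples as rows: Xc = u @ diag(s) @ vh
Xc = X - X.mean(axis=0)
u, s, vh = np.linalg.svd(Xc, full_matrices=False)

# SVD of the transpose, samples as columns: Xc.T = ut @ diag(st) @ vht
ut, st, vht = np.linalg.svd(Xc.T, full_matrices=False)

# Transposing the second factorization gives Xc = vht.T @ diag(st) @ ut.T,
# so ut.T (rather than vht.T) plays the role of vh / pca.components_.
def same_up_to_sign(A, B):
    # Singular vectors are only defined up to sign; normalize each row's sign.
    signs = np.sign(np.sum(A * B, axis=1))
    return np.allclose(A, signs[:, None] * B)

# With 4 samples, centering leaves rank at most 3, so the 4th component is an
# arbitrary null-space direction; compare only the first 3 components.
print(same_up_to_sign(pca.components_[:3], vh[:3]))
print(same_up_to_sign(pca.components_[:3], ut.T[:3]))
```

The rank point may also matter here: with 4 samples the centered matrix has rank 3, so the last singular vector is not uniquely determined and can differ between the two SVDs even after sign normalization.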
I expected pca.components_ to be equivalent to both v and vt.T. While it
matches v, I get mismatches with vt.T.
Is it right to expect vt.T to be the same as pca.components_? If not,
please help me clear up my misunderstanding.
--
With Regards,
Deepak
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general