Hi all, I've been using scikit-learn's sparse PCA class to analyze some data, and I'd like to characterize the amount of variance explained by each component. I've consulted off-list about this a bit, and from that correspondence I understand that calculating explained variance for sparse PCA is more complex than for non-sparse PCA because sparse PCA ignores (or reduces the priority of) the orthogonality constraint of non-sparse PCA. However, the original Zou et al. 2006 sparse PCA paper (http://www.tandfonline.com/doi/abs/10.1198/106186006X113430) indicates that the problem is not intractable, and they offer a solution (Eq. 3.19).
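For reference, here is a rough sketch of how the adjusted variance from Zou et al. is usually described: project the centered data onto the unit-norm loadings, take a QR decomposition of the resulting scores, and treat the squared diagonal of R as the variance of each component after removing what earlier components already explain. The function name `adjusted_explained_variance` and the hand-made loadings `W` are hypothetical and only for illustration; with sklearn one would presumably pass `spca.components_` instead. This is a reading of the paper's idea, not a checked reimplementation of it.

```python
import numpy as np


def adjusted_explained_variance(X, components):
    """Fraction of total variance per component, adjusted for correlation.

    X          : (n_samples, n_features) data matrix
    components : (n_components, n_features) loadings, e.g. spca.components_
    """
    Xc = X - X.mean(axis=0)                    # center each feature
    norms = np.linalg.norm(components, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                    # avoid dividing all-zero loadings by 0
    V = (components / norms).T                 # unit-norm loadings as columns
    Z = np.dot(Xc, V)                          # scores ("modified" components)
    R = np.linalg.qr(Z, mode="r")              # Z = QR; only R is needed
    total = np.linalg.norm(Xc, "fro") ** 2     # total sum of squares of centered data
    return np.diag(R) ** 2 / total


rng = np.random.RandomState(1)
X = rng.randn(50, 20)

# Hypothetical sparse loadings, made up purely for illustration
W = np.zeros((3, 20))
W[0, :4] = 1.0
W[1, 2:7] = 1.0               # overlaps W[0], so the two score vectors are correlated
W[2, 10:12] = [1.0, -1.0]

ratios = adjusted_explained_variance(X, W)
print(ratios, ratios.sum())
```

When the loadings are orthonormal (ordinary PCA), the scores are already uncorrelated, R is diagonal, and the ratios reduce to the usual explained variance ratios. With correlated sparse components, each ratio counts only the part not already accounted for by the components before it, so the result depends on component order.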
The suggestion that I received for calculating explained variance would be implemented in sklearn as follows:

    from sklearn.decomposition import SparsePCA
    import numpy as np

    np.random.seed(1)
    X = np.random.randn(50, 20)

    spca = SparsePCA()
    Xr = spca.fit_transform(X)

    # Rank-1 reconstruction from component 0: its scores (first column
    # of Xr) times its loadings (first row of components_)
    fro_comp0 = np.linalg.norm(np.outer(Xr[:, 0], spca.components_[0]), 'fro')
    fro_full = np.linalg.norm(X, 'fro')
    var_exp0 = fro_comp0 ** 2. / fro_full ** 2.
    print(fro_comp0, fro_full, var_exp0)

This seems to be very much in line with the Zou et al. suggestion, but my matrix algebra is not up to the task of rigorously evaluating the implementation. What do you think of this approach?

Many thanks for any help!

Best,
Dave

--
David E. Warren
Associate
Department of Neurology
Carver College of Medicine
University of Iowa Hospitals and Clinics
david-e-war...@uiowa.edu