[R] prcomp - principal components in R
Hello,

I don't understand the output of prcomp. When I reduce the number of components, the output still shows 100% cumulative variance explained, which can't be right when dropping from 8 components to 3.

How do I get the output in terms of the cumulative % of the total variance, so that when I go from the full solution of 8 components (8 variables in the data set) to a reduced number of components, I can evaluate the % of variance explained? Or am I missing something?

8 variables in the data set:

princ = prcomp(df[,-1],rotate=varimax,scale=TRUE)
summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.0000*

princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.75)
summary(princ)
Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
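A side note on the call itself: as far as I can tell, prcomp has no rotate argument, so rotate=varimax is passed through "..." and has no effect (recent R versions may warn about the extra argument). The output above is therefore an ordinary, unrotated PCA. If a varimax rotation is wanted, one option is to apply stats::varimax to the retained loadings afterwards. A minimal sketch on simulated data; the matrix x_demo, its dimensions and the choice of 3 components are assumptions for illustration, not the poster's data:

```r
# Simulated stand-in for the poster's data: 100 rows, 8 variables (assumed).
set.seed(42)
x_demo <- matrix(rnorm(100 * 8), ncol = 8)
colnames(x_demo) <- paste0("V", 1:8)

# Plain PCA on standardized variables.
pc_plain <- prcomp(x_demo, scale. = TRUE)

# Same call with an extra 'rotate' argument, as in the post.
# suppressWarnings() because some R versions warn about unused arguments.
pc_rot_arg <- suppressWarnings(prcomp(x_demo, rotate = varimax, scale. = TRUE))

# The two fits agree, i.e. 'rotate' changed nothing.
all.equal(pc_plain$sdev, pc_rot_arg$sdev)
all.equal(pc_plain$rotation, pc_rot_arg$rotation)

# Varimax applied explicitly to the first 3 loading vectors.
k_demo <- 3
vm <- varimax(pc_plain$rotation[, 1:k_demo])
vm$loadings   # rotated loadings (8 x 3)
vm$rotmat     # 3 x 3 rotation matrix
```

Note that rotating the loadings redistributes variance among the rotated axes, so the "Proportion of Variance" row of summary() no longer applies to them one-to-one.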
Re: [R] prcomp - principal components in R
Okay, an extreme case: only 1 component, and it explains 100%. Something weird is going on.

princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.95)
summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00

stephen sefick wrote:
> Principal components is a data reduction technique. It looks like you have three axes that account for 100%. Make this reproducible.
>
> On Mon, Nov 9, 2009 at 11:37 AM, zubin binab...@bellsouth.net wrote:
>> Hello, not understanding the output of prcomp, I reduce the number of components and the output continues to show cumulative 100% of the variance explained [...]
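The tol argument may explain why the number of components changes: according to the prcomp documentation, components are omitted when their standard deviation is less than or equal to tol times the standard deviation of the first component. A small sketch applying that rule to the eight standard deviations reported in the first post (the helper name n_kept is made up for illustration):

```r
# Standard deviations of the eight components, copied from the first post.
sdev_full <- c(1.381, 1.247, 1.211, 0.994, 0.927, 0.764, 0.6708, 0.4366)

# Hypothetical helper: number of components that survive a given tol,
# using the documented rule sdev > tol * sdev[1].
n_kept <- function(sdev, tol) sum(sdev > tol * sdev[1])

n_kept(sdev_full, 0.75)  # threshold 0.75 * 1.381 = 1.03575 -> 3 components
n_kept(sdev_full, 0.95)  # threshold 0.95 * 1.381 = 1.31195 -> 1 component
```

This reproduces the 3 components seen with tol=.75 and the single component seen with tol=.95, so tol selects components by their size relative to PC1 rather than by a share of variance.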
Re: [R] prcomp - principal components in R
Look at it linearly?

On Mon, Nov 9, 2009 at 11:45 AM, zubin binab...@bellsouth.net wrote:
> okay, an extreme case, only 1 component, explains 100%, something weird going on.. [...]

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.

-K. Mullis
Re: [R] prcomp - principal components in R
In the first PCA you ask how much variance of the EIGHT (!) variables is captured by the first, second, ..., eighth principal component. In the second PCA you ask how much variance of the THREE (!) variables is captured by the first, second, and third principal component. Of course you need only as many PCs as there are variables to capture 100% of the variance. Your problem thus comes from the fact that you have eight variables in the first PCA, which requires eight PCs to capture 100%, and only three variables in the second PCA, which naturally requires only three PCs to capture 100% of the variance.

So it is more a case of "yes, you are missing something" than of something being wrong with the analyses.

HTH,
Daniel

- cuncta stricte discussurus -

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of zubin
Sent: Monday, November 09, 2009 12:37 PM
To: r-help@r-project.org
Subject: [R] prcomp - principal components in R

> Hello, not understanding the output of prcomp, I reduce the number of components and the output continues to show cumulative 100% of the variance explained [...]
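Whether a fit was run on fewer variables or merely kept fewer components can be checked from the rotation matrix, which has one row per variable and one column per retained component. A minimal sketch on simulated data; x_sim, its size and the tol value are assumptions for illustration:

```r
set.seed(7)
x_sim <- matrix(rnorm(200 * 8), ncol = 8)   # 8 simulated variables
colnames(x_sim) <- paste0("V", 1:8)

pc_all <- prcomp(x_sim, scale. = TRUE)              # all components
pc_few <- prcomp(x_sim, scale. = TRUE, tol = 0.95)  # weak components dropped

dim(pc_all$rotation)   # rows = variables, columns = components kept
dim(pc_few$rotation)   # same number of rows, fewer columns
```

If the row count stays at 8, all eight variables are still in the analysis and only the number of reported components has changed.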
Re: [R] prcomp - principal components in R
All 8 variables are still in the analysis; I am just reducing the number of components being estimated, I thought. For example, with 1 component and 8 variables, there is no way 1 component explains 100% of the variance of the 8-variable data set.

princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.95)
summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00

summary(princ)
Rotation:
                PC1
VIX0    -0.08217686
UUP0    -0.18881983
USO0     0.26647346
GLD0     0.26983923
HYG0     0.60674758
term0    0.18220237
spread0  0.61614047
TNX0     0.18111684

Daniel Malter wrote:
> In the first PCA you ask how much variance of the EIGHT (!) variables is captured by the first, second,..., eigth principal component. [...]
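With scale=TRUE every variable has unit variance, so the total variance of an 8-variable data set is 8, and a component's share of it is its squared standard deviation divided by 8. A rough check using the rounded figures printed in the first post (the only assumption is that rounding error in those printed values is negligible):

```r
# Standard deviations from the full 8-component output in the first post.
sdev_full <- c(1.381, 1.247, 1.211, 0.994, 0.927, 0.764, 0.6708, 0.4366)
p <- length(sdev_full)          # 8 standardized variables, total variance 8

sum(sdev_full^2)                # close to 8, up to rounding
sdev_full[1]^2 / p              # share of PC1 alone, about 0.238
cumsum(sdev_full[1:3]^2) / p    # first three components, ending near 0.616
```

So the single component kept with tol=.95 accounts for roughly 24% of the total variance, and the three kept with tol=.75 for roughly 62%, in line with the Cumulative Proportion row of the full 8-component summary.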
Re: [R] prcomp - principal components in R
Hi:

I'm not familiar with prcomp, but with the principal components function in Bill Revelle's psych package one can specify the number of components one wants to use to build the closest covariance matrix. I don't know what tol is doing in your example, but it's not doing that.

mark

On Nov 9, 2009, zubin binab...@bellsouth.net wrote:
> All 8 variables are still in the analysis, i am just reducing the number of components being estimated i thought.. [...]
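The idea of using only the first k components to approximate the covariance (here: correlation) matrix can also be sketched with prcomp alone. This is not the psych function mentioned above, just the underlying low-rank reconstruction, which by the Eckart-Young theorem is the best rank-k approximation in the least-squares sense. The data x_sim, the value of k and the helper approx_cov are assumptions for illustration:

```r
set.seed(3)
x_sim <- matrix(rnorm(150 * 8), ncol = 8)   # assumed data, 8 variables
pc <- prcomp(x_sim, scale. = TRUE)

# Hypothetical helper: rank-k approximation V_k diag(sdev_k^2) V_k'
approx_cov <- function(pc, k) {
  v <- pc$rotation[, seq_len(k), drop = FALSE]
  v %*% diag(pc$sdev[seq_len(k)]^2, nrow = k) %*% t(v)
}

r_full <- cor(x_sim)        # correlation matrix of the data
r_3 <- approx_cov(pc, 3)    # built from the first 3 components
r_8 <- approx_cov(pc, 8)    # built from all 8 components

max(abs(r_8 - r_full))               # essentially zero: all 8 reproduce cor(x_sim)
sum(diag(r_3)) / sum(diag(r_full))   # share of total variance kept by k = 3
```

The last line is the same quantity as the cumulative proportion of total variance for three components, computed from the reconstructed matrix instead of from sdev.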
Re: [R] prcomp - principal components in R
The output of summary() for a prcomp object displays the cumulative amount of variance explained relative to the total variance explained by the principal components PRESENT in the object. So it is always guaranteed to be 100% for the last principal component present. You can see this from the code in summary.prcomp() (view it with getAnywhere(summary.prcomp)).

Here's how to get the output you want (the last line in the transcript below):

> set.seed(1)
> summary(pc1 <- prcomp(x))
Importance of components:
                         PC1   PC2   PC3   PC4   PC5
Standard deviation     1.175 1.058 0.976 0.916 0.850
Proportion of Variance 0.275 0.223 0.190 0.167 0.144
Cumulative Proportion  0.275 0.498 0.688 0.856 1.000
> summary(pc2 <- prcomp(x, tol=0.8))
Importance of components:
                        PC1   PC2   PC3
Standard deviation     1.17 1.058 0.976
Proportion of Variance 0.40 0.324 0.276
Cumulative Proportion  0.40 0.724 1.000
> pc2$sdev
[1] 1.1749061 1.0581362 0.9759016
> pc1$sdev
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1)
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> cumsum(pc1$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643 0.8558386 1.0000000
> # output in terms of the cumulative % of the total variance
> cumsum(pc2$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643

It's probably better to have prcomp compute all the components in the first place, because the SVD is the bulk of the computation anyway (so doing it again will be slower for large matrices). Then just look at the most important principal components. However, there may be a shortcut for computing only the values of D in the SVD of a matrix; you could look for that if you have demanding computations (e.g., the square roots of the eigenvalues of the covariance matrix of the centered x: sqrt(eigen(var(scale(x, center=TRUE, scale=FALSE)), only.values=TRUE)$values)).
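The calculation above can be wrapped in a small helper. The sketch below is one possible way to do it and is not part of prcomp: the function name cum_prop_total and the simulated matrix x_sim are made up for illustration. It takes the total variance from the centered (and, if applicable, scaled) data rather than from the length of sdev, so it should not matter whether a given R version truncates sdev when tol is used:

```r
# Hypothetical helper: cumulative share of the TOTAL variance for the
# components retained in a prcomp fit.
cum_prop_total <- function(pc, x) {
  # Re-apply the centering/scaling stored in the fit.
  xs <- scale(x, center = pc$center, scale = pc$scale)
  total_var <- sum(apply(xs, 2, var))
  k <- ncol(pc$rotation)               # number of retained components
  cumsum(pc$sdev[seq_len(k)]^2) / total_var
}

set.seed(1)
# Assumed data: 5 independent variables with very different spreads,
# so that a tol of 0.3 clearly drops the weakest components.
x_sim <- sweep(matrix(rnorm(100 * 5), ncol = 5), 2, c(10, 8, 6, 1, 0.5), "*")

pc_all <- prcomp(x_sim)
pc_few <- prcomp(x_sim, tol = 0.3)

cum_prop_total(pc_all, x_sim)   # reaches 1 at the last component
cum_prop_total(pc_few, x_sim)   # same leading values, stops short of 1
```

Called on a fit with fewer components, the helper returns the leading entries of the full cumulative sequence, which is the figure the original question asked for.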
-- Tony Plate

zubin wrote:
> Hello, not understanding the output of prcomp, I reduce the number of components and the output continues to show cumulative 100% of the variance explained [...]
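The eigenvalue shortcut mentioned at the end of the message above can be checked numerically. A small sketch, assuming a simulated matrix x_sim (not the data from the thread): the square roots of the eigenvalues of the covariance matrix should agree with the scaled singular values of the centered data, and hence with the sdev component returned by prcomp.

```r
set.seed(1)
x_sim <- matrix(rnorm(100 * 5), ncol = 5)   # assumed data, 5 variables

xc <- scale(x_sim, center = TRUE, scale = FALSE)            # centered only
sd_svd <- svd(xc)$d / sqrt(nrow(x_sim) - 1)                 # via the SVD
sd_eig <- sqrt(eigen(var(xc), only.values = TRUE)$values)   # via eigenvalues

all.equal(sd_svd, sd_eig)
all.equal(sd_svd, prcomp(x_sim)$sdev)
```

The eigenvalue route works on the small p x p covariance matrix, which may be cheaper than a second SVD when there are many more rows than columns.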