[R] prcomp - principal components in R

2009-11-09 Thread zubin
Hello, I am not understanding the output of prcomp. When I reduce the number of 
components, the output continues to show a cumulative 100% of the 
variance explained, which can't be the case when dropping from 8 components 
to 3. 

How do I get the output in terms of the cumulative % of the total 
variance, so that when I go from the full solution of 8 components (8 variables 
in the data set) to a reduced number of components, I can evaluate the % of 
variance explained? Or am I missing something?

8 variables in the data set

  princ = prcomp(df[,-1],rotate=varimax,scale=TRUE)
  summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.0000*

  princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.75)
  summary(princ)

Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prcomp - principal components in R

2009-11-09 Thread zubin
Okay, an extreme case: only 1 component, and it explains 100%. Something weird 
is going on.

  princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.95)
  summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00

stephen sefick wrote:
 Principal components is a data reduction technique.  It looks like
 you have three axes that account for 100%.  Make this reproducible.



Re: [R] prcomp - principal components in R

2009-11-09 Thread stephen sefick
Look at it linearly?


-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] prcomp - principal components in R

2009-11-09 Thread Daniel Malter
In the first PCA you ask how much variance of the EIGHT (!) variables is
captured by the first, second, ..., eighth principal component.

In the second PCA you ask how much variance of the THREE (!) variables is
captured by the first, second, and third principal component.

Of course you need only as many PCs as there are variables to capture 100 %
of the variance. Your problem thus comes from the fact that you have eight
variables in the first PCA, which requires eight PCs to capture 100%, and
that you have only three variables in the second PCA, which naturally only
requires three PCs to capture 100% of the variance.

So it's more, yes, you are missing something in this case, rather than that
something is wrong with the analyses.

HTH,
Daniel

-
cuncta stricte discussurus
-



Re: [R] prcomp - principal components in R

2009-11-09 Thread zubin
All 8 variables are still in the analysis; I thought I was just reducing 
the number of components being estimated.


Example: 1 component, 8 variables. There is no way 1 component explains 
100% of the variance of the 8-variable data set.


 princ = prcomp(df[,-1],rotate=varimax,scale=TRUE,tol=.95)
 summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00

 summary(princ)

Rotation:
                PC1
VIX0    -0.08217686
UUP0    -0.18881983
USO0     0.26647346
GLD0     0.26983923
HYG0     0.60674758
term0    0.18220237
spread0  0.61614047
TNX0     0.18111684
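
A minimal sketch of how one might check that all variables are still in the analysis, assuming made-up data in place of df. The matrix `dat`, its column spreads, and the names `pc_full` and `pc_tol` are all hypothetical. It only compares the dimensions of the rotation matrix with and without tol:

```r
# Hypothetical stand-in for the poster's data: 100 rows, 8 numeric variables.
# The first column is given a much larger spread so that one component dominates.
set.seed(42)
dat <- matrix(rnorm(100 * 8), ncol = 8) %*% diag(c(10, 3, 2, 1, 1, 1, 1, 1))

pc_full <- prcomp(dat)              # every component
pc_tol  <- prcomp(dat, tol = 0.95)  # drop components with sdev <= 0.95 * first sdev

dim(pc_full$rotation)   # rows = variables, columns = components
dim(pc_tol$rotation)    # same number of rows, fewer columns
```

If the row count stays at 8 while the column count drops, every variable is still in the analysis and only the number of retained components has changed.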






Re: [R] prcomp - principal components in R

2009-11-09 Thread markleeds

   Hi: I'm not familiar with prcomp, but with the principal components function
   in Bill Revelle's psych package, one can specify the number of components
   one wants to use to build the closest covariance matrix. I don't know
   what tol is doing in your example, but it's not doing that.

   mark


Re: [R] prcomp - principal components in R

2009-11-09 Thread Tony Plate

The output of summary() on a prcomp object displays the cumulative amount of 
variance explained relative to the total variance explained by the principal 
components PRESENT in the object.  So it is always guaranteed to be 100% for 
the last principal component present.  You can see this from the code in 
summary.prcomp() (view it with getAnywhere(summary.prcomp)).
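
The arithmetic behind this can be sketched with the numbers from the original post. The block below is only an illustration, not the code of summary.prcomp(): it takes the eight rounded standard deviations printed in the first summary, keeps the first three, and normalises them two ways. The names `rel_to_kept` and `rel_to_total` are made up for this sketch.

```r
# Standard deviations of the 8 components, copied (already rounded) from the original post.
sdev <- c(1.381, 1.247, 1.211, 0.994, 0.927, 0.764, 0.6708, 0.4366)
vars <- sdev^2

kept <- vars[1:3]   # the three components that survived tol = .75

# Normalised by the variance of the retained components only:
# the last entry is 1 by construction, however many components were dropped.
rel_to_kept <- cumsum(kept) / sum(kept)

# Normalised by the total variance of all 8 components.
rel_to_total <- cumsum(kept) / sum(vars)

round(rel_to_kept, 3)
round(rel_to_total, 3)
```

Up to rounding of the copied inputs, the first vector reproduces the cumulative proportions of the three-component summary, and the second reproduces the first three cumulative proportions of the eight-component summary (about 0.238, 0.433, 0.616).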

Here's how to get the output you want (the last line in the transcript below):


set.seed(1)
# (x: a numeric matrix with 5 columns; its definition is not shown)
summary(pc1 <- prcomp(x))

Importance of components:
                         PC1   PC2   PC3   PC4   PC5
Standard deviation     1.175 1.058 0.976 0.916 0.850
Proportion of Variance 0.275 0.223 0.190 0.167 0.144
Cumulative Proportion  0.275 0.498 0.688 0.856 1.000

summary(pc2 <- prcomp(x, tol=0.8))

Importance of components:
                        PC1   PC2   PC3
Standard deviation     1.17 1.058 0.976
Proportion of Variance 0.40 0.324 0.276
Cumulative Proportion  0.40 0.724 1.000

pc2$sdev

[1] 1.1749061 1.0581362 0.9759016

pc1$sdev

[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122

svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1)

[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122

cumsum(pc1$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1))^2)

[1] 0.2752317 0.4984734 0.6883643 0.8558386 1.0000000


# output in terms of the cumulative % of the total variance
cumsum(pc2$sdev^2) / sum((svd(scale(x, center=TRUE, scale=FALSE))$d / sqrt(nrow(x)-1))^2)

[1] 0.2752317 0.4984734 0.6883643




It's probably better to have prcomp compute all the components in the first 
place, because the SVD is the bulk of the computation anyway (so doing it a 
second time will be slow for large matrices).  Then just look at the most 
important principal components.  However, there may be a shortcut for computing 
the values of D in the SVD of a matrix; you could look for that if you have 
demanding computations (e.g., the square roots of the eigenvalues of the 
covariance matrix of centered x: sqrt(eigen(var(scale(x, center=TRUE, 
scale=FALSE)), only.values=TRUE)$values)).
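
A minimal sketch of that workflow, assuming made-up data (the matrix `x` here is hypothetical, as are the names `cum_total`, `k` and `ev`): fit all components once, read off the cumulative share of total variance for the first k, and compare the eigenvalue route with the standard deviations prcomp reports.

```r
set.seed(1)
x <- matrix(rnorm(50 * 5), ncol = 5)   # hypothetical data: 50 rows, 5 variables

pc <- prcomp(x)                        # all 5 components, a single decomposition

# Cumulative share of the TOTAL variance, component by component.
cum_total <- cumsum(pc$sdev^2) / sum(pc$sdev^2)

k <- 3
cum_total[seq_len(k)]                  # share captured by the first k components

# Eigenvalue route: var() centers the columns itself, so no scale() call is needed.
ev <- eigen(var(x), symmetric = TRUE, only.values = TRUE)$values
all.equal(sqrt(ev), pc$sdev)
```

Because nothing is dropped from `pc`, the denominator is the full variance, so choosing a different k only means indexing `cum_total` again rather than refitting.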

-- Tony Plate

