Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Uwe Ligges



masterinex wrote:
Hi Hadley,


I really appreciate the suggestions you gave. They were helpful, but I still
didn't quite get it all. I really want to do a good job, so any
comments would be helpful. Please understand me.



Well, we try to understand you, but we do not quite get it either. I think you
really need to consult a statistics textbook on PCA if my answer was not
sufficient. Given your questions, I doubt you understand what PCA does
and how it works. It does not predict anything.


Uwe Ligges




hadley wrote:

You've asked the same question on stackoverflow.com and received the
same answer.  This is rude because it duplicates effort.  If you
urgently need a response to a question, perhaps you should consider
paying for it.

Hadley

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Uwe Ligges



masterinex wrote:

This is what my data matrix looks like. These are just the first 10
observations, but the pattern is similar for the other observations.



 1  12.3 154.25 67.75 36.2  93.1  85.2  94.5 59.0 37.3 21.9 32.0 27.4 17.1
 2   6.1 173.25 72.25 38.5  93.6  83.0  98.7 58.7 37.3 23.4 30.5 28.9 18.2
 3  25.3 154.00 66.25 34.0  95.8  87.9  99.2 59.6 38.9 24.0 28.8 25.2 16.6
 4  10.4 184.75 72.25 37.4 101.8  86.4 101.2 60.1 37.3 22.8 32.4 29.4 18.2
 5  28.7 184.25 71.25 34.4  97.3 100.0 101.9 63.2 42.2 24.0 32.2 27.7 17.7
 6  20.9 210.25 74.75 39.0 104.5  94.4 107.8 66.0 42.0 25.6 35.7 30.6 18.8
 7  19.2 181.00 69.75 36.4 105.1  90.7 100.3 58.4 38.3 22.9 31.9 27.8 17.7
 8  12.4 176.00 72.50 37.8  99.6  88.5  97.1 60.0 39.4 23.2 30.5 29.0 18.8
 9   4.1 191.00 74.00 38.1 100.9  82.5  99.9 62.9 38.3 23.8 35.9 31.1 18.2
10  11.7 198.25 73.50 42.1  99.6  88.6 104.1 63.1 41.7 25.0 35.6 30.0 19.2



and after standardizing it:

 1  -0.831228836 -0.898881671 -0.98330178 -0.77420686 -0.952294055 -0.712961621 -0.814552365 -0.0625400993 -0.53901713 -0.825399059 -0.08244945
 2  -1.588060506 -0.185928394  0.75868364  0.23560461 -0.889886435 -0.931523054 -0.155497233 -0.1252522485 -0.53901713  0.295114747 -0.59529632
 3   0.755676279 -0.908262635 -1.56396359 -1.74011349 -0.615292906 -0.444727135 -0.077038289  0.0628841989  0.15515266  0.743320270 -1.17652277
 4  -1.063161122  0.245595958  0.75868364 -0.24734870  0.133598535 -0.593746294  0.236797489  0.1674044475 -0.53901713 -0.153090775  0.05430971
 5   1.170713001  0.226834030  0.37157577 -1.56449410 -0.428070046  0.757360745  0.346640011  0.8154299886  1.58687786  0.743320270 -0.01406987
 6   0.218569932  1.202454304  1.72645331  0.45512884  0.470599683  0.201022552  1.27244      1.4007433805  1.50010664  1.938534997  1.18257281
 7   0.011051571  0.104881496 -0.20908604 -0.68639717  0.545488828 -0.166558039  0.095571389 -0.1879643976 -0.10516101 -0.078389855 -0.11663925
 8  -0.819021874 -0.082737788  0.85546060 -0.07172932 -0.140994994 -0.385119472 -0.406565855  0.1465003978  0.37208072  0.145712907 -0.59529632
 9  -1.832199755  0.480120063  1.43612241  0.05998522  0.021264819 -0.981196107  0.032804234  0.7527178395 -0.10516101  0.593918429  1.25095239
10  -0.904470611  0.752168024  1.24256848  1.81617909 -0.140994994 -0.375184861  0.691859366  0.7945259389  1.36994980  1.490329474  1.14838302



This is the result of applying PCA to the data matrix:

Standard deviations:
 [1] 30.6645414  7.5513852  3.6927427  2.8703435  2.5363007  1.9136933  1.5624131  1.3689630  1.2976189
[10]  1.1633458  1.1118231  0.7847148  0.4802303

Rotation:
              PC1         PC2         PC3          PC4          PC5          PC6          PC7         PC8
var1   0.18110712 -0.74864138 -0.46070566 -0.365658769  0.192810075 -0.132529979  0.023764851  0.03674873
var2   0.86458284  0.34243386 -0.05766909 -0.235504989 -0.046075934  0.001493006 -0.024535011  0.13439659
var3   0.03765598  0.20097537 -0.15709612 -0.343218776 -0.295201121 -0.073295697 -0.086930370 -0.54389141
var4   0.05965733  0.01737951  0.09854179 -0.030801791  0.125735684  0.341795876 -0.001735808  0.37152696
var5   0.23845698 -0.20616399  0.68948870  0.025904812  0.391188182 -0.428933369 -0.101780281 -0.16965893
var6   0.29928369 -0.47394636  0.24791449  0.341235161 -0.511378719  0.447071255 -0.077534385 -0.13198544
var7   0.19503685  0.01385823 -0.24126047  0.531403827 -0.127426510 -0.410568454  0.608163973 -0.01265457
var8   0.13261863  0.06839078 -0.37740589  0.535332339  0.366103479  0.032376851 -0.574484605 -0.05645694
var9   0.06246705  0.04407384 -0.09545362  0.037993146 -0.036651080  0.012347288 -0.192976142 -0.13027876
var10  0.03027791  0.05533988 -0.03749859 -0.009257423  0.011026593 -0.010770032 -0.104041067  0.12125263
var11  0.07435322  0.04334969 -0.02666944  0.032036374  0.464035624  0.454970952  0.347507539 -0.60527541
var12  0.04328710  0.04731771  0.00360668 -0.054200633  0.275901346  0.297800123  0.324323749  0.30487145
var13  0.02095652  0.02146485  0.03598618 -0.022510780  0.005192075  0.103988977  0.031541374  0.07877455

               PC9         PC10         PC11        PC12         PC13
var1  -0.005328345  0.030549780 -0.049283616 -0.02211988  0.015660892
var2   0.170766596 -0.144031738  0.028862963  0.06984674  0.006293703
var3  -0.282549313  0.548650592  0.131284937 -0.14740722 -0.002384605
var4   0.024070488  0.614154008 -0.551480394 -0.03446124 -0.178123011
var5  -0.157551008  0.147685248  0.008044148 -0.04068258  0.007778992
var6  -0.058675551  0.006344813  0.130814072 -0.04088919 -0.028655330
var7  -0.099243751  0.171852216 -0.149231752 -0.06690208 -0.014693444
var8   0.006629025  0.199158097  0.187226774 -0.02511968  0.070896819
var9  -0.658214712 -0.320120384 -0.53990      0.37630539 -0.023642902
var10 -0.259704149 -0.273030750 -0.074006053 -0.83676032

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Michael Kubovy

On Nov 22, 2009, at 10:22 AM, Uwe Ligges wrote:

 This sounds a bit like homework. If that is the case, please ask your teacher 
 rather than this list.
 Anyway, it does not make sense to predict weight using a linear combination 
 (principal component) that contains weight, does it?
 
 Uwe Ligges

It's likely to have been homework: a quick search on masterinex xevilgang79 
reveals which university this undergraduate student is at. It also produces a 
phone number, which can be used to look up an address, and a cell phone number.

MK


Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread masterinex



Actually, it's for an assignment, Michael.
All I'm looking for is some help and suggestions, so please don't get me wrong.
I do believe that this is a helpful community.


 



-- 
View this message in context: 
http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26490273.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread Uwe Ligges

masterinex wrote:



Hi guys,


I'm trying to do principal component analysis in R. There are two ways of doing
it, I believe.
One is doing principal component analysis right away; the other is
standardizing the matrix first using s = scale(m) and then applying principal
component analysis.
How do I tell which result is better? What values in particular should I
look at? I already managed to find the eigenvalues and eigenvectors, and the
proportion of variance for each eigenvector, using both methods.



Generally, it is better to standardize. But in some cases, e.g. when all
variables are measured in the same units and their variances also indicate
their importance, it might make sense not to do so.
You should think about the analysis: you cannot know which result is
`better' unless you have an interpretation.
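
To make the trade-off concrete, here is a minimal sketch in R. It is not from
the thread: it uses the built-in USArrests data as a stand-in for the poster's
matrix m, and the object names (pca_raw, pca_scaled and so on) are made up for
illustration. It compares PCA on the raw columns (covariance matrix) with PCA
on standardized columns (correlation matrix), which is what s = scale(m)
followed by PCA amounts to.

```r
# Assumption: USArrests stands in for the poster's matrix `m`.
# Its columns have very different variances, which is the interesting case.
m <- as.matrix(USArrests)

pca_raw    <- prcomp(m, center = TRUE, scale. = FALSE)  # covariance-based PCA
pca_scaled <- prcomp(m, center = TRUE, scale. = TRUE)   # correlation-based PCA

# Proportion of total variance carried by each component
prop_raw    <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_scaled <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)

# Column variances show which variable can dominate the unscaled analysis
print(round(apply(m, 2, var), 1))
print(round(rbind(raw = prop_raw, scaled = prop_scaled), 3))

# Variable with the largest absolute loading on the unscaled first component
dominant_raw <- names(which.max(abs(pca_raw$rotation[, 1])))
print(dominant_raw)
```

In the unscaled run the first component mostly tracks the column with the
largest variance, so a larger first proportion of variance there can reflect
measurement units rather than structure. That is common when the variances
differ a lot, but it is not guaranteed for every data set.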





I noticed that the proportion of variance for the first principal component
without standardizing had a larger value. Is there a meaning to it? Isn't this
always the case?
Lastly, if I am supposed to predict a variable, e.g. weight, should I drop
that variable from my data matrix when I do principal component
analysis?



This sounds a bit like homework. If that is the case, please ask your 
teacher rather than this list.
Anyway, it does not make sense to predict weight using a linear 
combination (principal component) that contains weight, does it?
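
A sketch of what this means in practice, assuming the goal is principal
component regression: the response is kept out of the PCA, and the component
scores of the remaining variables become the regressors. The mtcars data, the
choice of mpg as the response, and k = 2 are assumptions made only for
illustration; none of them come from the thread.

```r
# Assumption: mtcars with mpg as the response stands in for the poster's data.
response   <- mtcars$mpg
predictors <- mtcars[, setdiff(names(mtcars), "mpg")]

# PCA on the predictors only; the response is left out of the data matrix.
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Regress the response on the scores of the first k components.
# k = 2 is an arbitrary choice for this sketch.
k <- 2
scores <- as.data.frame(pca$x[, 1:k])
scores$mpg <- response
fit <- lm(mpg ~ PC1 + PC2, data = scores)

print(coef(fit))
print(summary(fit)$r.squared)
```

PCA itself never sees the response here; any prediction comes from the
regression step that follows it.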


Uwe Ligges



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex

So in which cases is it better to standardize the data matrix first?
Also, is PCA generally used to predict the response variable, and should I
keep that variable in my data matrix?


 
 

-- 
View this message in context: 
http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26466400.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread hadley wickham
You've asked the same question on stackoverflow.com and received the
same answer.  This is rude because it duplicates effort.  If you
urgently need a response to a question, perhaps you should consider
paying for it.

Hadley





-- 
http://had.co.nz/



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex

Hi Hadley,

I really appreciate the suggestions you gave. They were helpful, but I still
didn't quite get it all. I really want to do a good job, so any
comments would be helpful. Please understand me.





hadley wrote:
 
 You've asked the same question on stackoverflow.com and received the
 same answer.  This is rude because it duplicates effort.  If you
 urgently need a response to a question, perhaps you should consider
 paying for it.
 
 Hadley
 
 
 

-- 
View this message in context: 
http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26471673.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex

This is what my data matrix looks like. These are just the first 10
observations, but the pattern is similar for the other observations.


 1  12.3 154.25 67.75 36.2  93.1  85.2  94.5 59.0 37.3 21.9 32.0 27.4 17.1
 2   6.1 173.25 72.25 38.5  93.6  83.0  98.7 58.7 37.3 23.4 30.5 28.9 18.2
 3  25.3 154.00 66.25 34.0  95.8  87.9  99.2 59.6 38.9 24.0 28.8 25.2 16.6
 4  10.4 184.75 72.25 37.4 101.8  86.4 101.2 60.1 37.3 22.8 32.4 29.4 18.2
 5  28.7 184.25 71.25 34.4  97.3 100.0 101.9 63.2 42.2 24.0 32.2 27.7 17.7
 6  20.9 210.25 74.75 39.0 104.5  94.4 107.8 66.0 42.0 25.6 35.7 30.6 18.8
 7  19.2 181.00 69.75 36.4 105.1  90.7 100.3 58.4 38.3 22.9 31.9 27.8 17.7
 8  12.4 176.00 72.50 37.8  99.6  88.5  97.1 60.0 39.4 23.2 30.5 29.0 18.8
 9   4.1 191.00 74.00 38.1 100.9  82.5  99.9 62.9 38.3 23.8 35.9 31.1 18.2
10  11.7 198.25 73.50 42.1  99.6  88.6 104.1 63.1 41.7 25.0 35.6 30.0 19.2


and after standardizing it:

 1  -0.831228836 -0.898881671 -0.98330178 -0.77420686 -0.952294055 -0.712961621 -0.814552365 -0.0625400993 -0.53901713 -0.825399059 -0.08244945
 2  -1.588060506 -0.185928394  0.75868364  0.23560461 -0.889886435 -0.931523054 -0.155497233 -0.1252522485 -0.53901713  0.295114747 -0.59529632
 3   0.755676279 -0.908262635 -1.56396359 -1.74011349 -0.615292906 -0.444727135 -0.077038289  0.0628841989  0.15515266  0.743320270 -1.17652277
 4  -1.063161122  0.245595958  0.75868364 -0.24734870  0.133598535 -0.593746294  0.236797489  0.1674044475 -0.53901713 -0.153090775  0.05430971
 5   1.170713001  0.226834030  0.37157577 -1.56449410 -0.428070046  0.757360745  0.346640011  0.8154299886  1.58687786  0.743320270 -0.01406987
 6   0.218569932  1.202454304  1.72645331  0.45512884  0.470599683  0.201022552  1.27244      1.4007433805  1.50010664  1.938534997  1.18257281
 7   0.011051571  0.104881496 -0.20908604 -0.68639717  0.545488828 -0.166558039  0.095571389 -0.1879643976 -0.10516101 -0.078389855 -0.11663925
 8  -0.819021874 -0.082737788  0.85546060 -0.07172932 -0.140994994 -0.385119472 -0.406565855  0.1465003978  0.37208072  0.145712907 -0.59529632
 9  -1.832199755  0.480120063  1.43612241  0.05998522  0.021264819 -0.981196107  0.032804234  0.7527178395 -0.10516101  0.593918429  1.25095239
10  -0.904470611  0.752168024  1.24256848  1.81617909 -0.140994994 -0.375184861  0.691859366  0.7945259389  1.36994980  1.490329474  1.14838302



This is the result of applying PCA to the data matrix:

Standard deviations:
 [1] 30.6645414  7.5513852  3.6927427  2.8703435  2.5363007  1.9136933  1.5624131  1.3689630  1.2976189
[10]  1.1633458  1.1118231  0.7847148  0.4802303

Rotation:
              PC1         PC2         PC3          PC4          PC5          PC6          PC7         PC8
var1   0.18110712 -0.74864138 -0.46070566 -0.365658769  0.192810075 -0.132529979  0.023764851  0.03674873
var2   0.86458284  0.34243386 -0.05766909 -0.235504989 -0.046075934  0.001493006 -0.024535011  0.13439659
var3   0.03765598  0.20097537 -0.15709612 -0.343218776 -0.295201121 -0.073295697 -0.086930370 -0.54389141
var4   0.05965733  0.01737951  0.09854179 -0.030801791  0.125735684  0.341795876 -0.001735808  0.37152696
var5   0.23845698 -0.20616399  0.68948870  0.025904812  0.391188182 -0.428933369 -0.101780281 -0.16965893
var6   0.29928369 -0.47394636  0.24791449  0.341235161 -0.511378719  0.447071255 -0.077534385 -0.13198544
var7   0.19503685  0.01385823 -0.24126047  0.531403827 -0.127426510 -0.410568454  0.608163973 -0.01265457
var8   0.13261863  0.06839078 -0.37740589  0.535332339  0.366103479  0.032376851 -0.574484605 -0.05645694
var9   0.06246705  0.04407384 -0.09545362  0.037993146 -0.036651080  0.012347288 -0.192976142 -0.13027876
var10  0.03027791  0.05533988 -0.03749859 -0.009257423  0.011026593 -0.010770032 -0.104041067  0.12125263
var11  0.07435322  0.04334969 -0.02666944  0.032036374  0.464035624  0.454970952  0.347507539 -0.60527541
var12  0.04328710  0.04731771  0.00360668 -0.054200633  0.275901346  0.297800123  0.324323749  0.30487145
var13  0.02095652  0.02146485  0.03598618 -0.022510780  0.005192075  0.103988977  0.031541374  0.07877455

               PC9         PC10         PC11        PC12         PC13
var1  -0.005328345  0.030549780 -0.049283616 -0.02211988  0.015660892
var2   0.170766596 -0.144031738  0.028862963  0.06984674  0.006293703
var3  -0.282549313  0.548650592  0.131284937 -0.14740722 -0.002384605
var4   0.024070488  0.614154008 -0.551480394 -0.03446124 -0.178123011
var5  -0.157551008  0.147685248  0.008044148 -0.04068258  0.007778992
var6  -0.058675551  0.006344813  0.130814072 -0.04088919 -0.028655330
var7  -0.099243751  0.171852216 -0.149231752 -0.06690208 -0.014693444
var8   0.006629025  0.199158097  0.187226774 -0.02511968  0.070896819
var9  -0.658214712 -0.320120384 -0.53990      0.37630539 -0.023642902
var10 -0.259704149 -0.273030750 -0.074006053 -0.83676032 -0.348034215
var11   
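
The standard deviations printed above can be turned into the proportion of
variance per component, which is the quantity asked about in this thread. The
following is only a sketch: the numbers are copied from the output in this
message, and the variable names (sdev, eigenvalues, prop_var, cum_var) are
chosen for illustration.

```r
# Standard deviations copied from the PCA output printed in this message
sdev <- c(30.6645414, 7.5513852, 3.6927427, 2.8703435, 2.5363007, 1.9136933,
          1.5624131, 1.3689630, 1.2976189, 1.1633458, 1.1118231, 0.7847148,
          0.4802303)

eigenvalues <- sdev^2                          # variance of each component
prop_var    <- eigenvalues / sum(eigenvalues)  # proportion of total variance
cum_var     <- cumsum(prop_var)                # cumulative proportion

print(round(prop_var, 4))
print(round(cum_var, 4))
```

With these numbers the first component carries roughly 90% of the total
variance, and its largest loading in the rotation output (0.86) sits on var2,
which appears to be the column with the widest spread in the raw data. That
pattern is what one would expect from an unstandardized analysis in which one
variable has a much larger variance than the rest.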

[R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-21 Thread masterinex



Hi guys,

I'm trying to do principal component analysis in R. There are two ways of doing
it, I believe.
One is doing principal component analysis right away; the other is
standardizing the matrix first using s = scale(m) and then applying principal
component analysis.
How do I tell which result is better? What values in particular should I
look at? I already managed to find the eigenvalues and eigenvectors, and the
proportion of variance for each eigenvector, using both methods.



I noticed that the proportion of variance for the first principal component
without standardizing had a larger value. Is there a meaning to it? Isn't this
always the case?
Lastly, if I am supposed to predict a variable, e.g. weight, should I drop
that variable from my data matrix when I do principal component
analysis?
-- 
View this message in context: 
http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26462070.html
Sent from the R help mailing list archive at Nabble.com.
