Dear R-list users,

I'm new to principal components and factor analysis. I thought this method could be very useful to me as a structure detection method, for finding relationships between several variables (I know relationships exist, but I don't know between which variables exactly, or of what kind).
Now, I'm experimenting with the function prcomp from the mva package. In my source code below, I of course expect one of the columns to be useless (I provided one duplicate column). I know both avg.EDGE.IN.SHORTEST.PATH.COUNT and avg.DEGREE have a relation with sum.delivery.penalty. E.g. the bigger avg.DEGREE, the smaller sum.delivery.penalty.

My question is about the output of prcomp. I understand that the cumulative proportion of variance reaches 100% at the third principal component, just as I expected. I also see that the components are sorted: the one that explains the most variance is listed first.

But how can I figure out what these principal components are exactly? Take PC1, for example. What is the exact meaning of it? I assumed it is some linear combination of the variables I provided in the call to prcomp, but how can I obtain this linear combination?

PS: I used http://www.statsoftinc.com/textbook/stfacan.html as a reference, and help(prcomp) / help(princomp) of course.

Thanks for any help!
Jonne.

# Read a table
dir = "..."
file = "..."   # huge file, 12 Mb
stats = read.table(paste(dir, file, sep=""), header=TRUE)

# Select several columns (one column is duplicated on purpose)
data = subset(stats, select = c(sum.delivery.penalty,
                                avg.EDGE.IN.SHORTEST.PATH.COUNT,
                                avg.EDGE.IN.SHORTEST.PATH.COUNT,
                                avg.DEGREE))

require(mva)
pc2 = prcomp(data, retx = TRUE, center = TRUE, scale. = TRUE, tol = NULL)
pc2
summary(pc2)

--- gives the following output

> pc2
Standard deviations:
[1] 1.424074e+00 1.000000e-00 9.859080e-01 5.711682e-17

Rotation:
                                            PC1           PC2           PC3
sum.delivery.penalty              -1.627945e-01 -1.539887e-12  9.866600e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT   -6.976740e-01  2.413866e-16 -1.151131e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT.1 -6.976740e-01  2.013413e-17 -1.151131e-01
avg.DEGREE                         2.505027e-13 -1.000000e+00 -1.519375e-12
                                            PC4
sum.delivery.penalty              -1.118300e-17
avg.EDGE.IN.SHORTEST.PATH.COUNT    7.071068e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT.1 -7.071068e-01
avg.DEGREE                        -3.253830e-18

> summary(pc2)
Importance of components:
                         PC1   PC2   PC3      PC4
Standard deviation     1.424 1.000 0.986 5.71e-17
Proportion of Variance 0.507 0.250 0.243 0.00e+00
Cumulative Proportion  0.507 0.757 1.000 1.00e+00
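
To make the "linear combination" part of my question concrete, here is a minimal sketch of what I assume is going on. It uses made-up toy data, not my real file, and the names toy, a, b, c, z and scores are purely hypothetical. The assumption is that each column of the Rotation matrix holds the coefficients applied to the centred and scaled variables; if that is right, the scores computed by hand should agree with the x element that prcomp returns.

```r
# Hypothetical toy data (NOT the original 12 Mb file): three made-up columns,
# where column c is deliberately related to column a.
set.seed(1)
toy <- data.frame(a = rnorm(50), b = rnorm(50))
toy$c <- toy$a + 0.5 * rnorm(50)

# Same call style as in the post; in current R versions prcomp lives in the
# stats package, so no require() is needed for this sketch.
pc <- prcomp(toy, retx = TRUE, center = TRUE, scale. = TRUE)

# Assumption: each column of pc$rotation holds the coefficients of one
# component, e.g. the coefficients of PC1 for a, b and c:
print(pc$rotation[, "PC1"])

# Centre and scale the data the same way prcomp did, then apply the coefficients.
z <- scale(toy, center = pc$center, scale = pc$scale)
scores <- z %*% pc$rotation

# Compare the hand-computed scores with the ones prcomp returned in pc$x.
print(all.equal(unname(scores), unname(pc$x)))
```

If the comparison at the end reports agreement, then PC1 for a given row would simply be the sum of that row's centred and scaled values, each multiplied by the matching entry in the PC1 column of Rotation.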
