[R] Question about PCA with prcomp
Hello All,

The basic premise of what I want to do is the following: I have 20 entities, each with ~500 measurements, so I have a matrix of 20 rows by ~500 columns. The 20 entities fall into two classes: good and bad. Eventually I would like to derive a model that can classify new entities as being in good territory or bad territory based on my existing data set.

I know that not all ~500 measurements are meaningful, so I thought the best place to begin would be a PCA to reduce the amount of data I have to work with. I did this using the prcomp function and found that nearly 90% of the variance in the data is explained by PC1 and PC2. So far, so good.

I would now like to find out which of the original ~500 measurements contribute to PC1 and PC2, and by how much. Any tips would be greatly appreciated! And apologies in advance if this turns out to be an idiotic question.

james

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
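One way to see which measurements contribute to PC1 and PC2 is to inspect the `rotation` component that prcomp returns. The sketch below is only an illustration: the matrix `x`, the column names `m1`, `m2`, ... and the choice of `scale. = TRUE` are assumptions standing in for the poster's real data, which is not shown.

```r
# Hypothetical stand-in for the 20 x ~500 matrix (simulated, not the poster's data)
set.seed(1)
x <- matrix(rnorm(20 * 500), nrow = 20, ncol = 500)
colnames(x) <- paste0("m", seq_len(ncol(x)))  # made-up measurement names

# Scaling is an assumption; drop scale. = TRUE if all measurements share units
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Loadings: one row per original measurement, one column per component
loadings <- pca$rotation[, 1:2]

# Each column of the rotation matrix has unit length, so the squared
# loadings sum to 1 and can be read as each measurement's share of that PC
contrib <- loadings^2

# The ten measurements with the largest absolute loading on PC1
top_pc1 <- head(sort(abs(loadings[, "PC1"]), decreasing = TRUE), 10)
print(top_pc1)
```

The sign of a loading only gives the direction along the component, so ranking by absolute value (or by squared loading) is usually what is wanted here. Note also that PCA does not use the good/bad labels, so the measurements that dominate PC1 and PC2 are not necessarily the ones that separate the two classes.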
[R] Question on Differentiating Two Populations in R
Hello All,

Forgive me if this is a blatantly newbie question or not germane to the list, but I was wondering whether my current approach to my problem is the best way to do it in R.

I have two experimental datasets (positive and negative) of differing lengths, and a large number of ways of expressing the data numerically, using various scales to represent each data point. I am looking for a scale that will allow me to differentiate between the positive and negative populations. Each dataset is simply a list of numbers: 43 numbers in the positive case and 9 in the negative (small sets, I know, but it's all the data I currently have), and I have hundreds of scales.

I assign each dataset to a variable using scan() (each is in a separate file). My initial comparison of the two datasets is simply a boxplot, with the hope that the two do not overlap too much...

Is this the way you would approach this problem? Is there an easier way of doing this in R? Any and all help is greatly appreciated!

james
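A possible way to screen many scales at once, rather than eyeballing hundreds of boxplots, is to compute a two-sample test statistic per scale and rank the scales by it. The sketch below is a minimal illustration under stated assumptions: the matrices `pos` and `neg`, the scale names, and the use of the Wilcoxon rank-sum test are choices made here for the example, and the data are simulated rather than read with scan().

```r
set.seed(42)
n_scales <- 5  # the poster has hundreds; five keeps the sketch short

# Hypothetical data: rows are data points, columns are scales
pos <- matrix(rnorm(43 * n_scales), nrow = 43)
neg <- matrix(rnorm(9 * n_scales), nrow = 9)
neg[, 3] <- neg[, 3] + 3  # shift one scale on purpose so it separates the groups
colnames(pos) <- colnames(neg) <- paste0("scale", seq_len(n_scales))

# Wilcoxon rank-sum p-value for each scale (no normality assumption)
p_values <- sapply(seq_len(n_scales), function(j) {
  wilcox.test(pos[, j], neg[, j])$p.value
})
names(p_values) <- colnames(pos)

# With many scales, adjust for multiple comparisons before reading p-values
p_adjusted <- p.adjust(p_values, method = "holm")

# Scales ordered from most to least separating
ranked <- sort(p_adjusted)
print(ranked)

# Boxplot of the top-ranked scale as a visual check
best <- names(ranked)[1]
boxplot(list(positive = pos[, best], negative = neg[, best]), main = best)
```

With only 9 negative values, any ranking like this should be treated as a screening step; the boxplot of the few top-ranked scales is still worth looking at before trusting a single number.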