[R] PCA on high dimensional data

2011-12-10 Thread mail me
Hi: I have a large dataset mydata, of 1000 rows and 1000 columns. The rows have gene names and columns have condition names (cond1, cond2, cond3, etc.). mydata <- read.table(file="c:/file1.mtx", header=TRUE, sep=) I applied PCA as follows: data_after_pca <- prcomp(mydata, retx=TRUE, center=TRUE,

Re: [R] PCA on high dimensional data

2011-12-10 Thread Stephen Sefick
By doing PCA you are trying to find a lower dimensional representation of the major variation structure in your data. You get PC* to represent the new data. If you want to know what loads on the axes then you need to look at the loadings. These are the link between the original data and the
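
The relationship between loadings, scores, and the original data can be sketched in a few lines of R. This is only an illustration under stated assumptions: the small random matrix stands in for the poster's mydata, the dimensions and names (gene1, cond1, ...) are made up, and scale. = FALSE is a guess because the original prcomp call is cut off.

```r
# Synthetic stand-in for mydata (assumption): 100 genes (rows) x 5 conditions (columns).
set.seed(1)
mydata <- matrix(rnorm(100 * 5), nrow = 100,
                 dimnames = list(paste0("gene", 1:100), paste0("cond", 1:5)))

# scale. = FALSE is an assumption; the original call is truncated.
pca <- prcomp(mydata, retx = TRUE, center = TRUE, scale. = FALSE)

# Loadings: one row per original condition, one column per component.
loadings <- pca$rotation
print(round(loadings, 3))

# Scores: the data expressed in PC coordinates, one row per gene.
scores <- pca$x

# The loadings link the two: scores = centered data %*% loadings ...
centered <- sweep(mydata, 2, pca$center)
projected <- centered %*% loadings

# ... and with all components kept, scores and loadings give back the original data.
reconstructed <- sweep(scores %*% t(loadings), 2, pca$center, "+")
```

Because the rows of pca$rotation carry the original column names, the condition labels are not lost after PCA; they live in the loadings rather than in the scores.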

Re: [R] PCA on high dimensional data

2011-12-10 Thread Mark Difford
On Dec 10, 2011 at 5:56pm deb wrote: My question is, is there any way I can map the PC1, PC2, PC3 to the original conditions, so that I can still have a reference to original condition labels after PCA? deb, To add to what Stephen has said. Best to read up on principal component
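
One possible way to relate each component back to the condition labels is to look up, for every PC, the condition with the largest absolute loading. The sketch below uses made-up data in which cond1 is deliberately given much more variance than the others, so the result is easy to check. Each PC is a weighted combination of all conditions, so this is a summary and not a one-to-one mapping.

```r
# Synthetic data (assumption): 200 rows, 4 conditions; cond1 is inflated on purpose.
set.seed(2)
mydata <- matrix(rnorm(200 * 4), nrow = 200,
                 dimnames = list(NULL, paste0("cond", 1:4)))
mydata[, "cond1"] <- mydata[, "cond1"] * 10

pca <- prcomp(mydata, center = TRUE)

# The original condition labels survive as row names of the rotation matrix.
print(rownames(pca$rotation))

# For each PC, find the condition with the largest absolute loading.
top_idx <- apply(abs(pca$rotation), 2, which.max)
top_condition <- setNames(rownames(pca$rotation)[top_idx], colnames(pca$rotation))
print(top_condition)

# Share of total variance carried by each component.
var_share <- summary(pca)$importance["Proportion of Variance", ]
print(var_share)
```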

Re: [R] PCA on high dimensional data

2011-12-10 Thread Bert Gunter
... and adding to what has already been said, PCA can be distorted by non-ellipsoidal distributions or small numbers of unusual values. Careful (chiefly graphical) examination of results is therefore essential, and usually fairly easy to do. There are robust/resistant versions of PCA in R, but
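
A small sketch of the distortion described above, assuming two strongly correlated made-up variables (a, b) and three planted outliers. The robust variant shown here, princomp() fed with a covariance estimate from MASS::cov.rob(), is just one option and not necessarily the one the message goes on to recommend.

```r
library(MASS)  # provides cov.rob()

set.seed(3)
n <- 200
x <- rnorm(n)
# Clean data: a and b are strongly positively correlated.
clean <- cbind(a = x, b = x + rnorm(n, sd = 0.3))

# Plant three unusual values lying against the main trend.
contaminated <- clean
contaminated[1:3, ] <- cbind(a = c(30, 35, 40), b = c(-30, -35, -40))

pc_clean <- prcomp(clean)
pc_cont  <- prcomp(contaminated)

# Classical PC1: same-sign loadings on clean data,
# opposite-sign loadings once the outliers are present.
print(pc_clean$rotation[, "PC1"])
print(pc_cont$rotation[, "PC1"])

# Robust alternative: PCA on a resistant covariance estimate.
pc_rob <- princomp(contaminated, covmat = cov.rob(contaminated))
rob_pc1 <- unclass(pc_rob$loadings)[, 1]
print(rob_pc1)

# Graphical check: the planted outliers stand out in the score plot.
if (interactive()) plot(pc_cont$x[, 1:2], main = "Scores, contaminated data")
```

In this constructed case three points out of 200 are enough to turn the first classical component away from the bulk of the data, while the robust fit keeps the direction seen in the clean data.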