[R] repeating a loop
Hi,

I have implemented boxplots in my script:

  BoxplotsCheck <- readline(prompt = "Would you like to create boxplots for any Feature? (y/n):")
  if (BoxplotsCheck == "y") {
    BoxplotsFeature <- readline(prompt = "Which Feature would you like to create a Boxplot for?:")
    BoxplotsFeature <- as.numeric(BoxplotsFeature)
    BoxplotsData <- as.numeric(which(PCIList == BoxplotsFeature))
    BoxplotsData <- TotalIntensityList[BoxplotsData, ]
    BoxplotsHeading <- paste("Tukey boxplot (including outliers) for PCI ", BoxplotsFeature, sep = "")
    bplot(as.numeric(BoxplotsData), GroupingList, style = "tukey", outlier = TRUE, col = "red", main = BoxplotsHeading, xlab = "Groups", ylab = "Normalised Intensity", plot = TRUE)
    BoxplotsFilename <- paste(BoxplotsFeature, "_Boxplot", sep = "")
    savePlot(filename = BoxplotsFilename, type = "jpeg", device = dev.cur(), restoreConsole = TRUE)
    RepeatPlot <- readline(prompt = "Would you like to create another boxplot for any Feature? (y/n):")
  }

If the user inputs y for BoxplotsCheck, a boxplot is created and saved based on the user's choice. I want to include the option to do another boxplot if needed (i.e. another user prompt after savePlot, returning another y or n for the variable RepeatPlot). How can I do this? I guess some kind of loop that continues while RepeatPlot == "y". Can anyone help?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
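One way to get the repeated prompt is a while loop that re-asks the question at the end of each pass. The following is only a sketch under assumptions: `repeat_boxplots`, `make_plot` and `ask` are hypothetical names, not part of the original script, and `make_plot` stands in for the bplot()/savePlot() code from the post.

```r
# Hypothetical helper: keeps calling `make_plot` for as long as the user answers "y".
# `ask` defaults to readline(); it is a parameter only so that the loop can also
# be driven non-interactively.
repeat_boxplots <- function(make_plot, ask = readline) {
  n_plots <- 0
  answer <- ask("Would you like to create boxplots for any Feature? (y/n):")
  while (identical(tolower(answer), "y")) {
    feature <- as.numeric(ask("Which Feature would you like to create a Boxplot for?:"))
    make_plot(feature)  # the plotting and saving code from the post would go here
    n_plots <- n_plots + 1
    answer <- ask("Would you like to create another boxplot for any Feature? (y/n):")
  }
  invisible(n_plots)    # number of plots that were requested
}
```

In an interactive session this would be called as repeat_boxplots(function(feature) { ... }), with the plotting code in the function body. Any answer other than y or Y ends the loop.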
[R] Help with plotting plsr loadings
Hi,

I am attempting to do a loadings plot from a plsr object. I have managed to do this using the gasoline data that comes with the pls package. However, when I try this on my own dataset I get the following error message:

  plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers", xlab = "nm")
  Error in loadingplot.default(x, ...) :
    Could not convert variable names to numbers.

  str(BHPLS1_Loadings)
   loadings [1:8892, 1:60] -0.00717 0.00414 0.02611 0.00468 -0.00676 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:8892] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
    ..$ : chr [1:60] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
   - attr(*, "explvar")= Named num [1:60] 2.67 4.14 4.41 3.55 2.59 ...
    ..- attr(*, "names")= chr [1:60] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Can anyone see the problem?
[R] Help with Memory Problems (cannot allocate vector of size)
While doing PLS I found the following problem. The command

  BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife = FALSE, validation = "LOO")

works fine when jackknife is not enabled, but when I try to enable jackknife I get the following error:

  BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife = TRUE, validation = "LOO")
  Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset:

  str(PLSdata)
  'data.frame': 40 obs. of 2 variables:
   $ GroupingList: int 1 1 1 1 1 1 1 1 1 1 ...
   $ PCIList     : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
    ..- attr(*, "dimnames")=List of 2
    .. ..$ : chr "X" "X.1" "X.12" "X.13" ...
    .. ..$ : NULL

  object.size(PLSdata)/1048600
  28.9113560938394 bytes

How can I get around this memory shortage?
[R] Help with PLSR with jack knife
Hi,

I am analysing a dataset of 40 samples, each with 90,000 intensity measures for various peptides. I am trying to identify the biomarkers (i.e. the most significant peptides). I believe that PLS with jack-knifing, or alternatively CMV (cross-model validation), are multivariate methods for this. The 40 samples belong to four different groups.

I have managed to conduct the plsr using the commands

  BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "LOO")

and

  BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "CV")

I have also used the following command to obtain the loadings:

  BHPLS1_Loadings <- loadings(BHPLS1)

Now I am unsure how to use these to identify the significant variables. Do I need to use any loops?

  str(BHPLS1_Loadings)
   loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
    ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
   - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
    ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Many thanks in advance
AK
[R] help with PLSR Loadings
Hi,

When I call for the loadings of my plsr using the command

  x <- loadings(BHPLS1)

my loadings contain variable names rather than numbers:

  str(x)
   loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
    ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
   - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
    ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Here is the structure of the data used to conduct the plsr:

  str(PLSdata)
  'data.frame': 40 obs. of 2 variables:
   $ GroupingList: int 1 1 1 1 1 1 1 1 1 1 ...
   $ PCIList     : AsIs [1:40, 1:94727] 42.01749 40.85915 65.01948 55.98204 61.71673 ...
    ..- attr(*, "dimnames")=List of 2
    .. ..$ : chr "X" "X.1" "X.12" "X.13" ...
    .. ..$ : NULL

Because of this I am not able to do a loadings plot:

  plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers", xlab = "nm")
  Error in loadingplot.default(x, ...) :
    Could not convert variable names to numbers.

What am I doing wrong?

Thanks in advance
[R] Fw: Help with PLSR
Hi,

I am attempting to use plsr, which is part of the pls package in R. I am conducting analysis on datasets to identify which proteins/peptides are responsible for the variance between sample groups (biomarker spotting) in a multivariate fashion.

I have a dataset in R called FullDataListTrans. As you can see below, the data has 40 rows, each representing a sample, and 94,727 columns, each representing a peptide.

  str(FullDataListTrans)
   num [1:40, 1:94727] 42 40.9 65 56 61.7 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:40] "X" "X.1" "X.12" "X.13" ...
    ..$ : NULL

I have also created a vector GroupingList, which gives the group name for each respective sample (row).

  GroupingList
   [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
  [39] 4 4
  str(GroupingList)
   int [1:40] 1 1 1 1 1 1 1 1 1 1 ...

I am now stuck while conducting the plsr. I have tried various methods of creating structured lists etc. and have got nowhere. I have also tried many incarnations of

  BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = FeaturePresenceExpected[1], data = FullDataListTrans, validation = "LOO")

Where am I going wrong? Also, what is the easiest method to identify which of the 94,000 peptides are most important to the variance between groups?

Thanks in advance for any help
Amit Patel
[R] Help with Amelia
Hi,

I have used the amelia command from the Amelia R package. This gives me a number of imputed datasets. This may be a silly question, as I am not a statistician, but I am not sure how to combine these results to obtain the imputed dataset to use for further statistical analysis. I have looked through the Amelia and Zelig manuals but still cannot find the answer. This may be because I don't understand the science behind it. Can anyone help?

Thanks in advance
[R] help with knn.impute
Hi,

I have a dataset from biological data with forty samples which relate to four different treatments. Each sample has thousands of values but, as usual, contains missing values. I want to use knn to impute these missing values, and I am doing this using knn.impute. Do I need to specify the various groups, or can I just use the knn.impute command on the whole dataset together? Also, I am setting the maxp argument to the total number of values for each sample. Is this the correct thing to do?

Thanks in advance
[R] Help with amelia
Hi,

I have a dataset from biological data with forty samples which relate to four different treatments. Each sample has thousands of values but, as usual, contains missing values. I want to use EM to impute these missing values, and I am doing this using amelia. Do I need to specify the various groups, or can I just use the amelia command on the whole dataset (zz) together?

  zzAmelia <- amelia(zz[,-1], m=5)

Also, I'm not sure which value of m should be used.

Thanks in advance
[R] Help with decimal points
Hi,

I have found a little problem with an R script. I am trying to merge some data and am finding something unusual going on. As shown below, I am trying to assign (MatchedValues[Value2,Value]) to (ClusteredData[k,Value]); these are two separate data frames.

1) The following command shows that the value I am transferring is 481844.03:

  MatchedValues[Value2,Value]
  [1] 481844.03
  6618 Levels: 1.00E+07 1.01E+07 1.02E+07 1.04E+07 1.05E+07 1.06E+07 ... Raw

2) But when I try to replace the value using the command below, I get a value of 4420:

  ClusteredData[k,Value] <- MatchedValues[Value2,Value]
  ClusteredData[k,Value]
  [1] 4420

3) So what am I not doing? How can I keep the same value of 481844.03? I have tried

  as.double(MatchedValues[Value2,Value])
  [1] 4420
  as.numeric(MatchedValues[Value2,Value])
  [1] 4420
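The "6618 Levels:" line in the output suggests the column is a factor, in which case as.numeric() returns the internal level code rather than the printed value. A minimal sketch of the usual conversion through as.character(), using a made-up three-level factor `f` in place of the real column:

```r
# Toy factor standing in for the Value column (assumed data, not from the post)
f <- factor(c("481844.03", "1.00E+07", "Raw"))

code  <- as.numeric(f[1])                # the integer level code, not the value
value <- as.numeric(as.character(f[1]))  # the number that is actually printed

# Converting the whole column turns non-numeric entries such as "Raw" into NA
# (R warns about the coercion; the warning is silenced here on purpose).
all_values <- suppressWarnings(as.numeric(as.character(f)))
```

A stray text entry such as "Raw" in the column is a likely reason it was read in as a factor in the first place, so it may be worth checking the input file as well.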
[R] Help with unexpected symbol errors
Hi,

I have a long script which will not run, as I keep getting errors:

  source("clusterfixV1_4.r")
  Error in source("clusterfixV1_4.r") :
    clusterfixV1_4.r: unexpected symbol at
  158: eck[k,2] <- as.numeric(1)
  159: #ClusterInfo[k,2] <- Clustered

I have sorted out all the ones I can, but I am having a problem here. Can anyone tell me the cause of these problems? It is not a very short or straightforward script, so I don't expect you to go through the whole thing, but it would be great if you could give me an indication of what I may be doing wrong. I haven't attached the data that the script uses because I'm pretty sure the principles I have used are right. When running the script I get the above error on lines 158 and 159.

I am more than happy to provide further information (e.g. the dataset) if it helps.

Many thanks in advance
Amit Patel
[R] Fw: Error in rowSums REPOST
For the query below I have also included the following information. Thanks for your replies.

  str(FeaturePresenceMatrix)
   chr [1:65530, 1:40] "0" "0" "0" "0" "1" "0" "0" "0" "0" ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:65530] "4" "5" "6" "7" ...
    ..$ : chr [1:40] "X1" "X2" "X3" "X4" ...

  ?class
  class(FeaturePresenceMatrix)
  [1] "matrix"

Amit Patel wrote:
> Hi
> I am trying to calculate the row sums of a matrix I have created.
> The matrix (FeaturePresenceMatrix) has been created by
> 1) Reading a csv
> 2) Removing unnecessary data using the [-1:4,] command
> 3) Replacing all the NA values with as.numeric(0) and all others with
>    as.numeric(1)
>
> When I carry out the command
>
>   TotalFeature <- rowSums(FeaturePresenceMatrix, na.rm = TRUE)
>
> I get the following error:
>
>   Error in rowSums(FeaturePresenceMatrix, na.rm = TRUE) :
>     'x' must be numeric
>
> Any tips on how I can get round this?

Yes, follow the posting guide and give the list a reproducible example. We don't know a critical piece of information, the class of your data. We know it's *not* numeric though, which is what it needs to be. Use ?class, ?str, and possibly give us a small sample with ?dput. That way, we can reproduce the error.
[R] Error in rowSums
Hi,

I am trying to calculate the row sums of a matrix I have created. The matrix (FeaturePresenceMatrix) has been created by

1) Reading a csv
2) Removing unnecessary data using the [-1:4,] command
3) Replacing all the NA values with as.numeric(0) and all others with as.numeric(1)

When I carry out the command

  TotalFeature <- rowSums(FeaturePresenceMatrix, na.rm = TRUE)

I get the following error:

  Error in rowSums(FeaturePresenceMatrix, na.rm = TRUE) :
    'x' must be numeric

Any tips on how I can get round this?

Thanks in advance
Amit
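The error says the matrix is not numeric; if it holds character strings such as "0" and "1", changing its storage mode is one possible fix. A minimal sketch, assuming every cell is a numeric string (the toy matrix `m` and its dimnames are made up for illustration):

```r
# Toy character matrix standing in for FeaturePresenceMatrix (assumed data)
m <- matrix(c("0", "1", "1", "0", "1", "1"), nrow = 2,
            dimnames = list(c("4", "5"), c("X1", "X2", "X3")))

storage.mode(m) <- "double"            # converts in place, keeps dim and dimnames
totals <- rowSums(m, na.rm = TRUE)     # now works because m is numeric
```

Assigning as.numeric(0) or as.numeric(1) into single cells of a character matrix does not change the type of the matrix as a whole, which may be why step 3 did not have the intended effect.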
[R] Help With ANOVA
Hi,

I need some help with ANOVA. I have a dataset with a known ANOVA p-value, but I cannot seem to re-create it in R.

I have created a list (zzzanova) which contains
1) Intensity values
2) Group number (6 different groups)
3) Sample number (54 different samples)
This is created by the script in Appendix 1.

I then conduct ANOVA with the command

  zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

I get a p-value of

  Pr(>F)1
  0.9483218

The expected p-value is 0.00490, so I feel I may be using ANOVA incorrectly or have put in a wrong formula. I am trying to do an ANOVA analysis across all 6 groups. Is there something wrong with my formula? I think I have made a mistake in the formula rather than anything else.

APPENDIX 1

  datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, 3.003749, -4.60517,
    2.045314, 2.482557, -4.60517, -4.60517, -4.60517, -4.60517, 1.592743, -4.60517,
    -4.60517, 0.91328, -4.60517, -4.60517, 1.827744, 2.457795, 0.355075, -4.60517, 2.39127,
    2.016987, 2.319903, 1.146683, -4.60517, -4.60517, -4.60517, 1.846162, -4.60517, 2.121427, 1.973118,
    -4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816, -4.60517, 0.023703, -4.60517,
    2.043382, 1.070586, 2.768289, 1.085169, 0.959334, -0.02428, -4.60517, 1.371895, 1.533227)

  zzzanova <- structure(list(
      Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)),
      Group = structure(c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,3,
                          4,4,4,4,4,4,4,4,4,4, 5,5,5,5,5,5,5,5,5, 6,6,6,6,6,6,6,6,6),
                        .Label = c("Group1", "Group2", "Group3", "Group4", "Group5", "Group6"),
                        class = "factor"),
      Sample = structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                           19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
                           37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54))),
    .Names = c("Intensity", "Group", "Sample"),
    row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                  19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
                  37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54),
    class = "data.frame")
[R] Help With ANOVA (corrected please ignore last email)
Sorry, I had a misprint in the appendix code in the last email.

Hi,

I need some help with ANOVA. I have a dataset with a known ANOVA p-value, but I cannot seem to re-create it in R.

I have created a list (zzzanova) which contains
1) Intensity values
2) Group number (6 different groups)
3) Sample number (54 different samples)
This is created by the script in Appendix 1.

I then conduct ANOVA with the command

  zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

I get a p-value of

  Pr(>F)1
  0.9483218

The expected p-value is 0.00490, so I feel I may be using ANOVA incorrectly or have put in a wrong formula. I am trying to do an ANOVA analysis across all 6 groups. Is there something wrong with my formula? I think I have made a mistake in the formula rather than anything else.

APPENDIX 1

  datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, 3.003749, -4.60517,
    2.045314, 2.482557, -4.60517, -4.60517, -4.60517, -4.60517, 1.592743, -4.60517,
    -4.60517, 0.91328, -4.60517, -4.60517, 1.827744, 2.457795, 0.355075, -4.60517, 2.39127,
    2.016987, 2.319903, 1.146683, -4.60517, -4.60517, -4.60517, 1.846162, -4.60517, 2.121427, 1.973118,
    -4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816, -4.60517, 0.023703, -4.60517,
    2.043382, 1.070586, 2.768289, 1.085169, 0.959334, -0.02428, -4.60517, 1.371895, 1.533227)

  zzzanova <- structure(list(
      Intensity = datalist,
      Group = structure(c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,3,
                          4,4,4,4,4,4,4,4,4,4, 5,5,5,5,5,5,5,5,5, 6,6,6,6,6,6,6,6,6),
                        .Label = c("Group1", "Group2", "Group3", "Group4", "Group5", "Group6"),
                        class = "factor"),
      Sample = structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                           19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
                           37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54))),
    .Names = c("Intensity", "Group", "Sample"),
    row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                  19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
                  37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54),
    class = "data.frame")
[R] Help with Loops
Hi,

I have made many attempts but can't get the loop right, as I am not a strong programmer. What I am basically trying to do is compare two spreadsheets. The problem is that one of them (TESTSAMP) only contains a portion of the overall data, whereas the other (FULLSAMP) has the full dataset. From the complete set I would like to remove the rows of data which are not in TESTSAMP. Column 1 contains the sample numbers, which can be used to identify samples.

Does anyone have any suggestions? I have tried various things like double loops and so on, but I am sure there is an easier way or function to do this. I tried the method below, but I'm not sure how to keep looping only until a match is found. I don't understand how repeat loops work in R.

  for (i in 1:length(FULLSAMP[,1])) {
    if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
      FULLSAMP <- FULLSAMP[-i,]
    }
  }

Thanks in advance
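Matching on the sample number can be done without an explicit loop by using %in%. A minimal sketch, assuming column 1 of both tables holds the sample numbers as described; the two small data frames and the name `FULLSAMP_matched` are made up for illustration:

```r
# Toy stand-ins for the two spreadsheets (assumed data)
FULLSAMP <- data.frame(Sample = c(101, 102, 103, 104, 105),
                       Intensity = c(1.2, 3.4, 5.6, 7.8, 9.0))
TESTSAMP <- data.frame(Sample = c(102, 105),
                       Intensity = c(3.4, 9.0))

# TRUE for every row of FULLSAMP whose sample number also appears in TESTSAMP
keep <- FULLSAMP[, 1] %in% TESTSAMP[, 1]

# Keep only the matching rows; the rest of FULLSAMP is dropped
FULLSAMP_matched <- FULLSAMP[keep, ]
```

This also avoids a problem with the loop in the post: deleting rows from FULLSAMP while iterating over its original length shifts the remaining rows, so the index i no longer points at the intended row.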
[R] Help with TukeyHSD
Hi,

I am conducting ANOVA using the aov function. I am also conducting TukeyHSD to find out which of the groups show variance. How can I obtain the first three p-values from the list below?

  zzz.aov <- aov(Intensity ~ Group, data = zzzanova)
  TukeyHSD(zzz.aov)

    Tukey multiple comparisons of means
      95% family-wise confidence level

  Fit: aov(formula = Intensity ~ Group, data = zzzanova)

  $Group
                        diff       lwr      upr     p adj
  Group2-Group1  0.778354812 -3.414233 4.970943 0.9607836
  Group3-Group1 -0.734044848 -5.073786 3.605696 0.9698685
  Group4-Group1 -0.000158625 -4.192747 4.192429 1.000
  Group3-Group2 -1.512399661 -5.852140 2.827341 0.7933015
  Group4-Group2 -0.778513438 -4.971101 3.414075 0.9607611
  Group4-Group3  0.733886223 -3.605855 5.073627 0.9698870
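The $Group element of the TukeyHSD result is an ordinary matrix, so the p-values can be picked out by row index and the column name "p adj". A minimal sketch on simulated data (the data frame `toy` is made up; only the extraction step is the point):

```r
set.seed(1)
# Simulated stand-in for zzzanova: 4 groups of 10 values each (assumed data)
toy <- data.frame(Intensity = rnorm(40),
                  Group = factor(rep(paste0("Group", 1:4), each = 10)))

fit <- aov(Intensity ~ Group, data = toy)
tuk <- TukeyHSD(fit)

# Rows are the pairwise comparisons, columns are diff, lwr, upr and "p adj"
first_three <- tuk$Group[1:3, "p adj"]
```

The result is a named numeric vector, so a single comparison can also be looked up by name, for example first_three["Group2-Group1"].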
[R] Help with ANOVA in R
Hi,

I am attempting ANOVA analysis to compare results from four groups (Samp1-4), which are lists of intensities from the experiment. I am doing this by first creating a structured list of the data and then conducting the ANOVA (script provided below). I'm an R beginner, so I am not sure if I am using this correctly. The two major questions I have are:

1) Is using the code (zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)) the correct method to calculate the variances between the four groups (Samp 1-4)? I am unsure about the inclusion of the error portion.

2) I believe this method (aov) assumes equal variances. How can I adjust this to do an ANOVA with unequal variances?

  #SCRIPT STARTS
  # Creates a structured list suitable for ANOVA analysis
  # Intensity  Group (1,2,3,4)  Sample (1:62)
  zzzanova <- structure(list(
      Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)),
      Group = structure(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
                          2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
                          3,3,3,3,3,3,3,3,3,3,3,3,3,3,
                          4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4),
                        .Label = c("Group1", "Group2", "Group3", "Group4"),
                        class = "factor"),
      Sample = structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
                           17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
                           33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
                           49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62))),
    .Names = c("Intensity", "Group", "Sample"),
    row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
                  17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
                  33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
                  49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62),
    class = "data.frame")

  # Conducts the ANOVA for that PCI
  zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)
  #SCRIPT ENDS

Thanks in advance
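On the second question, base R's oneway.test() performs a one-way comparison of means that does not assume equal variances (a Welch-type test) when var.equal = FALSE. A minimal sketch on simulated data with the same group sizes as the post; the data frame `toy` is made up, and whether this test suits the real experimental design is a separate question:

```r
set.seed(2)
# Simulated stand-in for zzzanova with deliberately unequal spreads (assumed data)
toy <- data.frame(
  Intensity = c(rnorm(16, mean = 0, sd = 1), rnorm(16, mean = 0.5, sd = 3),
                rnorm(14, mean = 0, sd = 0.5), rnorm(16, mean = 1, sd = 2)),
  Group = factor(rep(c("Group1", "Group2", "Group3", "Group4"),
                     times = c(16, 16, 14, 16)))
)

# One-way analysis of means without the equal-variance assumption
welch <- oneway.test(Intensity ~ Group, data = toy, var.equal = FALSE)
p_value <- welch$p.value
```

Note that this formula has no Error(Sample) term: with one observation per sample, as in the posted data frame, there is nothing for a per-sample error stratum to estimate.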
[R] Bartlett Test
Hi,

I am trying to conduct a Bartlett test between two groups, Samp 1 and Samp 2, both of which are vectors of equal length. I can't find any information on how to do this. Does the data need to be in a structured list?

Thanks in advance
Re: [R] Bartlett Test
Apparently the vectors that I have are lists, and when I try to conduct the Bartlett test I get errors. I may be using it wrong; I am not an R expert.

  xbartlett2 <- c(Samp1, Samp2)  # Samp1 & 2 are lists
  gbartlett24 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                   2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)

Attempt 1

  gazant <- bartlett.test(xbartlett2 ~ gbartlett24)$p.value
  Error in model.frame.default(formula = xbartlett2 ~ gbartlett24) :
    invalid type (list) for variable 'xbartlett2'

  xbartlett2 <- c(Samp1, Samp2)
  gazant <- bartlett.test(xbartlett2, gbartlett24)$p.value
  Error in bartlett.test.default(xbartlett2, gbartlett24) :
    there must be at least 2 observations in each group

(There are at least 2 unique observations in each group.)

Thanks in advance

----- Original Message -----
From: Richardson, Patrick patrick.richard...@vai.org
To: Amit Patel amitrh...@yahoo.co.uk; r-help@r-project.org r-help@r-project.org
Sent: Mon, 1 March, 2010 14:31:15
Subject: RE: [R] Bartlett Test

?bartlett.test

From the help page:

If x is a list, its elements are taken as the samples or fitted linear models to be compared for homogeneity of variances. In this case, the elements must either all be numeric data vectors or fitted linear model objects, g is ignored, and one can simply use bartlett.test(x) to perform the test. If the samples are not yet contained in a list, use bartlett.test(list(x, ...)).

Otherwise, x must be a numeric data vector, and g must be a vector or factor object of the same length as x giving the group for the corresponding elements of x.

Patrick

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Amit Patel
Sent: Monday, March 01, 2010 9:26 AM
To: r-help@r-project.org
Subject: [R] Bartlett Test

Hi I am trying to conduct a Bartlett test between two groups, Samp 1 and Samp 2, both of which are vectors of equal length. I can't find any information on how to do this. Does the data need to be in a structured list?

Thanks in advance
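Going by the help text quoted above, a list passed as x is read as one sample per element, which would explain the "at least 2 observations" error if each element holds a single value. A minimal sketch that flattens the lists into one numeric vector first; the two lists below are made-up stand-ins for Samp1 and Samp2, and `x`, `g` and `p` are illustrative names:

```r
# Toy stand-ins: each sample stored as a list of single numbers (assumed data)
Samp1 <- as.list(c(2.1, 2.5, 1.9, 2.8, 2.2, 2.4))
Samp2 <- as.list(c(3.9, 1.2, 5.1, 0.4, 2.7, 4.4))

# Flatten to a plain numeric vector and build a matching grouping factor
x <- unlist(c(Samp1, Samp2))
g <- factor(rep(c(1, 2), times = c(length(Samp1), length(Samp2))))

# Numeric data vector plus grouping factor, as described on the help page
p <- bartlett.test(x, g)$p.value
```

An equivalent call in the list form would be bartlett.test(list(unlist(Samp1), unlist(Samp2))), where each list element is a whole sample.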
[R] Help with averaging
Hi,

I am using the following script to average a set of data of 62 columns into 31 columns. The data contains values of ln(0.01), i.e. -4.60517, instead of NAs. The values need to be averaged for each row (i.e. 2 values being averaged). What would I need to change to meet these conditions?

1. If each run of the sample has a value, the average is given.
2. If only one run of the sample has a value, that value is given (i.e. no averaging).
3. If both runs have missing values, NA is given.

I have tried changing (na.strings = "NA") to (na.strings = "-4.60517"), but this causes all the pairs with even one NA to return an NA value. I would prefer not to use for loops, as this would slow the script down considerably.

  #SCRIPT STARTS
  rm(list = ls())
  setwd("C:/Documents and Settings/Amit Patel")

  # na.strings makes NAs readable
  zz <- read.csv("Pr9549_LabelFreeData_ByExperimentAK.csv", strip.white = TRUE, na.strings = "NA")

  ix <- seq(from = 2, to = 62, by = 2)
  averagedResults <- (zz[, ix] + zz[, ix + 1]) / 2
  averagedResults <- cbind(zz[, 1], averagedResults)

  colnames(averagedResults) <- c("PCI", "G1-C1", "G1-C2", "G1-C3", "G1-C4", "G1-C5", "G1-C6", "G1-C7", "G1-C8",
    "G2-C9", "G2-C10", "G2-C11", "G2-C12", "G2-C13", "G2-C14", "G2-C15", "G2-C16",
    "G3-C17", "G3-C18", "G3-C19", "G3-C20", "G3-C21", "G3-C22", "G3-C23",
    "G4-C24", "G4-C25", "G4-C26", "G4-C27", "G4-C28", "G4-C29", "G4-C30", "G4-C31")

  write.csv(averagedResults, file = "Pr9549_averagedreplicates.csv", row.names = FALSE)
  #SCRIPT ENDS
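The three conditions can be met without a loop by dividing the sum of the available values by the number of available values in each pair. A minimal sketch, assuming the -4.60517 placeholders have already been read in as NA (for example with na.strings = "-4.60517"); the small data frame and the names `a`, `b`, `n`, `s` and `avg` are made up for illustration:

```r
# Toy stand-in for zz: an ID column followed by two pairs of replicate runs
zz <- data.frame(PCI = 1:3,
                 r1 = c(2, NA, NA), r2 = c(4, 6, NA),
                 r3 = c(NA, 1, 5),  r4 = c(NA, 3, 7))

ix <- seq(from = 2, to = 4, by = 2)   # first column of each pair
a <- as.matrix(zz[, ix])              # first run of every pair
b <- as.matrix(zz[, ix + 1])          # second run of every pair

n <- (!is.na(a)) + (!is.na(b))                        # values available per pair: 0, 1 or 2
s <- ifelse(is.na(a), 0, a) + ifelse(is.na(b), 0, b)  # sum of the available values

avg <- s / n          # mean of two values, or the single value on its own
avg[n == 0] <- NA     # both runs missing: report NA rather than NaN
```

For the real data, ix would run from 2 to 62 as in the script above, and cbind(zz[, 1], avg) would restore the PCI column.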
[R] T.test error help
Hi,

I am running t.test in a loop, and where the data are the same I get an error message:

  Error in t.test.default(Samp3, Samp1, na.rm = TRUE, var.equal = FALSE, :
    data are essentially constant

The script I am using is

  for (i in 1:length(zz[,1])) {
    Samp1 <- zz[i,2:17]
    Samp2 <- zz[i,18:33]
    Samp3 <- zz[i,34:47]
    Samp4 <- zz[i,48:63]

    TTestResult[i,2] <- t.test(Samp2, Samp1, na.rm=TRUE, var.equal = FALSE, paired=FALSE, conf.level=0.95)$p.value
    TTestResult[i,3] <- t.test(Samp3, Samp1, na.rm=TRUE, var.equal = FALSE, paired=FALSE, conf.level=0.95)$p.value
    TTestResult[i,4] <- t.test(Samp4, Samp1, na.rm=TRUE, var.equal = FALSE, paired=FALSE, conf.level=0.95)$p.value
  }

Is there a way to make my loop ignore this problem and go on to the next iteration?
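One common way to keep a loop running past such an error is to wrap the call in tryCatch() and record NA for that row. A minimal sketch; `safe_t_p` is a hypothetical helper name and the input vectors are made up:

```r
# Hypothetical wrapper: returns the Welch t-test p-value, or NA if t.test() fails
safe_t_p <- function(x, y) {
  tryCatch(
    t.test(x, y, var.equal = FALSE, paired = FALSE, conf.level = 0.95)$p.value,
    error = function(e) NA_real_   # e.g. "data are essentially constant"
  )
}

# Constant data would normally stop the loop; here it just yields NA
p_constant <- safe_t_p(rep(-4.60517, 16), rep(-4.60517, 16))

# Ordinary data still gives a p-value
p_ok <- safe_t_p(c(1.2, 2.3, 1.9, 2.8, 2.1), c(3.1, 3.9, 2.7, 4.4, 3.6))
```

Inside the loop from the post, each assignment would then look like TTestResult[i,2] <- safe_t_p(unlist(Samp2), unlist(Samp1)), where unlist() turns the one-row data frame into a plain numeric vector.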
[R] Help with if statements
Hi,

I am trying to create a column in a data frame which gives a significance score from 0 to 7. It should read values from 7 different columns and add 1 to the counter if the value is <= 0.05. I get an error message saying

  Error in if (ALLRESULTS[i, 16] <= 0.05) significance_count = significance_count + :
    missing value where TRUE/FALSE needed

The script is included below. It works if I convert the NA values to zero, but this is not appropriate, as it counts the zero as significant. Any suggestions?

  #SCRIPT STARTS
  for (i in 1:length(ALLRESULTS[,1])) {
    significance_count = 0

    if (ALLRESULTS[i,16] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,17] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,18] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,19] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,20] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,21] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    if (ALLRESULTS[i,22] <= 0.05) significance_count = significance_count + 1
    else significance_count = significance_count

    ALLRESULTS[i,23] <- significance_count
  }
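The count can be computed for all rows at once by comparing the p-value columns with 0.05 and summing the TRUE values per row, with na.rm = TRUE so that missing values are simply not counted. A minimal sketch using a made-up three-column data frame `pvals` in place of columns 16 to 22 of ALLRESULTS:

```r
# Toy stand-in for the p-value columns (assumed data)
pvals <- data.frame(p1 = c(0.01, NA, 0.50),
                    p2 = c(0.04, 0.03, NA),
                    p3 = c(0.20, 0.05, 0.90))

# pvals <= 0.05 is a logical matrix; NA comparisons stay NA and are dropped
sig_count <- rowSums(pvals <= 0.05, na.rm = TRUE)
```

On the real data this would be something like ALLRESULTS[, 23] <- rowSums(ALLRESULTS[, 16:22] <= 0.05, na.rm = TRUE), assuming those seven columns are numeric.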
[R] help to speed up loops in r
Hi,

I am using a script which involves the following loop. It attempts to reduce a data frame (zz) of 95000 * 41 down to a data frame (averagedreplicates) of 95000 * 21 by averaging the replicate values, as you can see in the script below. This script, however, is very slow (2 days). Any suggestions to speed it up?

NB: I have also tried using rowMeans rather than adding the 2 values and dividing by 2 (same problem).

    #SCRIPT STARTS
    for (i in 1:length(averagedreplicates[,1]))
    #for (i in 1:dim(averagedreplicates)[1])
    {
      cat(i,'\n')
      #calculates Means
      #Sample A
      averagedreplicates[i,2] <- (zz[i,2] + zz[i,3])/2
      averagedreplicates[i,3] <- (zz[i,4] + zz[i,5])/2
      averagedreplicates[i,4] <- (zz[i,6] + zz[i,7])/2
      averagedreplicates[i,5] <- (zz[i,8] + zz[i,9])/2
      averagedreplicates[i,6] <- (zz[i,10] + zz[i,11])/2
      #Sample B
      averagedreplicates[i,7] <- (zz[i,12] + zz[i,13])/2
      averagedreplicates[i,8] <- (zz[i,14] + zz[i,15])/2
      averagedreplicates[i,9] <- (zz[i,16] + zz[i,17])/2
      averagedreplicates[i,10] <- (zz[i,18] + zz[i,19])/2
      averagedreplicates[i,11] <- (zz[i,20] + zz[i,21])/2
      #Sample C
      averagedreplicates[i,12] <- (zz[i,22] + zz[i,23])/2
      averagedreplicates[i,13] <- (zz[i,24] + zz[i,25])/2
      averagedreplicates[i,14] <- (zz[i,26] + zz[i,27])/2
      averagedreplicates[i,15] <- (zz[i,28] + zz[i,29])/2
      averagedreplicates[i,16] <- (zz[i,30] + zz[i,31])/2
      #Sample D
      averagedreplicates[i,17] <- (zz[i,32] + zz[i,33])/2
      averagedreplicates[i,18] <- (zz[i,34] + zz[i,35])/2
      averagedreplicates[i,19] <- (zz[i,36] + zz[i,37])/2
      averagedreplicates[i,20] <- (zz[i,38] + zz[i,39])/2
      averagedreplicates[i,21] <- (zz[i,40] + zz[i,41])/2
    }
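Element-by-element assignment into a data frame is usually the slow part of a loop like this. A sketch of a whole-column alternative follows; the toy zz (5 rows of random numbers) and the names m, first and second are assumptions for illustration.

```r
# Toy data: 1 identifier column plus 40 measurement columns (hypothetical values)
set.seed(1)
zz <- data.frame(id = 1:5, matrix(runif(5 * 40), nrow = 5))

# Work on a numeric matrix of the 40 measurement columns
m <- as.matrix(zz[, 2:41])

# Positions of the first and second replicate within each pair
first  <- seq(1, 39, by = 2)
second <- first + 1

# Average all 20 pairs for every row in one step
averagedreplicates <- data.frame(id = zz[, 1], (m[, first] + m[, second]) / 2)
```

The result has the identifier in column 1 and the 20 pair averages in columns 2 to 21, which matches the layout the loop was filling in.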
[R] help with boxplots
Hi,

I have a vector of data, let's call it zz (40 values from 4 samples). The data is already in groups; I can even split up the samples using:

    SampA <- zz[,2:11]
    SampB <- zz[,12:21]
    SampC <- zz[,22:31]
    SampV <- zz[,32:41]

I would like an output that gives me 4 boxplots on one plot, one boxplot for each set of 10 values. How can I do this in R?
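boxplot() accepts a named list and draws one box per element, so one option is to turn each group into a plain numeric vector first. The sketch below assumes zz is a one-row data frame with an identifier in column 1; the random values are placeholders.

```r
# Toy one-row data frame: identifier plus 40 values (hypothetical data)
set.seed(1)
zz <- data.frame(id = "row1", matrix(rnorm(40), nrow = 1))

# unlist() turns each one-row slice into an ordinary numeric vector
groups <- list(
  SampA = unlist(zz[1, 2:11],  use.names = FALSE),
  SampB = unlist(zz[1, 12:21], use.names = FALSE),
  SampC = unlist(zz[1, 22:31], use.names = FALSE),
  SampV = unlist(zz[1, 32:41], use.names = FALSE)
)

# Four boxes side by side on one plot; the summary statistics are returned invisibly
bp <- boxplot(groups, xlab = "Groups", ylab = "Value")
```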
[R] Help with loops
Hi,

I am trying to create a loop which averages replicates in my data. The original data has many rows and consists of 40 columns, zz[,2:41], plus row headings in zz[,1]. I am trying to average each set of values (i.e. zz[1,2:3] averaged and placed in average_value[1,2], and so on). Below is my script, but it seems to be stuck in an endless loop. Any suggestions?

    for (i in 1:length(average_value[,1])) {
      average_value[i] <- i^100; print(average_value[i])
      #calculates Means
      #Sample A
      average_value[i,2] <- rowMeans(zz[i,2:3])
      average_value[i,3] <- rowMeans(zz[i,4:5])
      average_value[i,4] <- rowMeans(zz[i,6:7])
      average_value[i,5] <- rowMeans(zz[i,8:9])
      average_value[i,6] <- rowMeans(zz[i,10:11])
      #Sample B
      average_value[i,7] <- rowMeans(zz[i,12:13])
      average_value[i,8] <- rowMeans(zz[i,14:15])
      average_value[i,9] <- rowMeans(zz[i,16:17])
      average_value[i,10] <- rowMeans(zz[i,18:19])
      average_value[i,11] <- rowMeans(zz[i,20:21])
      #Sample C
      average_value[i,12] <- rowMeans(zz[i,22:23])
      average_value[i,13] <- rowMeans(zz[i,24:25])
      average_value[i,14] <- rowMeans(zz[i,26:27])
      average_value[i,15] <- rowMeans(zz[i,28:29])
      average_value[i,16] <- rowMeans(zz[i,30:31])
      #Sample D
      average_value[i,17] <- rowMeans(zz[i,32:33])
      average_value[i,18] <- rowMeans(zz[i,34:35])
      average_value[i,19] <- rowMeans(zz[i,36:37])
      average_value[i,20] <- rowMeans(zz[i,38:39])
      average_value[i,21] <- rowMeans(zz[i,40:41])
    }

Thanks
[R] Fw: Help with loops(corrected question)
--- On Fri, 15/5/09, Amit Patel amitrh...@yahoo.co.uk wrote:

From: Amit Patel amitrh...@yahoo.co.uk
Subject: Help with loops
To: r-help@r-project.org
Date: Friday, 15 May, 2009, 12:17 PM

Hi,

I am trying to create a loop which averages replicates in my data. The original data has many rows and consists of 40 columns, zz[,2:41], plus row headings in zz[,1]. I am trying to average each set of values (i.e. zz[1,2:3] averaged and placed in average_value[1,2], and so on). Below is my script, but it seems to be stuck in an endless loop. Any suggestions?

    for (i in 1:length(zz[,1])) {
      #calculates Means
      #Sample A
      average_value[i,2] <- rowMeans(zz[i,2:3])
      average_value[i,3] <- rowMeans(zz[i,4:5])
      average_value[i,4] <- rowMeans(zz[i,6:7])
      average_value[i,5] <- rowMeans(zz[i,8:9])
      average_value[i,6] <- rowMeans(zz[i,10:11])
      #Sample B
      average_value[i,7] <- rowMeans(zz[i,12:13])
      average_value[i,8] <- rowMeans(zz[i,14:15])
      average_value[i,9] <- rowMeans(zz[i,16:17])
      average_value[i,10] <- rowMeans(zz[i,18:19])
      average_value[i,11] <- rowMeans(zz[i,20:21])
      #Sample C
      average_value[i,12] <- rowMeans(zz[i,22:23])
      average_value[i,13] <- rowMeans(zz[i,24:25])
      average_value[i,14] <- rowMeans(zz[i,26:27])
      average_value[i,15] <- rowMeans(zz[i,28:29])
      average_value[i,16] <- rowMeans(zz[i,30:31])
      #Sample D
      average_value[i,17] <- rowMeans(zz[i,32:33])
      average_value[i,18] <- rowMeans(zz[i,34:35])
      average_value[i,19] <- rowMeans(zz[i,36:37])
      average_value[i,20] <- rowMeans(zz[i,38:39])
      average_value[i,21] <- rowMeans(zz[i,40:41])
    }

Thanks
[R] Help- extracting values
I have csv files imported into R, each with 2 columns and many, many rows. I have sorted the data in them but want to extract some values. The first column is an ID. The second is a p-value (now sorted in increasing order with NAs last). I want to extract the rows with a p-value of less than 0.05. What commands would help?

The table is called AnovaSort, with column headings MCI and p-value.

Many thanks in advance
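Logical indexing is probably the simplest route here. The sketch below uses a made-up AnovaSort and assumes the second column is named p.value, which is what read.csv() would normally turn a "p-value" heading into; check names(AnovaSort) on the real table.

```r
# Toy table standing in for the imported csv (hypothetical values)
AnovaSort <- data.frame(
  MCI     = c(101, 102, 103, 104),
  p.value = c(0.001, 0.03, 0.20, NA)
)

# Keep rows whose p-value is present and below 0.05;
# the !is.na() part stops NA rows from appearing in the result
keep <- !is.na(AnovaSort$p.value) & AnovaSort$p.value < 0.05
significant <- AnovaSort[keep, ]
```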
[R] Help with ANOVA p-values
Hi,

I have done ANOVA on a dataset (see below) but am having problems retrieving the p-value. I am assuming that Pr(>F) is the p-value, but cannot get this value, or in fact any other value (e.g. Df), from the summary. Any suggestions? I have tried:

    sum <- summary(zzz.aov)
    sum$Pr(>F)
    Error: unexpected '>' in "sum$Pr(>"
    sum$"Pr(>F)"
    NULL
    sum$Df
    NULL

    zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)
    summary(zzz.aov)

    Error: Sample
          Df     Sum Sq    Mean Sq
    Group  1 6.0313e+10 6.0313e+10

    Error: Within
              Df     Sum Sq    Mean Sq F value Pr(>F)
    Group      3 2.6012e+10 8.6707e+09  0.2934 0.8299
    Residuals 34 1.0049e+12 2.9556e+10
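With an Error() term, summary() returns a list with one entry per error stratum, and each entry is itself a list holding a data frame, which is probably why `$Df` on the top level gives NULL. The following is a sketch on simulated data (the data frame below is invented; only the model formula comes from the post):

```r
# Simulated data with the same variable names as the post (hypothetical values)
set.seed(1)
zzzanova <- data.frame(
  Sample    = factor(rep(1:10, each = 4)),
  Group     = factor(rep(c("A", "B", "C", "V"), times = 10)),
  Intensity = rnorm(40)
)

zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)
s <- summary(zzz.aov)

# Pick the stratum by name, then the data frame inside it
within_tab <- s[["Error: Within"]][[1]]

# Row names in these tables are padded with spaces, so index rows by position
p_value <- within_tab[["Pr(>F)"]][1]
df_group <- within_tab[["Df"]][1]
```

`names(s)` lists the available strata, which is a quick way to see what can be extracted from a given fit.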
[R] Help with Wilcoxon Test
Hi,

I have 2 sets of data that I want to do a Wilcoxon test on. They are of the same dimension. One has 4 zero values and the other has 5.

    dim(SampA)
    [1]  1 10
    dim(SampV)
    [1]  1 10

I get the following error:

    Error in wilcox.test.default(SampA, SampV, na.rm = TRUE, paired = FALSE, :
      'x' must be numeric

I am using the function:

    wilcox.test(SampA, SampV, na.rm=TRUE, paired=FALSE, conf.level=0.95)
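Since dim() reports 1 x 10, SampA and SampV are probably one-row data frames rather than numeric vectors, and wilcox.test() rejects those. A sketch of the conversion, on invented values with the same number of zeros as described:

```r
# Toy one-row data frame: identifier plus 20 values (hypothetical data)
zz <- data.frame(
  id = "g1",
  matrix(c(0, 0, 0, 0, 3, 5, 2, 8, 6, 4,
           0, 0, 0, 0, 0, 7, 9, 1, 3, 2), nrow = 1)
)
SampA <- zz[1, 2:11]
SampV <- zz[1, 12:21]

# Flatten each one-row slice to a plain numeric vector before testing;
# exact = FALSE uses the normal approximation, since ties rule out an exact p-value
res <- wilcox.test(as.numeric(unlist(SampA)), as.numeric(unlist(SampV)),
                   paired = FALSE, exact = FALSE)
res$p.value
```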
[R] Help with t.test
Hi,

I have managed to do a paired t-test with a data set. I have 5 columns of data I'm dealing with:

    GENE  SampA  SampB  SampC  SampVehicle
    ctcc  859    na     145    24
    gtcg  45     5      54     69

and so on, but they are much larger columns. Each column has been split, and I can do a t.test on, for example, sampA by doing:

    t.test(sampA, SampVehicle, na.rm=TRUE, paired=TRUE, conf.level=0.95)

What can I do to identify which of the genes are responsible for the most variance or are the most significant?

Thanks in advance
[R] help with reshaping
Hi,

I'm having some problems reshaping. I've managed to apply it but have some problems; the attached document will explain. Any help is appreciated.
[R] help with reshaping (no file attached)
    MCI    A1        A2       A13      A14      A23      A24       A33      A34  Grouped together
    56766  N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
    6459   N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
    31233  N/A       N/A      N/A      N/A      N/A      N/A       71280.7  N/A  N/A
    16790  N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
    13392  284699.6  N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
    1575   N/A       1196152  1236735  1322735  1100289  887130.2  N/A      N/A  N/A

Figure 1 - Takeda2_nas.csv

I am trying to get the data into a suitable format for Genstat.

    #clear console
    rm(list=ls())
    install.packages("reshape")
    library(reshape)

    #Read in data
    #Takeda <- read.table("Takeda2_nas.csv", sep = ",", header = TRUE, row.names = 1)
    #Takedastack <- cbind(Takeda[gl(nrow(Takeda), 1, 40*nrow(Takeda)), 1], stack(Takeda[, 1:41]))
    zz <- read.csv("Takeda2_nas.csv", strip.white = TRUE)
    #zzz <- cbind(zz[gl(nrow(zz), 1, 40*nrow(zz)), 1], stack(zz[, 2:41]))

    #Use reshape function to change data
    zzz <- reshape(zz, varying=list(c("A1","A2","A13","A14","A23","A24","A33","A34","A39","A40","B9","B10","B5","B6","B15","B16","B27","B28","B31","B32","C3","C4","C7","C8","C11","C12","C17","C18","C21","C22","V19","V20","V25","V26","V29","V30","V35","V36","V37","V38")), direction="long")
    #not ideal result i wanted

    #write a table to excel
    write.table(zzz, "Takedashift.csv", sep=",")

Script 1 - datashift.r

The result with the above commands:

    MCI (NONSENSE)  Time (Actually MCI)  A1 (1 = A1 and so on)  Id (Intensity)  (COUNT)
    1.1             56766                1                      N/A             1
    2.1             6459                 1                      N/A             2
    3.1             31233                1                      N/A             3

I want the data to be:

    MCI    ID(sample)  Intensity
    56766  A1          N/A
    6459   A1          N/A
    31233  A1          N/A
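One way to get the three-column layout without reshape() is to repeat the MCI column once per sample and flatten the sample columns in the same order. This is a sketch on a cut-down, invented table (three sample columns instead of 40); the names sample_cols and long are assumptions.

```r
# Cut-down toy version of the wide table (hypothetical values)
zz <- data.frame(
  MCI = c(56766, 6459, 31233),
  A1  = c(NA, NA, NA_real_),
  A2  = c(NA, 1196152, NA),
  A13 = c(5, NA, 71280.7)
)

# Every column except MCI is treated as a sample
sample_cols <- setdiff(names(zz), "MCI")

long <- data.frame(
  MCI       = rep(zz$MCI, times = length(sample_cols)),     # MCI block repeated per sample
  ID        = rep(sample_cols, each = nrow(zz)),            # sample name for each block
  Intensity = unlist(zz[sample_cols], use.names = FALSE)    # columns laid end to end
)
```

When reading the real file, passing `na.strings = "N/A"` to read.csv() should make the N/A entries come in as proper missing values, so the sample columns stay numeric.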
[R] ANOVA in R
Hi,

I have a very large dataset that I would like to conduct ANOVA tests on. I'm not a very strong programmer, so any help would be appreciated. The format is:

    Identifier  A1  A2  B1  B2  C1  C2  Norm1  Norm2
    1234        1   1   NA  NA  4   3   NA     NA
    4567        2   2   4   4   8   8   9      9

and so on. I have 10 runs for 3 different doses plus the normal state. Any help greatly appreciated.
[R] t-test
When doing the t-test in the manner below, will R compare each element of the array with the corresponding one? I.e. if I was comparing x and y, would (1 and 0) and (1 and 9) be treated as separate variables, or does it just assume one variable?

    # test data
    x <- c(1,1.1,1.15,1.2,1.21,1.23)
    y <- c(0.9,1,1.16,1.18,1.19,1.2)
    z <- c(1.4,1.42,1.43,1.44,1.45,1.46)

    ### Student's t-test
    # for help in R type ?t.test()
    # defaults are:
    # alternative = "two.sided" i.e. two-sided t-test
    # var.equal = FALSE i.e. unequal variance
    # note:
    # na.rm = TRUE removes missing values
    # $p.value gives the p-value for the test
    t.test(x, y, na.rm=TRUE, paired=FALSE)$p.value # gives 0.5026467
    t.test(x, z, na.rm=TRUE, paired=FALSE)$p.value # gives 0.0003166352
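With paired = FALSE, t.test() treats x and y as two independent samples, so the position of a value within its vector does not matter; only paired = TRUE compares element with element. A small sketch using the x and y from the post (the variable names p_unpaired, p_shuffled and p_paired are made up):

```r
x <- c(1, 1.1, 1.15, 1.2, 1.21, 1.23)
y <- c(0.9, 1, 1.16, 1.18, 1.19, 1.2)

# Unpaired: two independent samples, order within each vector is irrelevant
p_unpaired <- t.test(x, y, paired = FALSE)$p.value
p_shuffled <- t.test(x, rev(y), paired = FALSE)$p.value  # same p-value as above

# Paired: works on the element-wise differences x[i] - y[i]
p_paired <- t.test(x, y, paired = TRUE)$p.value
```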
[R] Stacking data
Hi,

I have data in the format below:

    Age    V1     V2     V3     V4
    23646  45190  50333  55166  56271
    26174  35535  38227  37911  41184
    27723  25691  25712  26144  26398

and would like to sort it as follows:

    Age    values  ind
    23646  45190   V1
    26174  35535   V1
    27723  25691   V2
    27193  30949   V2

But I have 41 columns (age column + 40 individuals). I have the following script, but an error is thrown up. Can anyone help? Where am I going wrong?

    zz <- read.csv("Filename.csv", strip.white = TRUE)
    zzz <- cbind(zz[gl(nrow(zz), 1, 40*nrow(zz)), 1], stack(zz[, 2:41]))

ERROR MESSAGE

    Error in data.frame(..., check.names = FALSE) :
      arguments imply differing number of rows: 3789080, 0
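The "0" in the error suggests stack() returned no rows. One possible cause (an assumption, not confirmed by the post) is that the value columns were read as factors, which stack() leaves out. On purely numeric columns the approach works, as this sketch on a small invented table shows:

```r
# Toy table with numeric columns only (hypothetical values)
zz <- data.frame(
  Age = c(23646, 26174, 27723),
  V1  = c(45190, 35535, 25691),
  V2  = c(50333, 38227, 25712)
)

# stack() puts all value columns end to end and records the source column in 'ind'
stacked <- stack(zz[, -1])

# Repeat the Age column once per stacked column so the rows line up
zzz <- data.frame(Age = rep(zz$Age, times = ncol(zz) - 1), stacked)
```

Checking `sapply(zz, class)` on the real data would show whether any of the 40 columns are not numeric.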
[R] structure Arrays
Hi,

Does anyone know how I can use structured arrays in R, similar to a dataframe in matlab?
[R] PCA HCA
Hi,

I am attempting PCA and HCA on a dataset. The head of the table looks like this:

    Variable Samp1 Samp2 Samp3
    109232 276 352 222 244

I can't stop R from treating the 1st column as a sample.
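A common way to keep an identifier column out of the analysis is to read it in as row names. The sketch below uses an inline, invented csv (the values and the three-sample layout are assumptions) and treats samples as the observations:

```r
# Inline toy csv standing in for the real file (hypothetical values)
csv_text <- "Variable,Samp1,Samp2,Samp3
109,232,276,352
110,222,244,301
111,198,260,340"

# row.names = 1 turns the first column into row labels instead of data
dat <- read.csv(text = csv_text, row.names = 1)

# Transpose so that each sample is one observation
pca <- prcomp(t(dat), center = TRUE)
hca <- hclust(dist(t(dat)))
```

For a file on disk the same argument applies, e.g. `read.csv("myfile.csv", row.names = 1)`, where the file name is a placeholder.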
[R] HI
Does anyone know an easy way to convert all the zero values in an imported csv table into NAs?
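Logical indexing works on a whole data frame at once. A minimal sketch on an invented table:

```r
# Toy table standing in for the imported csv (hypothetical values)
dat <- data.frame(a = c(0, 1, 2), b = c(3, 0, 0))

# dat == 0 is a logical matrix; assigning NA replaces every matching cell
dat[dat == 0] <- NA
```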
[R] Dealing with NA's in a data matrix
Hi,

I have a matrix with NA values that I would like to convert to a value of 0. Any suggestions?

Kind Regards
Amit Patel
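is.na() returns a logical matrix of the same shape, which can be used directly as an index. A minimal sketch on an invented matrix:

```r
# Toy matrix with some missing values (hypothetical values)
m <- matrix(c(1, NA, 3, NA, 5, 6), nrow = 2)

# Replace every NA cell with 0, leaving the other cells untouched
m[is.na(m)] <- 0
```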
[R] Doing PCA
Hi Fellow R enthusiasts,

I have managed to reshape my data using a much shorter script than before. Woohoo! However, now I have new problems. The code is below. There are no problems with the create matrix section. The problem code is highlighted in bold. I am trying to do PCA on the data. Here are the errors.

Error 1

    code : OGSscaled = rangescale(OGS)
    error message : Error in dim(newX) <- c(prod(d.call), d2) :
      dims [product 47960] do not match the length of object [43600]

Error 2: I tried to do PCA without range scaling.

    code : OGSpca <- prcomp(OGS, center=FALSE)
    error message : Error in svd(x, nu = 0) : infinite or missing values in 'x'

CODE

    ### Create matrix ###

    # load packages
    require(reshape)
    source("rangescale.r")

    #Open the csv file
    OGSdata <- read.table("MG3199.csv", sep=",", header=TRUE)

    #create matrix
    x.m <- melt(OGSdata, measure.var="pct")
    OGS <- cast(x.m, mci ~ sample)

    ### PCA ###

    #scale profiles
    OGSscaled = rangescale(OGS)

    #do PCA
    result = prcomp(OGS, center=FALSE)

    #obtain scores matrix
    scores=result$rotation

    #PC1 vs PC2 plot
    plot(scores[,1], scores[,2], xlab="PC1", ylab="PC2")

    #add labels (0.005 and 0.003 offset to avoid obscuring points

Kind Regards
Amit Patel
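The second error says prcomp() was given missing or infinite values, which a cast() result can contain when some mci/sample combinations are absent. A sketch of one way round it, dropping incomplete rows first, on an invented matrix (whether dropping rows is appropriate for the real data is a separate question):

```r
# Toy matrix standing in for OGS, with one missing cell (hypothetical values)
set.seed(1)
OGS <- matrix(rnorm(60), nrow = 10, dimnames = list(NULL, paste0("s", 1:6)))
OGS[2, 3] <- NA

# Keep only rows without any missing value
complete <- OGS[complete.cases(OGS), ]

result <- prcomp(complete, center = FALSE)

# In a prcomp object, the scores are in $x and the loadings are in $rotation
scores   <- result$x
loadings <- result$rotation
```

Note that `result$rotation` holds the loadings, so a PC1 vs PC2 scores plot would normally use `result$x[, 1]` and `result$x[, 2]`.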