RE: [R] How to extract x rows to get x pvalues using t.test
On Tue, 15 Mar 2005, Liaw, Andy wrote:

> From: Adaikalavan Ramasamy
>
> > You will need to _apply_ the t-test row by row.
> >
> >     apply( genes, 1, function(x) t.test( x[1:2], x[3:4] )$p.value )
> >
> > apply() is a C optimised version of for. Running the above code on a
> > dataset with 56000 rows and 4 columns took about 63 seconds on my
> > 1.6 GHz Pentium machine with 512 Mb RAM. See help(apply) for more details.
>
> That's not true. In R, there's a for loop hidden inside apply() (just look
> at the source). In S-PLUS, C level looping is done in some situations, and
> for others lapply() is used.

It's slightly more complicated than this. lapply() really is a C-level loop and apply() eventually calls it.

Now, whatever happens inside apply(), it is still true that t.test() has to be called 56,000 times, providing a lower bound on the time apply() can take. In this case I would be very surprised if apply() saved any time. What would save time is writing a stripped-down t-test function, especially as only the p-value is being used.

The real problem with apply() is that when the objects involved are large, it can be substantially slower because of greater memory use. As a concrete example, an apply() on a 1x757 set of replicate weights in the survey package used half as much memory when turned into a for() loop. As a result it ran several times faster on my laptop (where it was paging heavily) and slightly faster on my desktop (which has rather more memory).

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
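The "stripped-down t-test" idea above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name row_welch_p and its arguments are made up here, and it assumes the Welch (unequal variance) test that t.test() uses by default. It computes every row's p-value from row means and row variances, so t.test() is never called inside a loop.

```r
# Hypothetical helper (not from the thread): Welch two-sample t-test
# p-values for every row of a numeric matrix, without calling t.test().
# g1 and g2 are the column indices of the two groups.
row_welch_p <- function(m, g1, g2) {
  x <- m[, g1, drop = FALSE]
  y <- m[, g2, drop = FALSE]
  nx <- ncol(x)
  ny <- ncol(y)
  mx <- rowMeans(x)
  my <- rowMeans(y)
  # Squared standard error of each group mean, one value per row.
  # x - mx recycles mx down the columns, so each row is centred on its mean.
  vx <- rowSums((x - mx)^2) / (nx - 1) / nx
  vy <- rowSums((y - my)^2) / (ny - 1) / ny
  tstat <- (mx - my) / sqrt(vx + vy)
  # Welch-Satterthwaite degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
  2 * pt(-abs(tstat), df)
}

# The 5 x 4 example matrix from the original question
genes <- matrix(c(25, 72, 23, 55,
                  34, 53, 41, 33,
                  26, 43, 26, 44,
                  36, 64, 64, 22,
                  47, 72, 67, 34), ncol = 4, byrow = TRUE)

p_fast <- row_welch_p(genes, 1:2, 3:4)
p_ref  <- apply(genes, 1, function(x) t.test(x[1:2], x[3:4])$p.value)
all.equal(p_fast, p_ref)
```

Note that this sketch skips the checks t.test() performs (missing values, constant data), so rows with zero variance in both groups would need separate handling.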
RE: [R] How to extract x rows to get x pvalues using t.test
Thanks to everyone who posted on this topic. I tried apply() as Ramasamy had suggested and it took 40 seconds on my machine. The for loop, however, took over 4 minutes and I gave up. I am going to strip down the t.test function and rewrite it as suggested by Andy. I hope that will be the quickest. Once again, thank you to all who have posted on this.

Choudary Jagarlamudi
Instructor
Computer Science
Southwestern Oklahoma State University
STF 254
100 Campus Drive
Weatherford OK 73096
Tel 580-774-7136
RE: [R] How to extract x rows to get x pvalues using t.test
Hi,

If a stripped-down version of t.test is required for speed, then before implementing a rewrite I would check out the multtest Bioconductor package. It takes a fraction of a second to do 56000 t-tests on a 2.4 GHz PIV.

    > dim(X)
    [1] 56000     4
    > pcols <- 4
    > replicates <- rep(0:1, each=2)
    > unix.time({
    +     tscores <- mt.teststat(X, classlabel=replicates, test="t.equalvar")
    +     dfs <- pcols - 2
    +     pvalues <- 2*(1-pt(abs(tscores), df=dfs))
    + })
    [1] 0.21 0.05 0.26 0.00 0.00
    > length(pvalues)
    [1] 56000

Check out ?mt.teststat and its test argument options.

Marcus

Jagarlamudi, Choudary [EMAIL PROTECTED] 17/03/2005 7:16:13 a.m.:

> Thanks to everyone who posted on this topic. I tried apply() as Ramasamy
> had suggested and it took 40 seconds on my machine. The for loop however
> took over 4 minutes and i gave up. I am going to strip the t.test function
> and write it as suggested by Andy. Hope that will be the quickest. Once
> again thank you for all who have posted on this.
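For readers without multtest installed, the equal-variance calculation above can be approximated in base R. The following is a rough sketch under stated assumptions, not the multtest implementation: the helper name row_pooled_p is hypothetical, X here is simulated data standing in for the real matrix, and the labels are assumed to be 0/1 as in the replicates vector above. It uses 2 * pt(-abs(t), df), which is algebraically the same as 2 * (1 - pt(abs(t), df)) but avoids subtracting from 1 for very small p-values.

```r
# Hypothetical helper (not from the thread or from multtest): pooled-variance
# (equal variance) two-sample t-test p-values for every row of a matrix.
# labels is a 0/1 vector assigning each column to a group.
row_pooled_p <- function(m, labels) {
  x <- m[, labels == 0, drop = FALSE]
  y <- m[, labels == 1, drop = FALSE]
  nx <- ncol(x)
  ny <- ncol(y)
  df <- nx + ny - 2
  # Within-group sums of squares, one value per row
  ssx <- rowSums((x - rowMeans(x))^2)
  ssy <- rowSums((y - rowMeans(y))^2)
  pooled_var <- (ssx + ssy) / df
  tstat <- (rowMeans(x) - rowMeans(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
  2 * pt(-abs(tstat), df)
}

# Simulated stand-in for the 56000 x 4 data set (an assumption for illustration)
set.seed(1)
X <- matrix(rnorm(56000 * 4), ncol = 4)
replicates <- rep(0:1, each = 2)

pvalues <- row_pooled_p(X, replicates)
length(pvalues)

# Spot check against t.test() with var.equal = TRUE on the first 20 rows
ref <- apply(X[1:20, ], 1,
             function(r) t.test(r[1:2], r[3:4], var.equal = TRUE)$p.value)
all.equal(pvalues[1:20], ref)
```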
Re: [R] How to extract x rows to get x pvalues using t.test
You will need to _apply_ the t-test row by row.

    apply( genes, 1, function(x) t.test( x[1:2], x[3:4] )$p.value )

apply() is a C optimised version of for. Running the above code on a dataset with 56000 rows and 4 columns took about 63 seconds on my 1.6 GHz Pentium machine with 512 Mb RAM. See help(apply) for more details.

Regards,
Adai

On Tue, 2005-03-15 at 11:59 -0600, Jagarlamudi, Choudary wrote:

> Hi all,
>
> My data:
>
>     > genes
>          [,1] [,2] [,3] [,4]
>     [1,]   25   72   23   55
>     [2,]   34   53   41   33
>     [3,]   26   43   26   44
>     [4,]   36   64   64   22
>     [5,]   47   72   67   34
>
>     > stu <- t.test(genes[,1:2], genes[,3:4])
>     > stu$p.value
>     [1] 0.4198002
>
> I get 1 p-value for the entire col1:col2 vs col3:col4. I am trying to get
> 5 p-values for the 5 rows I have. I am trying to avoid a for loop because
> my actual data has 56000 rows and it's taking more than 4 minutes to
> compute.
>
> Thanks in advance.
>
> Choudary Jagarlamudi
> Instructor
> Southwestern Oklahoma State University
> STF 254
> 100 Campus Drive
> Weatherford OK 73096
> Tel 580-774-7136
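To make the difference between the two calls concrete, here is a small sketch using the 5 x 4 matrix from the question. It assumes nothing beyond base R. The original call returns a single p-value because t.test() treats each matrix argument as one flat vector of 10 values, whereas the apply() version runs one test per row.

```r
# The example data from the question
genes <- matrix(c(25, 72, 23, 55,
                  34, 53, 41, 33,
                  26, 43, 26, 44,
                  36, 64, 64, 22,
                  47, 72, 67, 34), ncol = 4, byrow = TRUE)

# A matrix argument is treated as one flat vector, so this compares the
# 10 values in columns 1:2 against the 10 values in columns 3:4.
pooled <- t.test(genes[, 1:2], genes[, 3:4])$p.value
flat   <- t.test(as.vector(genes[, 1:2]), as.vector(genes[, 3:4]))$p.value
all.equal(pooled, flat)

# One test per row: each call sees only that row's four values.
p_by_row <- apply(genes, 1, function(x) t.test(x[1:2], x[3:4])$p.value)
length(p_by_row)
```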
RE: [R] How to extract x rows to get x pvalues using t.test
> From: Adaikalavan Ramasamy
>
> You will need to _apply_ the t-test row by row.
>
>     apply( genes, 1, function(x) t.test( x[1:2], x[3:4] )$p.value )
>
> apply() is a C optimised version of for. Running the above code on a
> dataset with 56000 rows and 4 columns took about 63 seconds on my
> 1.6 GHz Pentium machine with 512 Mb RAM. See help(apply) for more details.

That's not true. In R, there's a for loop hidden inside apply() (just look at the source). In S-PLUS, C level looping is done in some situations, and for others lapply() is used.

Andy
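One way to check the point about apply() and for loops is to run both on the same data. The sketch below is an illustration only: the matrix m is simulated, the helper p_of_row is a made-up name, and the for loop writes into a preallocated result vector. Both approaches call t.test() once per row and should return the same p-values; the timings will depend on the machine and the R version, so none are quoted here.

```r
# Simulated data (an assumption for illustration, smaller than 56000 rows)
set.seed(42)
m <- matrix(rnorm(2000 * 4), ncol = 4)

# Hypothetical helper: p-value for one row, columns 1:2 vs columns 3:4
p_of_row <- function(x) t.test(x[1:2], x[3:4])$p.value

# Version 1: apply() over rows
p_apply <- apply(m, 1, p_of_row)

# Version 2: explicit for loop filling a preallocated vector
p_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  p_loop[i] <- p_of_row(m[i, ])
}

all.equal(p_apply, p_loop)

# Compare elapsed times on your own machine
system.time(apply(m, 1, p_of_row))
system.time(for (i in seq_len(nrow(m))) p_of_row(m[i, ]))
```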