[R] eliminate t() and %*% using crossprod() and solve(A,b)
you might want to see Douglas Bates. Least squares calculations in R. R News, 4(1):17-20, June 2004. http://CRAN.R-project.org/doc/Rnews/ he gives some rules of thumb, eg use solve(A,b) not solve(A) %*% b use crossprod(X) not t(X) %*% X use crossprod(X,y) not t(X) y Katharine Mullen Department of Physics and Astronomy Faculty of Sciences Vrije Universiteit de Boelelaan 1081 1081 HV Amsterdam The Netherlands room: T.1.06 tel: +31 205987870 fax: +31 205987992 e-mail: [EMAIL PROTECTED] http://www.nat.vu.nl/~kate/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] eliminate t() and %*% using crossprod() and solve(A,b)
Hi I have a square matrix Ainv of size N-by-N where N ~ 1000 I have a rectangular matrix H of size N by n where n ~ 4. I have a vector d of length N. I need X = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d and H %*% X. It is possible to rewrite X in the recommended crossprod way: X - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) where quad.form() is a little function that evaluates a quadratic form: quad.form - function (M, x){ jj - crossprod(M, x) return(drop(crossprod(jj, jj))) } QUESTION: how to calculate H %*% X in the recommended crossprod way? (I don't want to take a transpose because t() is expensive, and I know that %*% is slow). -- Robin Hankin Uncertainty Analyst National Oceanography Centre, Southampton European Way, Southampton SO14 3ZH, UK tel 023-8059-7743 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)
On Wed, 5 Oct 2005, Robin Hankin wrote: I have a square matrix Ainv of size N-by-N where N ~ 1000 I have a rectangular matrix H of size N by n where n ~ 4. I have a vector d of length N. I need X = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d and H %*% X. It is possible to rewrite X in the recommended crossprod way: X - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) where quad.form() is a little function that evaluates a quadratic form: quad.form - function (M, x){ jj - crossprod(M, x) return(drop(crossprod(jj, jj))) } That is not the same thing: it gives t(H) %*% Ainv %*% t(Ainv) %*% H . QUESTION: how to calculate H %*% X in the recommended crossprod way? (I don't want to take a transpose because t() is expensive, and I know that %*% is slow). Have you some data to support your claims? Here I find (for random matrices of the dimensions given on a machine with a fast BLAS) system.time(for(i in 1:100) t(H) %*% Ainv) [1] 2.19 0.01 2.21 0.00 0.00 system.time(for(i in 1:100) crossprod(H, Ainv)) [1] 1.33 0.00 1.33 0.00 0.00 so each is quite fast and the difference is not great. However, that is not comparing %*% with crossprod, but t %*% with crossprod. I get system.time(for(i in 1:1000) H %*% X) [1] 0.05 0.01 0.06 0.00 0.00 which is hardly 'slow' (60 us for %*%), especially compared to forming X in system.time({X = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d}) [1] 0.04 0.00 0.04 0.00 0.00 I would probably have written system.time({X - solve(crossprod(H, Ainv %*% H), crossprod(crossprod(Ainv, H), d))}) 1] 0.03 0.00 0.03 0.00 0.00 which is faster and does give the same answer. [BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.] -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)
On 5 Oct 2005, at 12:15, Dimitris Rizopoulos wrote: Hi Robin, I've been playing with your question, but I think these two lines are not equivalent: N - 1000 n - 4 Ainv - matrix(rnorm(N * N), N, N) H - matrix(rnorm(N * n), N, n) d - rnorm(N) quad.form - function (M, x){ jj - crossprod(M, x) return(drop(crossprod(jj, jj))) } X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) all.equal(X0, X1) # not TRUE which is the one you want to compute? These are not exactly equivalent, but: Ainv - matrix(rnorm(1e6),1e3,1e3) H - matrix(rnorm(4000),ncol=4) d - rnorm(1000) X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) X0 - X1 [,1] [1,] 4.884981e-15 [2,] 3.663736e-15 [3,] -5.107026e-15 [4,] 5.717649e-15 which is pretty close. On 5 Oct 2005, at 12:50, Prof Brian Ripley wrote: QUESTION: how to calculate H %*% X in the recommended crossprod way? (I don't want to take a transpose because t() is expensive, and I know that %*% is slow). Have you some data to support your claims? Here I find (for random matrices of the dimensions given on a machine with a fast BLAS) I couldn't supply any performance data because I couldn't figure out the correct R commands to calculate H %*% X without using %*% or t()! I was just wondering if there were a way to calculate H %*% solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d without using t() or %*%. And there doesn't seem to be (my original question didn't make it clear that I don't have X precalculated). My take-home lesson from Brian Ripley is that H %*% X is fast --but this is only useful to me if one has X precalculated, and in general I don't. But this discussion suggests to me that it might be a good idea to change my routines and calculate X anyway. thanks again Prof Ripley and Dimitris Rizopoulos very best wishes Robin system.time(for(i in 1:100) t(H) %*% Ainv) [1] 2.19 0.01 2.21 0.00 0.00 system.time(for(i in 1:100) crossprod(H, Ainv)) [1] 1.33 0.00 1.33 0.00 0.00 so each is quite fast and the difference is not great. However, that is not comparing %*% with crossprod, but t %*% with crossprod. I get system.time(for(i in 1:1000) H %*% X) [1] 0.05 0.01 0.06 0.00 0.00 which is hardly 'slow' (60 us for %*%), especially compared to forming X in system.time({X = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d}) [1] 0.04 0.00 0.04 0.00 0.00 I would probably have written system.time({X - solve(crossprod(H, Ainv %*% H), crossprod (crossprod(Ainv, H), d))}) 1] 0.03 0.00 0.03 0.00 0.00 which is faster and does give the same answer. [BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.] -- -- Robin Hankin Uncertainty Analyst National Oceanography Centre, Southampton European Way, Southampton SO14 3ZH, UK tel 023-8059-7743 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)
an alternative is to use X2 below, which seems to be a little bit faster: N - 1000 n - 4 Ainv - matrix(rnorm(N * N), N, N) H - matrix(rnorm(N * n), N, n) d - rnorm(N) quad.form - function (M, x){ jj - crossprod(M, x) return(drop(crossprod(jj, jj))) } ### ### invisible({gc(); gc(); gc()}) system.time(for(i in 1:200){ X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d }, gcFirst = TRUE) invisible({gc(); gc(); gc()}) system.time(for(i in 1:200){ X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) }, gcFirst = TRUE) invisible({gc(); gc(); gc()}) system.time(for(i in 1:200){ tH.Ainv - crossprod(H, Ainv) X2 - solve(tH.Ainv %*% H) %*% colSums(t(tH.Ainv) * d) }, gcFirst = TRUE) all.equal(X0, X1) all.equal(X0, X2) I hope this helps. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://www.med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: Robin Hankin [EMAIL PROTECTED] To: Prof Brian Ripley [EMAIL PROTECTED] Cc: RHelp r-help@stat.math.ethz.ch; Robin Hankin [EMAIL PROTECTED] Sent: Wednesday, October 05, 2005 3:08 PM Subject: Re: [R] eliminate t() and %*% using crossprod() and solve(A,b) On 5 Oct 2005, at 12:15, Dimitris Rizopoulos wrote: Hi Robin, I've been playing with your question, but I think these two lines are not equivalent: N - 1000 n - 4 Ainv - matrix(rnorm(N * N), N, N) H - matrix(rnorm(N * n), N, n) d - rnorm(N) quad.form - function (M, x){ jj - crossprod(M, x) return(drop(crossprod(jj, jj))) } X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) all.equal(X0, X1) # not TRUE which is the one you want to compute? These are not exactly equivalent, but: Ainv - matrix(rnorm(1e6),1e3,1e3) H - matrix(rnorm(4000),ncol=4) d - rnorm(1000) X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d)) X0 - X1 [,1] [1,] 4.884981e-15 [2,] 3.663736e-15 [3,] -5.107026e-15 [4,] 5.717649e-15 which is pretty close. On 5 Oct 2005, at 12:50, Prof Brian Ripley wrote: QUESTION: how to calculate H %*% X in the recommended crossprod way? (I don't want to take a transpose because t() is expensive, and I know that %*% is slow). Have you some data to support your claims? Here I find (for random matrices of the dimensions given on a machine with a fast BLAS) I couldn't supply any performance data because I couldn't figure out the correct R commands to calculate H %*% X without using %*% or t()! I was just wondering if there were a way to calculate H %*% solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d without using t() or %*%. And there doesn't seem to be (my original question didn't make it clear that I don't have X precalculated). My take-home lesson from Brian Ripley is that H %*% X is fast --but this is only useful to me if one has X precalculated, and in general I don't. But this discussion suggests to me that it might be a good idea to change my routines and calculate X anyway. thanks again Prof Ripley and Dimitris Rizopoulos very best wishes Robin system.time(for(i in 1:100) t(H) %*% Ainv) [1] 2.19 0.01 2.21 0.00 0.00 system.time(for(i in 1:100) crossprod(H, Ainv)) [1] 1.33 0.00 1.33 0.00 0.00 so each is quite fast and the difference is not great. However, that is not comparing %*% with crossprod, but t %*% with crossprod. I get system.time(for(i in 1:1000) H %*% X) [1] 0.05 0.01 0.06 0.00 0.00 which is hardly 'slow' (60 us for %*%), especially compared to forming X in system.time({X = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d}) [1] 0.04 0.00 0.04 0.00 0.00 I would probably have written system.time({X - solve(crossprod(H, Ainv %*% H), crossprod (crossprod(Ainv, H), d))}) 1] 0.03 0.00 0.03 0.00 0.00 which is faster and does give the same answer. [BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.] -- -- Robin Hankin Uncertainty Analyst National Oceanography Centre, Southampton European Way, Southampton SO14 3ZH, UK tel 023-8059-7743 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html