[R] eliminate t() and %*% using crossprod() and solve(A,b)

2005-10-06 Thread Katharine Mullen
you might want to see

Douglas Bates. Least squares calculations in R. R News,
4(1):17-20, June 2004.   http://CRAN.R-project.org/doc/Rnews/

he gives some rules of thumb, eg

use solve(A,b) not solve(A) %*% b
use crossprod(X) not t(X) %*% X
use crossprod(X,y) not t(X) y


Katharine Mullen
Department of Physics and Astronomy
Faculty of Sciences
Vrije Universiteit
de Boelelaan 1081
1081 HV Amsterdam
The Netherlands
room: T.1.06
tel: +31 205987870
fax: +31 205987992
e-mail: [EMAIL PROTECTED]
http://www.nat.vu.nl/~kate/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] eliminate t() and %*% using crossprod() and solve(A,b)

2005-10-05 Thread Robin Hankin
Hi

I have a square matrix Ainv of size N-by-N where N ~  1000
I have a rectangular matrix H of size N by n where n ~ 4.
I have a vector d of length N.

I need   X  = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d

and

H %*% X.

It is possible to rewrite X in the recommended crossprod way:

X -  solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))

where quad.form() is a little function that evaluates a quadratic form:

  quad.form - function (M, x){
 jj - crossprod(M, x)
 return(drop(crossprod(jj, jj)))
}


QUESTION:

how  to calculate

H %*% X

in the recommended crossprod way?  (I don't want to take a transpose
because t() is expensive, and I know that %*% is slow).





--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)

2005-10-05 Thread Prof Brian Ripley
On Wed, 5 Oct 2005, Robin Hankin wrote:

 I have a square matrix Ainv of size N-by-N where N ~  1000
 I have a rectangular matrix H of size N by n where n ~ 4.
 I have a vector d of length N.

 I need   X  = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d

 and

 H %*% X.

 It is possible to rewrite X in the recommended crossprod way:

 X -  solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))

 where quad.form() is a little function that evaluates a quadratic form:

  quad.form - function (M, x){
 jj - crossprod(M, x)
 return(drop(crossprod(jj, jj)))
 }

That is not the same thing: it gives t(H) %*% Ainv %*% t(Ainv) %*% H .

 QUESTION:

 how  to calculate

 H %*% X

 in the recommended crossprod way?  (I don't want to take a transpose
 because t() is expensive, and I know that %*% is slow).

Have you some data to support your claims?  Here I find (for random 
matrices of the dimensions given on a machine with a fast BLAS)

 system.time(for(i in 1:100) t(H) %*% Ainv)
[1] 2.19 0.01 2.21 0.00 0.00
 system.time(for(i in 1:100) crossprod(H, Ainv))
[1] 1.33 0.00 1.33 0.00 0.00

so each is quite fast and the difference is not great.  However, that is 
not comparing %*% with crossprod, but t  %*% with crossprod.

I get

 system.time(for(i in 1:1000) H %*% X)
[1] 0.05 0.01 0.06 0.00 0.00

which is hardly 'slow' (60 us for %*%), especially compared to forming X 
in

 system.time({X  = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d})
[1] 0.04 0.00 0.04 0.00 0.00

I would probably have written

 system.time({X - solve(crossprod(H, Ainv %*% H), crossprod(crossprod(Ainv, 
 H), d))})
1] 0.03 0.00 0.03 0.00 0.00

which is faster and does give the same answer.

[BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)

2005-10-05 Thread Robin Hankin

On 5 Oct 2005, at 12:15, Dimitris Rizopoulos wrote:

 Hi Robin,

 I've been playing with your question, but I think these two lines  
 are not equivalent:

 N - 1000
 n - 4
 Ainv - matrix(rnorm(N * N), N, N)
 H - matrix(rnorm(N * n), N, n)
 d - rnorm(N)
 quad.form - function (M, x){
 jj - crossprod(M, x)
 return(drop(crossprod(jj, jj)))
 }


 X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d
 X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))
 all.equal(X0, X1) # not TRUE


 which is the one you want to compute?



These are not exactly equivalent, but:


Ainv - matrix(rnorm(1e6),1e3,1e3)
H - matrix(rnorm(4000),ncol=4)
d - rnorm(1000)

X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d
X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))
X0 - X1
   [,1]
[1,]  4.884981e-15
[2,]  3.663736e-15
[3,] -5.107026e-15
[4,]  5.717649e-15

which is pretty close.



On 5 Oct 2005, at 12:50, Prof Brian Ripley wrote:

 QUESTION:

 how  to calculate

 H %*% X

 in the recommended crossprod way?  (I don't want to take a transpose
 because t() is expensive, and I know that %*% is slow).


 Have you some data to support your claims?  Here I find (for random
 matrices of the dimensions given on a machine with a fast BLAS)



I couldn't supply any performance data because I couldn't figure out the
correct R commands to calculate H %*% X  without using %*% or t()!

I was just wondering if there were a way to calculate

H %*% solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d

without using t() or %*%.  And there doesn't seem to be (my original
question didn't make it clear that I don't have X precalculated).

My take-home lesson from Brian Ripley is that H %*% X is fast
--but this is only useful to me if one has X precalculated, and in
general I don't.   But this discussion suggests to me that it might be
a good idea to change my routines and calculate X anyway.

thanks again Prof Ripley and Dimitris Rizopoulos


very best wishes

Robin



 system.time(for(i in 1:100) t(H) %*% Ainv)

 [1] 2.19 0.01 2.21 0.00 0.00

 system.time(for(i in 1:100) crossprod(H, Ainv))

 [1] 1.33 0.00 1.33 0.00 0.00

 so each is quite fast and the difference is not great.  However,  
 that is
 not comparing %*% with crossprod, but t  %*% with crossprod.

 I get


 system.time(for(i in 1:1000) H %*% X)

 [1] 0.05 0.01 0.06 0.00 0.00

 which is hardly 'slow' (60 us for %*%), especially compared to  
 forming X
 in


 system.time({X  = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*%  
 d})

 [1] 0.04 0.00 0.04 0.00 0.00

 I would probably have written


 system.time({X - solve(crossprod(H, Ainv %*% H), crossprod 
 (crossprod(Ainv, H), d))})

 1] 0.03 0.00 0.03 0.00 0.00

 which is faster and does give the same answer.

 [BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.]

 -- 





--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] eliminate t() and %*% using crossprod() and solve(A,b)

2005-10-05 Thread Dimitris Rizopoulos
an alternative is to use X2 below, which seems to be a little bit 
faster:

N - 1000
n - 4
Ainv - matrix(rnorm(N * N), N, N)
H - matrix(rnorm(N * n), N, n)
d - rnorm(N)
quad.form - function (M, x){
 jj - crossprod(M, x)
 return(drop(crossprod(jj, jj)))
}

###
###

invisible({gc(); gc(); gc()})
system.time(for(i in 1:200){
X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d
}, gcFirst = TRUE)


invisible({gc(); gc(); gc()})
system.time(for(i in 1:200){
X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))
}, gcFirst = TRUE)


invisible({gc(); gc(); gc()})
system.time(for(i in 1:200){
tH.Ainv - crossprod(H, Ainv)
X2 -  solve(tH.Ainv %*% H) %*% colSums(t(tH.Ainv) * d)
}, gcFirst = TRUE)


all.equal(X0, X1)
all.equal(X0, X2)


I hope this helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Robin Hankin [EMAIL PROTECTED]
To: Prof Brian Ripley [EMAIL PROTECTED]
Cc: RHelp r-help@stat.math.ethz.ch; Robin Hankin 
[EMAIL PROTECTED]
Sent: Wednesday, October 05, 2005 3:08 PM
Subject: Re: [R] eliminate t() and %*% using crossprod() and 
solve(A,b)



 On 5 Oct 2005, at 12:15, Dimitris Rizopoulos wrote:

 Hi Robin,

 I've been playing with your question, but I think these two lines
 are not equivalent:

 N - 1000
 n - 4
 Ainv - matrix(rnorm(N * N), N, N)
 H - matrix(rnorm(N * n), N, n)
 d - rnorm(N)
 quad.form - function (M, x){
 jj - crossprod(M, x)
 return(drop(crossprod(jj, jj)))
 }


 X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d
 X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))
 all.equal(X0, X1) # not TRUE


 which is the one you want to compute?



 These are not exactly equivalent, but:


 Ainv - matrix(rnorm(1e6),1e3,1e3)
 H - matrix(rnorm(4000),ncol=4)
 d - rnorm(1000)

 X0 - solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d
 X1 - solve(quad.form(Ainv, H), crossprod(crossprod(Ainv, H), d))
 X0 - X1
   [,1]
 [1,]  4.884981e-15
 [2,]  3.663736e-15
 [3,] -5.107026e-15
 [4,]  5.717649e-15

 which is pretty close.



 On 5 Oct 2005, at 12:50, Prof Brian Ripley wrote:

 QUESTION:

 how  to calculate

 H %*% X

 in the recommended crossprod way?  (I don't want to take a 
 transpose
 because t() is expensive, and I know that %*% is slow).


 Have you some data to support your claims?  Here I find (for random
 matrices of the dimensions given on a machine with a fast BLAS)



 I couldn't supply any performance data because I couldn't figure out 
 the
 correct R commands to calculate H %*% X  without using %*% or t()!

 I was just wondering if there were a way to calculate

 H %*% solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*% d

 without using t() or %*%.  And there doesn't seem to be (my original
 question didn't make it clear that I don't have X precalculated).

 My take-home lesson from Brian Ripley is that H %*% X is fast
 --but this is only useful to me if one has X precalculated, and in
 general I don't.   But this discussion suggests to me that it might 
 be
 a good idea to change my routines and calculate X anyway.

 thanks again Prof Ripley and Dimitris Rizopoulos


 very best wishes

 Robin



 system.time(for(i in 1:100) t(H) %*% Ainv)

 [1] 2.19 0.01 2.21 0.00 0.00

 system.time(for(i in 1:100) crossprod(H, Ainv))

 [1] 1.33 0.00 1.33 0.00 0.00

 so each is quite fast and the difference is not great.  However,
 that is
 not comparing %*% with crossprod, but t  %*% with crossprod.

 I get


 system.time(for(i in 1:1000) H %*% X)

 [1] 0.05 0.01 0.06 0.00 0.00

 which is hardly 'slow' (60 us for %*%), especially compared to
 forming X
 in


 system.time({X  = solve(t(H) %*% Ainv %*% H) %*% t(H) %*% Ainv %*%
 d})

 [1] 0.04 0.00 0.04 0.00 0.00

 I would probably have written


 system.time({X - solve(crossprod(H, Ainv %*% H), crossprod
 (crossprod(Ainv, H), d))})

 1] 0.03 0.00 0.03 0.00 0.00

 which is faster and does give the same answer.

 [BTW, I used 2.2.0-beta which defaults to gcFirst=TRUE.]

 -- 





 --
 Robin Hankin
 Uncertainty Analyst
 National Oceanography Centre, Southampton
 European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html