Re: [R] Matrix oriented computing

2005-08-26 Thread Achim Zeileis
On Fri, 26 Aug 2005 14:44:10 +0200 Sigbert Klinke wrote:

 Hi,
 
 I want to compute the quantiles of Chi^2 distributions with different 
 degrees of freedom like
 
 x-cbind(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99,
 0.995) df-rbind(1:100)
 m-qchisq(x,df)
 
 and hoped to get back  a  length(df) times length(x)  matrix with the 
 quantiles. Since this does not work, I use
 
 x-c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99,
 0.995) df-c(1:100)
 m-qchisq(x,df[1])
 for (i in 2:length(df)) {
   m-rbind(m,qchisq(x,df[i]))
 }
 dim(m)-c(length(df),length(x))
 
 Is there a way to avoid the for loop ?

You could use sapply():
  m - sapply(x, function(x) qchisq(x, df))
hth,
Z

 Thanks Sigbert
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Marc Schwartz
On Fri, 2005-08-26 at 14:44 +0200, Sigbert Klinke wrote:
 Hi,
 
 I want to compute the quantiles of Chi^2 distributions with different 
 degrees of freedom like
 
 x-cbind(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
 df-rbind(1:100)
 m-qchisq(x,df)
 
 and hoped to get back  a  length(df) times length(x)  matrix with the 
 quantiles. Since this does not work, I use
 
 x-c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
 df-c(1:100)
 m-qchisq(x,df[1])
 for (i in 2:length(df)) {
   m-rbind(m,qchisq(x,df[i]))
 }
 dim(m)-c(length(df),length(x))
 
 Is there a way to avoid the for loop ?
 
 Thanks Sigbert

See ?sapply

x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 
   0.95, 0.975, 0.99, 0.995)

df - c(1:100)

mat - sapply(x, qchisq, df)

 dim(mat)
[1] 100  11
 
 str(mat)
 num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...


HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Patrick Burns
I believe that the following is what you want:

x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
dof - 1:100
ans - outer(x, dof, qchisq)
dimnames(ans) - list(x, dof)


Note that 'df' is not a very auspicious name for an object since
it is the name of a function. 

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Sigbert Klinke wrote:

Hi,

I want to compute the quantiles of Chi^2 distributions with different 
degrees of freedom like

x-cbind(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df-rbind(1:100)
m-qchisq(x,df)

and hoped to get back  a  length(df) times length(x)  matrix with the 
quantiles. Since this does not work, I use

x-c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df-c(1:100)
m-qchisq(x,df[1])
for (i in 2:length(df)) {
  m-rbind(m,qchisq(x,df[i]))
}
dim(m)-c(length(df),length(x))

Is there a way to avoid the for loop ?

Thanks Sigbert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Peter Dalgaard
Marc Schwartz [EMAIL PROTECTED] writes:

 x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 
0.95, 0.975, 0.99, 0.995)
 
 df - c(1:100)
 
 mat - sapply(x, qchisq, df)

  dim(mat)
 [1] 100  11
  
  str(mat)
  num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...

outer() is perhaps a more natural first try... It does give the
transpose of the sapply approach, though. 

round(t(outer(x,df,qchisq)),2)

should be close. You should likely add dimnames.


-- 
   O__   Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Marc Schwartz (via MN)
On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
 Marc Schwartz [EMAIL PROTECTED] writes:
 
  x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 
 0.95, 0.975, 0.99, 0.995)
  
  df - c(1:100)
  
  mat - sapply(x, qchisq, df)
 
   dim(mat)
  [1] 100  11
   
   str(mat)
   num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...
 
 outer() is perhaps a more natural first try... It does give the
 transpose of the sapply approach, though. 
 
 round(t(outer(x,df,qchisq)),2)
 
 should be close. You should likely add dimnames.



What I find interesting, is that I would have intuitively expected
outer() to be faster than sapply().  However:


  system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00
 
  system.time(mat1 - round(t(outer(x, df, qchisq)), 2), 
   gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00
 
# No round() or t() to test for overhead
  system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.01 0.00 0.02 0.00 0.00


# Bear in mind the round() on mat1 above
 all.equal(mat, mat1)
[1] Mean relative  difference: 4.905485e-05

 all.equal(mat, t(mat2))
[1] TRUE


Even when increasing the size of 'df' to 1:1000:


  system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.16 0.01 0.16 0.00 0.00

  system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst =
TRUE)
[1] 0.16 0.00 0.18 0.00 0.00
 
  # No round() or t() to test for overhead
  system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.16 0.01 0.17 0.00 0.00



It also seems that, at least in this case, t() and round() do not add
much overhead.

Best regards,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Prof Brian Ripley
Try profiling.  Doing this many times to get an overview, e.g. for sapply 
with df=1:1000:

%   self%   total
  self secondstotalsecondsname
  98.26  6.78 98.26  6.78 FUN
   0.58  0.04  0.58  0.04 unlist
   0.29  0.02  0.87  0.06 as.vector
   0.29  0.02  0.58  0.04 names-
   0.29  0.02  0.29  0.02 names-.default
   0.29  0.02  0.29  0.02 names

so almost all the time is in qchisq.


On Fri, 26 Aug 2005, Marc Schwartz (via MN) wrote:

 On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
 Marc Schwartz [EMAIL PROTECTED] writes:

 x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
0.95, 0.975, 0.99, 0.995)

 df - c(1:100)

 mat - sapply(x, qchisq, df)

 dim(mat)
 [1] 100  11

 str(mat)
  num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...

 outer() is perhaps a more natural first try... It does give the
 transpose of the sapply approach, though.

 round(t(outer(x,df,qchisq)),2)

 should be close. You should likely add dimnames.



 What I find interesting, is that I would have intuitively expected
 outer() to be faster than sapply().  However:


  system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00

  system.time(mat1 - round(t(outer(x, df, qchisq)), 2),
   gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00

 # No round() or t() to test for overhead
  system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.01 0.00 0.02 0.00 0.00


 # Bear in mind the round() on mat1 above
 all.equal(mat, mat1)
 [1] Mean relative  difference: 4.905485e-05

 all.equal(mat, t(mat2))
 [1] TRUE


 Even when increasing the size of 'df' to 1:1000:


  system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.16 0.01 0.16 0.00 0.00

  system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst =
 TRUE)
 [1] 0.16 0.00 0.18 0.00 0.00

  # No round() or t() to test for overhead
  system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.16 0.01 0.17 0.00 0.00



 It also seems that, at least in this case, t() and round() do not add
 much overhead.

Definitely not for such small matrices.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread John Fox
Dear Mark,

For that matter, the loop isn't a whole a slower (on my 3GHz Win XP system):

 x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
 df - 1:1000
 system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.08 0.00 0.08   NA   NA
 
 mat - matrix(0, 1000, 11)
 system.time(for (i in 1:length(df)) mat[i,] - qchisq(x, df[i]))
[1] 0.09 0.00 0.10   NA   NA
 

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Marc 
 Schwartz (via MN)
 Sent: Friday, August 26, 2005 10:26 AM
 To: Peter Dalgaard
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Matrix oriented computing
 
 On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
  Marc Schwartz [EMAIL PROTECTED] writes:
  
   x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 
  0.95, 0.975, 0.99, 0.995)
   
   df - c(1:100)
   
   mat - sapply(x, qchisq, df)
  
dim(mat)
   [1] 100  11

str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 
 4.12e-01 ...
  
  outer() is perhaps a more natural first try... It does give the 
  transpose of the sapply approach, though.
  
  round(t(outer(x,df,qchisq)),2)
  
  should be close. You should likely add dimnames.
 
 
 
 What I find interesting, is that I would have intuitively expected
 outer() to be faster than sapply().  However:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
  
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2),
gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
  
 # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.01 0.00 0.02 0.00 0.00
 
 
 # Bear in mind the round() on mat1 above
  all.equal(mat, mat1)
 [1] Mean relative  difference: 4.905485e-05
 
  all.equal(mat, t(mat2))
 [1] TRUE
 
 
 Even when increasing the size of 'df' to 1:1000:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.16 0.01 0.16 0.00 0.00
 
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst =
 TRUE)
 [1] 0.16 0.00 0.18 0.00 0.00
  
   # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.16 0.01 0.17 0.00 0.00
 
 
 
 It also seems that, at least in this case, t() and round() do 
 not add much overhead.
 
 Best regards,
 
 Marc
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Marc Schwartz (via MN)
Prof. Ripley,

Excellent point. Neither sapply() nor outer() are the elephant in the
room in this situation.



On Fri, 2005-08-26 at 16:55 +0100, Prof Brian Ripley wrote:
 Try profiling.  Doing this many times to get an overview, e.g. for sapply 
 with df=1:1000:
 
 %   self%   total
   self secondstotalsecondsname
   98.26  6.78 98.26  6.78 FUN
0.58  0.04  0.58  0.04 unlist
0.29  0.02  0.87  0.06 as.vector
0.29  0.02  0.58  0.04 names-
0.29  0.02  0.29  0.02 names-.default
0.29  0.02  0.29  0.02 names
 
 so almost all the time is in qchisq.
 
 
 On Fri, 26 Aug 2005, Marc Schwartz (via MN) wrote:
 
  On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
  Marc Schwartz [EMAIL PROTECTED] writes:
 
  x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
 0.95, 0.975, 0.99, 0.995)
 
  df - c(1:100)
 
  mat - sapply(x, qchisq, df)
 
  dim(mat)
  [1] 100  11
 
  str(mat)
   num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...
 
  outer() is perhaps a more natural first try... It does give the
  transpose of the sapply approach, though.
 
  round(t(outer(x,df,qchisq)),2)
 
  should be close. You should likely add dimnames.
 
 
 
  What I find interesting, is that I would have intuitively expected
  outer() to be faster than sapply().  However:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
  [1] 0.01 0.00 0.01 0.00 0.00
 
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2),
gcFirst = TRUE)
  [1] 0.01 0.00 0.01 0.00 0.00
 
  # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
  [1] 0.01 0.00 0.02 0.00 0.00
 
 
  # Bear in mind the round() on mat1 above
  all.equal(mat, mat1)
  [1] Mean relative  difference: 4.905485e-05
 
  all.equal(mat, t(mat2))
  [1] TRUE
 
 
  Even when increasing the size of 'df' to 1:1000:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
  [1] 0.16 0.01 0.16 0.00 0.00
 
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst =
  TRUE)
  [1] 0.16 0.00 0.18 0.00 0.00
 
   # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
  [1] 0.16 0.01 0.17 0.00 0.00
 
 
 
  It also seems that, at least in this case, t() and round() do not add
  much overhead.
 
 Definitely not for such small matrices.


True and both are C functions, which of course helps as well.

Best regards,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread Gabor Grothendieck
On 8/26/05, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote:
 On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
  Marc Schwartz [EMAIL PROTECTED] writes:
 
   x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
  0.95, 0.975, 0.99, 0.995)
  
   df - c(1:100)
  
   mat - sapply(x, qchisq, df)
  
dim(mat)
   [1] 100  11
  
str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...
 
  outer() is perhaps a more natural first try... It does give the
  transpose of the sapply approach, though.
 
  round(t(outer(x,df,qchisq)),2)
 
  should be close. You should likely add dimnames.
 
 
 
 What I find interesting, is that I would have intuitively expected
 outer() to be faster than sapply().  However:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
 
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2),
   gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
 
 # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.01 0.00 0.02 0.00 0.00
 
 
 # Bear in mind the round() on mat1 above
  all.equal(mat, mat1)
 [1] Mean relative  difference: 4.905485e-05
 
  all.equal(mat, t(mat2))
 [1] TRUE
 
 
 Even when increasing the size of 'df' to 1:1000:
 
 
   system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.16 0.01 0.16 0.00 0.00
 
   system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst =
 TRUE)
 [1] 0.16 0.00 0.18 0.00 0.00
 
   # No round() or t() to test for overhead
   system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.16 0.01 0.17 0.00 0.00
 
 
 
 It also seems that, at least in this case, t() and round() do not add
 much overhead.
 

You might need to do it repeatedly to get a more reliable reading.
When I do it 1000 times outer is faster than sapply though not by much:

 n - 1000
 system.time(for (i in 1:n) mat - sapply(x, qchisq, df), gcFirst = TRUE)
[1] 14.05  0.00 14.43NANA
 
 system.time(for(i in 1:n) mat2 - outer(x, df, qchisq), gcFirst = TRUE)
[1] 13.42  0.00 13.85NANA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html