[R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread hihi
Dear Members,

I work on a simulaton experiment but it has an bottleneck. It's quite fast 
because of R and vectorizing, but it has a very slow for loop. The adjacent 
element of a vector (in terms of index number) depends conditionally on the 
former value of itself. Like a simple cumulating function (eg. cumsum) but with 
condition. Let's show me an example:
a_vec = rnorm(100)
b_vec = rep(0, 100)
b_vec[1]=a_vec[1]
for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, a_vec[i], 
b_vec[i-1]+a_vec[i])}
print(b_vec)

(The behaviour is like cumsum's, but when the value would excess 1.0 then it 
has another value from a_vec.)
Is it possible to make this faster? I experienced that my way is even slower 
than in Excel! Programming in C would my last try...
Any suggestions?

Than you,
Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread R. Michael Weylandt michael.weyla...@gmail.com
I don't immediately see a good trick for vectorization so this seems to me to 
be a good candidate for work in a lower-level language. Staying within R, I'd 
suggest you use if and else rather than ifelse() since your computation isn't 
vectorized: this will eliminate a small amount over overhead. Since you also 
always add a_vec, you could also define b_vec as a copy of a to avoid all those 
calls to subset a, but I don't think the effects will be large and the code 
might not be as clear. 

You indicated that you may be comfortable with writing C, but I'd suggest you 
look into the Rcpp/Inline package pair which make the whole process much easier 
than it would otherwise be.

 I'm not at a computer write now or I'd write a fuller example, but the 
documentation for those packages is uncommonly good an you should be able to 
easily get it down into C++. If you aren't able to get it by tomorrow, let me 
know and I can help troubleshoot. The only things I foresee that you'll need to 
change are zero-basing, C's loop syntax, and (I think) the call to abs(). (I 
always forget where abs() lives in c++ )

The only possible hold up is that you need to be at a computer with a C compiler

Hope this helps,

Michael 

On Nov 3, 2011, at 3:10 PM, hihi v.p.m...@freemail.hu wrote:

 Dear Members,
 
 I work on a simulaton experiment but it has an bottleneck. It's quite fast 
 because of R and vectorizing, but it has a very slow for loop. The adjacent 
 element of a vector (in terms of index number) depends conditionally on the 
 former value of itself. Like a simple cumulating function (eg. cumsum) but 
 with condition. Let's show me an example:
 a_vec = rnorm(100)
 b_vec = rep(0, 100)
 b_vec[1]=a_vec[1]
 for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, a_vec[i], 
 b_vec[i-1]+a_vec[i])}
 print(b_vec)
 
 (The behaviour is like cumsum's, but when the value would excess 1.0 then it 
 has another value from a_vec.)
 Is it possible to make this faster? I experienced that my way is even slower 
 than in Excel! Programming in C would my last try...
 Any suggestions?
 
 Than you,
 Peter
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread hihi
Thank you, I will try as soon as possible...
Regards,
Peter

2011/11/3 Mark Leeds marklee...@gmail.com

 hi hihi: you're not using the ifelse construct correctly because it's
 already vectorized
 so there's no need to use a loop.

 check if below works AND if it's fast enough because I didn't check either
 one. Also,
 i bet someone else can send something better so I would wait anyway. good
 luck.

 set.seed(1)
 avec = rnorm(100)
 bvec = rep(0, 100)
 bvec[1]=avec[1]

 bveclagged - c(999,head(bvec,-1))
 bvec - ifelse((abs(bveclagged + avec)  1), avec, bveclagged + avec)





 On Thu, Nov 3, 2011 at 7:10 PM, hihi v.p.m...@freemail.hu wrote:

 Dear Members,

 I work on a simulaton experiment but it has an bottleneck. It's quite
 fast because of R and vectorizing, but it has a very slow for loop. The
 adjacent element of a vector (in terms of index number) depends
 conditionally on the former value of itself. Like a simple cumulating
 function (eg. cumsum) but with condition. Let's show me an example:
 a_vec = rnorm(100)
 b_vec = rep(0, 100)
 b_vec[1]=a_vec[1]
 for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, a_vec[i],
 b_vec[i-1]+a_vec[i])}
 print(b_vec)

 (The behaviour is like cumsum's, but when the value would excess 1.0 then
 it has another value from a_vec.)
 Is it possible to make this faster? I experienced that my way is even
 slower than in Excel! Programming in C would my last try...
 Any suggestions?

 Than you,
 Peter

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread Dennis Murphy
Hi:

You're doing the right thing in R by pre-allocating memory for the
result, but ifelse() is a vectorized function and your loop is
operating elementwise, so if-else is more appropriate. Try

for (i in 2:100){
b_vec[i] - if(abs(b_vec[i-1] + a_vec[i])  1) a_vec[i] else
b_vec[i-1] + a_vec[i]
  }

If speed is an issue, then I echo Michael's suggestion to write a
C(++) function and call it within R. The inline package is good for
this kind of thing.

HTH,
Dennis


On Thu, Nov 3, 2011 at 12:10 PM, hihi v.p.m...@freemail.hu wrote:
 Dear Members,

 I work on a simulaton experiment but it has an bottleneck. It's quite fast 
 because of R and vectorizing, but it has a very slow for loop. The adjacent 
 element of a vector (in terms of index number) depends conditionally on the 
 former value of itself. Like a simple cumulating function (eg. cumsum) but 
 with condition. Let's show me an example:
 a_vec = rnorm(100)
 b_vec = rep(0, 100)
 b_vec[1]=a_vec[1]
 for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, a_vec[i], 
 b_vec[i-1]+a_vec[i])}
 print(b_vec)

 (The behaviour is like cumsum's, but when the value would excess 1.0 then it 
 has another value from a_vec.)
 Is it possible to make this faster? I experienced that my way is even slower 
 than in Excel! Programming in C would my last try...
 Any suggestions?

 Than you,
 Peter

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread R. Michael Weylandt
Yes -- if  else is much faster than ifelse() because if is a
primitive while ifelse() is a whole function call (in fact, you can
see the code by typing ifelse into the prompt and see that it has two
if calls within it.

Michael

On Thu, Nov 3, 2011 at 4:38 PM, hihi v.p.m...@freemail.hu wrote:
 Hi,
 thank you for your very immediate response. :-) Is if than and else faster
 than ifelse? I'm wondering (or not knowing something)
 Best regards,
 Peter
 2011/11/3 R. Michael Weylandt michael.weyla...@gmail.com
 michael.weyla...@gmail.com

 I don't immediately see a good trick for vectorization so this seems to me
 to be a good candidate for work in a lower-level language. Staying within R,
 I'd suggest you use if and else rather than ifelse() since your computation
 isn't vectorized: this will eliminate a small amount over overhead. Since
 you also always add a_vec, you could also define b_vec as a copy of a to
 avoid all those calls to subset a, but I don't think the effects will be
 large and the code might not be as clear.

 You indicated that you may be comfortable with writing C, but I'd suggest
 you look into the Rcpp/Inline package pair which make the whole process much
 easier than it would otherwise be.

  I'm not at a computer write now or I'd write a fuller example, but the
 documentation for those packages is uncommonly good an you should be able to
 easily get it down into C++. If you aren't able to get it by tomorrow, let
 me know and I can help troubleshoot. The only things I foresee that you'll
 need to change are zero-basing, C's loop syntax, and (I think) the call to
 abs(). (I always forget where abs() lives in c++ )

 The only possible hold up is that you need to be at a computer with a C
 compiler

 Hope this helps,

 Michael

 On Nov 3, 2011, at 3:10 PM, hihi v.p.m...@freemail.hu wrote:

  Dear Members,
 
  I work on a simulaton experiment but it has an bottleneck. It's quite
  fast because of R and vectorizing, but it has a very slow for loop. The
  adjacent element of a vector (in terms of index number) depends
  conditionally on the former value of itself. Like a simple cumulating
  function (eg. cumsum) but with condition. Let's show me an example:
  a_vec = rnorm(100)
  b_vec = rep(0, 100)
  b_vec[1]=a_vec[1]
  for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, a_vec[i],
  b_vec[i-1]+a_vec[i])}
  print(b_vec)
 
  (The behaviour is like cumsum's, but when the value would excess 1.0
  then it has another value from a_vec.)
  Is it possible to make this faster? I experienced that my way is even
  slower than in Excel! Programming in C would my last try...
  Any suggestions?
 
  Than you,
  Peter
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread William Dunlap
You should get familiar with some basic timing tools
and techniques so you can investigate things like this
yourself.

system.time is the most basic timing tool.  E.g.,
   system.time(for(i in 1:1000)f0(a))
 user  system elapsed
   22.920   0.000  22.932
means it took c. 23 seconds of real time to run f0(a)
1000 times.

When comparing timing, it makes things easier to define
a series of functions that implement the various algorithms
but have the same inputs and outputs.  E.g., for your problem

f0 - function(a_vec) {
b_vec - a_vec
for (i in 2:length(b_vec)){
b_vec[i] - ifelse(abs(b_vec[i-1] + a_vec[i])  1, a_vec[i], b_vec[i-1] 
+ a_vec[i])
}
b_vec
}

f1 - function(a_vec) {
b_vec - a_vec
for (i in 2:length(b_vec)){
b_vec[i] - if(abs(b_vec[i-1] + a_vec[i])  1) a_vec[i] else b_vec[i-1] 
+ a_vec[i]
}
b_vec
}

f2 - function(a_vec) {
b_vec - a_vec
for (i in 2:length(b_vec)){
if(abs(s - b_vec[i-1] + a_vec[i]) = 1) b_vec[i] - s
}
b_vec
}

Then run them with the same dataset:
 a - runif(1000, 0, .3)
 system.time(for(i in 1:1000)f0(a))
   user  system elapsed
 22.920   0.000  22.932
 system.time(for(i in 1:1000)f1(a))
   user  system elapsed
  5.510   0.000   5.514
 system.time(for(i in 1:1000)f2(a))
   user  system elapsed
  4.210   0.000   4.217

(The rbenchmark package's benchmark function encapsulates this idiom.) 

It pays to use a dataset similar to the one you will ultimately be using,
where similar depends on the context.  E.g., the algorithm in f2 is relatively
faster when the cumsum exceeds 1 most of the time

 a - runif(1000, 0, 10)
 system.time(for(i in 1:1000)f0(a))
   user  system elapsed
 21.900   0.000  21.912
 system.time(for(i in 1:1000)f1(a))
   user  system elapsed
  4.610   0.000   4.609
 system.time(for(i in 1:1000)f2(a))
   user  system elapsed
  2.490   0.000   2.494

If you will be working with large datasets, you should look at how the
time grows as the size of the dataset grows.  If the time looks quadratic 
between,
say, length 100 and length 200, don't waste your time testing it for length 
100.
For algorithms that work on data.frames (or matrices), the relative speed ofen
depends on the ratio of the number of rows and the number of columns of data.
Check that out.  For these sorts of tests it is worthwhile to make a function to
generate typical looking data of any desired size.
 
It doesn't take too long to do this once you have the right mindset.  Once you
do you don't have to rely on folklore like never use loops and instead do
evidence-based computing.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of R. Michael
 Weylandt
 Sent: Thursday, November 03, 2011 2:51 PM
 To: hihi; r-help
 Subject: Re: [R] Is it possible to vectorize/accelerate this?
 
 Yes -- if  else is much faster than ifelse() because if is a
 primitive while ifelse() is a whole function call (in fact, you can
 see the code by typing ifelse into the prompt and see that it has two
 if calls within it.
 
 Michael
 
 On Thu, Nov 3, 2011 at 4:38 PM, hihi v.p.m...@freemail.hu wrote:
  Hi,
  thank you for your very immediate response. :-) Is if than and else faster
  than ifelse? I'm wondering (or not knowing something)
  Best regards,
  Peter
  2011/11/3 R. Michael Weylandt michael.weyla...@gmail.com
  michael.weyla...@gmail.com
 
  I don't immediately see a good trick for vectorization so this seems to me
  to be a good candidate for work in a lower-level language. Staying within 
  R,
  I'd suggest you use if and else rather than ifelse() since your computation
  isn't vectorized: this will eliminate a small amount over overhead. Since
  you also always add a_vec, you could also define b_vec as a copy of a to
  avoid all those calls to subset a, but I don't think the effects will be
  large and the code might not be as clear.
 
  You indicated that you may be comfortable with writing C, but I'd suggest
  you look into the Rcpp/Inline package pair which make the whole process 
  much
  easier than it would otherwise be.
 
   I'm not at a computer write now or I'd write a fuller example, but the
  documentation for those packages is uncommonly good an you should be able 
  to
  easily get it down into C++. If you aren't able to get it by tomorrow, let
  me know and I can help troubleshoot. The only things I foresee that you'll
  need to change are zero-basing, C's loop syntax, and (I think) the call to
  abs(). (I always forget where abs() lives in c++ )
 
  The only possible hold up is that you need to be at a computer with a C
  compiler
 
  Hope this helps,
 
  Michael
 
  On Nov 3, 2011, at 3:10 PM, hihi v.p.m...@freemail.hu wrote:
 
   Dear Members,
  
   I work on a simulaton experiment but it has an bottleneck. It's quite
   fast because of R and vectorizing, but it has a very slow for loop. The
   adjacent

Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread Carl Witthoft
I have to admit to not doing careful timing tests, but I often eliminate 
if() lines as follows (bad/good is just my preference)


BAD:  b[i] - if(a[i]1) a[i] else a[i-1]

GOOD:  b[i] - a[i]* (a[i]1) + a[i-1] * (a[i]=1)




On Thu, Nov 3, 2011 at 12:10 PM, hihi v.p.mail_at_freemail.hu wrote:
 Dear Members,

 I work on a simulaton experiment but it has an bottleneck. It's quite 
fast because of R and vectorizing, but it has a very slow for loop. The 
adjacent element of a vector (in terms of index number) depends 
conditionally on the former value of itself. Like a simple cumulating 
function (eg. cumsum) but with condition. Let's show me an example:

 a_vec = rnorm(100)
 b_vec = rep(0, 100)
 b_vec[1]=a_vec[1]
 for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])1, 
a_vec[i], b_vec[i-1]+a_vec[i])}

 print(b_vec)
--

Sent from my Cray XK6
Pendeo-navem mei anguillae plena est.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to vectorize/accelerate this?

2011-11-03 Thread William Dunlap
I neglected to give another benefit of putting your algorithms
into functions: you can use the compiler package to compile them,
which can give a big boost in speed.  E.g., I compiled the functions f0,
f1, and f2 that I defined earlier to make new functions f0_c, f1_c,
and f2_c:

 library(compiler)
 f0_c - cmpfun(f0)
 f1_c - cmpfun(f1)
 f2_c - cmpfun(f2)
 system.time(for(i in 1:1000)f0_c(a)) # a is runif(1000, 0, 10)
   user  system elapsed
 18.620   0.000  18.649
 system.time(for(i in 1:1000)f1_c(a))
   user  system elapsed
  1.290   0.000   1.288
 system.time(for(i in 1:1000)f2_c(a))
   user  system elapsed
  0.790   0.000   0.791

Compare those times with the 23, 5.5, and 4.2 seconds
for the non-compiled version.  I haven't used the compiler
package enough to generate any folklore on it, but it certainly
helps in this simple example.

(identical() shows that the output of all these functions are the same.) 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 -Original Message-
 From: William Dunlap
 Sent: Thursday, November 03, 2011 3:23 PM
 To: hihi; r-help
 Subject: RE: [R] Is it possible to vectorize/accelerate this?
 
 You should get familiar with some basic timing tools
 and techniques so you can investigate things like this
 yourself.
 
 system.time is the most basic timing tool.  E.g.,
system.time(for(i in 1:1000)f0(a))
  user  system elapsed
22.920   0.000  22.932
 means it took c. 23 seconds of real time to run f0(a)
 1000 times.
 
 When comparing timing, it makes things easier to define
 a series of functions that implement the various algorithms
 but have the same inputs and outputs.  E.g., for your problem
 
 f0 - function(a_vec) {
 b_vec - a_vec
 for (i in 2:length(b_vec)){
 b_vec[i] - ifelse(abs(b_vec[i-1] + a_vec[i])  1, a_vec[i], 
 b_vec[i-1] + a_vec[i])
 }
 b_vec
 }
 
 f1 - function(a_vec) {
 b_vec - a_vec
 for (i in 2:length(b_vec)){
 b_vec[i] - if(abs(b_vec[i-1] + a_vec[i])  1) a_vec[i] else 
 b_vec[i-1] + a_vec[i]
 }
 b_vec
 }
 
 f2 - function(a_vec) {
 b_vec - a_vec
 for (i in 2:length(b_vec)){
 if(abs(s - b_vec[i-1] + a_vec[i]) = 1) b_vec[i] - s
 }
 b_vec
 }
 
 Then run them with the same dataset:
  a - runif(1000, 0, .3)
  system.time(for(i in 1:1000)f0(a))
user  system elapsed
  22.920   0.000  22.932
  system.time(for(i in 1:1000)f1(a))
user  system elapsed
   5.510   0.000   5.514
  system.time(for(i in 1:1000)f2(a))
user  system elapsed
   4.210   0.000   4.217
 
 (The rbenchmark package's benchmark function encapsulates this idiom.)
 
 It pays to use a dataset similar to the one you will ultimately be using,
 where similar depends on the context.  E.g., the algorithm in f2 is 
 relatively
 faster when the cumsum exceeds 1 most of the time
 
  a - runif(1000, 0, 10)
  system.time(for(i in 1:1000)f0(a))
user  system elapsed
  21.900   0.000  21.912
  system.time(for(i in 1:1000)f1(a))
user  system elapsed
   4.610   0.000   4.609
  system.time(for(i in 1:1000)f2(a))
user  system elapsed
   2.490   0.000   2.494
 
 If you will be working with large datasets, you should look at how the
 time grows as the size of the dataset grows.  If the time looks quadratic 
 between,
 say, length 100 and length 200, don't waste your time testing it for length 
 100.
 For algorithms that work on data.frames (or matrices), the relative speed ofen
 depends on the ratio of the number of rows and the number of columns of data.
 Check that out.  For these sorts of tests it is worthwhile to make a function 
 to
 generate typical looking data of any desired size.
 
 It doesn't take too long to do this once you have the right mindset.  Once you
 do you don't have to rely on folklore like never use loops and instead do
 evidence-based computing.
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
  Behalf Of R. Michael
  Weylandt
  Sent: Thursday, November 03, 2011 2:51 PM
  To: hihi; r-help
  Subject: Re: [R] Is it possible to vectorize/accelerate this?
 
  Yes -- if  else is much faster than ifelse() because if is a
  primitive while ifelse() is a whole function call (in fact, you can
  see the code by typing ifelse into the prompt and see that it has two
  if calls within it.
 
  Michael
 
  On Thu, Nov 3, 2011 at 4:38 PM, hihi v.p.m...@freemail.hu wrote:
   Hi,
   thank you for your very immediate response. :-) Is if than and else faster
   than ifelse? I'm wondering (or not knowing something)
   Best regards,
   Peter
   2011/11/3 R. Michael Weylandt michael.weyla...@gmail.com
   michael.weyla...@gmail.com
  
   I don't immediately see a good trick for vectorization so this seems to 
   me
   to be a good candidate for work in a lower-level language. Staying 
   within R,
   I'd suggest you use if and else rather than ifelse() since your 
   computation