Sorry, guys. I'm not active on the list, so my last post was held by the 
moderator until after Dirk's solution was posted.

Excellent stuff.

thanks,
kevin

On Jan 29, 2012, at 8:37 AM, R. Michael Weylandt wrote:

> Have you not followed your own thread? Dirk is Mr. Rcpp himself and he
> gives an implementation that gives you a 25x improvement here as well as
> tips for getting even more out of it:
> 
> http://tolstoy.newcastle.edu.au/R/e17/help/12/01/2471.html
> 
> Michael
> 
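
For archive readers who cannot follow the link: the code below is not Dirk's
implementation, only a minimal sketch of the kind of single-pass compiled loop
that the inline+Rcpp suggestion points toward. It is plain standard C++ so it
can be compiled on its own. The function name lag1_diff is hypothetical, and
the assumption is that under Rcpp the std::vector<double> types would be
swapped for Rcpp::NumericVector.

```cpp
#include <cstddef>
#include <vector>

// One-lag difference: out[i] = x[i + 1] - x[i], the same values as R's diff(x).
// Hypothetical helper name; an illustrative sketch, not code from the thread.
std::vector<double> lag1_diff(const std::vector<double>& x) {
    if (x.size() < 2) {
        return {};  // no lag exists for fewer than two observations
    }
    // Allocate the result once, then fill it in a single pass.
    std::vector<double> out(x.size() - 1);
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        out[i] = x[i + 1] - x[i];
    }
    return out;
}
```

For x = {3, 7, 4, 10} this returns {4, -3, 6}, the same values that
diff(c(3, 7, 4, 10)) gives in R. The loop reads the input once and allocates
only the result vector.
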
> On Sat, Jan 28, 2012 at 12:28 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>> Thanks. I've played around with pure R solutions. The fastest rewrite of 
>> diff (for the one-lag case) I can find is this:
>> 
>> diff2 = function(x) {
>>  y = c(x,NA) - c(NA,x)
>>  y[2:length(x)]
>> }
>> 
>> #Compiling via 'cmpfun' doesn't seem to help (or hurt):
>> require(compiler)
>> diff2 = cmpfun(diff2)
>> 
>> But that only gets ~10% improvement over default 'diff' on my machine. Still 
>> too slow for my particular application.
>> 
>> I'm inclined towards Michael's suggestion of inline+Rcpp (or some other use 
>> of C under the hood).
>> 
>> Could someone show me how to go about doing that?
>> 
>> Thanks!
>> Kevin
>> 
>> On Jan 28, 2012, at 9:14 AM, Peter Langfelder wrote:
>> 
>>> ehm... this doesn't take very many ideas.
>>> 
>>> 
>>> x = runif(n=10e6, min=0, max=1000)
>>> x = round(x)
>>> 
>>> system.time( {
>>>  y = x[-1] - x[-length(x)]
>>> })
>>> 
>>> I get about 0.5 seconds on my old laptop.
>>> 
>>> HTH
>>> 
>>> Peter
>>> 
>>> 
>>> On Fri, Jan 27, 2012 at 4:15 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>>>> Hi everyone,
>>>> 
>>>> Speed is the key here.
>>>> 
>>>> I need to find the difference between a vector and its one-period lag 
>>>> (i.e. the difference between each value and the subsequent one in the 
>>>> vector). Let's say the vector contains 10 million random integers between 
>>>> 0 and 1,000. The solution vector will have 9,999,999 values, since there 
>>>> is no lag for the 1st observation.
>>>> 
>>>> In R we have:
>>>> 
>>>> #Set up input vector
>>>> x = runif(n=10e6, min=0, max=1000)
>>>> x = round(x)
>>>> 
>>>> #Find one-period difference
>>>> y = diff(x)
>>>> 
>>>> Question is: How can I get the 'diff(x)' part as fast as absolutely 
>>>> possible? I queried some colleagues who work with other languages, and 
>>>> they provided equivalent solutions in Python and Clojure that, on their 
>>>> machines, appear to be potentially much faster (I've put the code below in 
>>>> case anyone is interested). However, they mentioned that the overhead in 
>>>> passing the data between languages could kill any improvements. I don't 
>>>> have much experience integrating other languages, so I'm hoping the 
>>>> community has some ideas about how to approach this particular problem...
>>>> 
>>>> Many thanks,
>>>> Kevin
>>>> 
>>>> In IPython:
>>>> 
>>>> In [3]: import numpy as np
>>>> In [4]: arr = np.random.randint(0, 1000, (10000000,1)).astype("int16")
>>>> In [5]: arr1 = arr[1:].view()
>>>> In [6]: timeit arr2 = arr1 - arr[:-1]
>>>> 10 loops, best of 3: 20.1 ms per loop
>>>> 
>>>> In Clojure:
>>>> 
>>>> (defn subtract-lag
>>>>  [n]
>>>>  (let [v (take n (repeatedly rand))]
>>>>    (time (dorun (map - v (cons 0 v))))))
>>>> 
>>>> 
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>> 
