Hey Martin,

Thanks for the reply, that is a much better way!!!

On another note, I actually meant to post this on the rcpp-devel list (and
have forwarded it there, noting that there was a mistake in the original
post), however I'm quite glad I posted it here by mistake to see this
answer :)

Cheers, Jeff

On Wed, Dec 14, 2011 at 12:52 AM, Martin Morgan <mtmor...@fhcrc.org> wrote:

> On 12/13/2011 03:48 PM, Jeffrey Pollock wrote:
>
>> Hello all,
>>
>> I've been working on a package to do various things related to the
>> Conway-Maxwell-Poisson distribution and wanted to be able to make fast
>> random draws from the distribution. My R code was running quite slow so I
>> decided to give Rcpp a bash. I had used it before but only for extremely
>> basic stuff and always using inline. This time I decided to give making a
>> proper package a go.
>>
>> First of all I should say that this was incredibly easy due to
>> Rcpp.package.skeleton() and the countless answers to quesions online and
>> documentation!
>>
>> Secondly, I'm worried that my speedup has been so massive (over 500x !!!)
>> that I think I've made a mistake, hence my post here.
>>
>> Here is all my code, if someone has a minute to point out anything wrong
>> (or even if its correct and there is room for improvement, im pretty new
>> to
>> this) it would be much appreciated. I've had a simple look at the results
>> and they look fine, but seriously, 500x faster?!
>>
>> function in R;
>> library(compiler)
>>
>> Rrcomp<- cmpfun(
>>         function(n, lam, nu, max = 100L) {
>>             ans<- integer(n)
>>             dist<- dcomp(0:max, lam, nu, max)
>>             u<- runif(n)
>>             for (i in 1:n) {
>>                 count<- 0L
>>                 pr<- dist[1L]
>>                 while (pr<  u[i]) {
>>                     count<- count + 1L
>>                     pr<- pr + dist[count + 1L]
>>                 }
>>                 ans[i]<- count
>>             }
>>             return(ans)
>>         }
>> )
>>
>
> Hi Jeff
>
> Not really what you're asking about, but looks like you're sampling with
> replacement from the sequence 0:(dist-1) n times with probability dist, so
>
> Rrcomp.1 <-
>    function(n, lam, nu, max = 100L)
>
> {
>    dist <- dcomp(0:max, lam, nu, max)
>    sample(seq_along(dist) - 1L, n, TRUE, prob=dist)
> }
>
> and
>
> > system.time(res <- table(Rrcomp(n, lam, nu))); res
>   user  system elapsed
>  1.493   0.000   1.495
>
>   0    1    2    3    4    5    6    7    8    9   10   11   12   13
>  355 1656 4070 6976 8745 8861 7275 5214 3357 1892  926  399  165   69
>  14   15   16   17
>  24   11    2    3
> > system.time(res <- table(Rrcomp.1(n, lam, nu))); res
>   user  system elapsed
>  0.029   0.000   0.028
>
>   0    1    2    3    4    5    6    7    8    9   10   11   12   13
>  333 1754 4096 6876 8964 8799 7399 5030 3215 1877  951  432  184   61
>  14   15   16
>  23    5    1
>
> Martin
>
>  dcomp<- function(y, lam, nu, max = 100L) {
>>     Z<- function(lam, nu, max) {
>>         sum<- 0L
>>         for(i in 0L:max) {
>>             sum<- sum + lam^i / factorial(i)^nu
>>         }
>>         return(sum)
>>     }
>>     return(lam^y / (factorial(y)^nu * Z(lam, nu, max)))
>> }
>>
>> function in Rcpp;
>> header file;
>>
>> #include<Rcpp.h>
>>
>> RcppExport SEXP rcomp(SEXP n_, SEXP dist_);
>>
>> source file;
>>
>> #include "rcomp.h"
>>
>> SEXP rcomp(SEXP n_, SEXP dist_) {
>>     using namespace Rcpp ;
>>
>>     int n = as<int>(n_);
>>     NumericVector dist(dist_);
>>
>>     NumericVector ans(n);
>>     int count;
>>     double pr;
>>     RNGScope scope;
>>     NumericVector u = runif(n);
>>
>>     for (int i = 0; i<  n; ++i) {
>>         count = 0;
>>         pr = dist[0];
>>         while (pr<  u[i]) {
>>             count++;
>>             pr += dist[count];
>>         }
>>         ans[i] = count;
>>     }
>>     return ans;
>> }
>>
>> R call;
>>
>> rcomp<- function(n, lam, nu, max = 100){
>>     dist<- dcomp(0:max, lam, nu, max)
>>     .Call("rcomp", n = n, dist = dist, PACKAGE = "glmcomp")
>> }
>>
>> Here are some results;
>>
>>> n<- 50000
>>> lam<- 5
>>> nu<- 1
>>> rbind(table(rcomp(n, lam, nu))[1:10] / n, table(Rrcomp(n, lam, nu))[1:10]
>>>
>> / n, dpois(0:9, lam))
>>                0          1          2         3         4
>> 5         6
>> [1,] 0.006440000 0.03124000 0.08452000 0.1392200 0.1747800 0.1755200
>> 0.1490000
>> [2,] 0.006660000 0.03232000 0.08412000 0.1425400 0.1779600 0.1748400
>> 0.1445600
>> [3,] 0.006737947 0.03368973 0.08422434 0.1403739 0.1754674 0.1754674
>> 0.1462228
>>              7          8          9
>> [1,] 0.1063000 0.06538000 0.03534000
>> [2,] 0.1039800 0.06492000 0.03624000
>> [3,] 0.1044449 0.06527804 0.03626558
>>
>> (for nu = 1 the com-poisson distribution is the same as normal poisson)
>>
>>  benchmark(rcomp(n, lam, nu), Rrcomp(n, lam, nu), replications = 10, order
>>>
>> = "relative")
>>                 test replications elapsed relative user.self sys.self
>> 2 Rrcomp(n, lam, nu)           10    2.03   1.0000      1.96     0.00
>> 1  rcomp(n, lam, nu)           10 1172.51 577.5911   1164.50     0.08
>>
>> Thanks in advance if anyone has any time to have a look at this :)
>>
>> Jeff
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________**________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/**listinfo/r-devel<https://stat.ethz.ch/mailman/listinfo/r-devel>
>>
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>

        [[alternative HTML version deleted]]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to