Hi Dirk,
 
thank you for your answer.
 
Just to make sure, you think it should be fine to work with the current R 
version I already have and improve bottlenecks with Rcpp? How would you 
organize input/output and state of the chain? Would you switch to reference 
classes for this or can I keep the S4 classes I already have?
 
Cheers,
Eva
 
 

Gesendet: Freitag, 10. April 2020 um 14:03 Uhr
Von: "Dirk Eddelbuettel" <e...@debian.org>
An: "eva lehner" <eva-vale...@gmx.at>
Cc: "Dirk Eddelbuettel" <e...@debian.org>, 
rcpp-devel@lists.r-forge.r-project.org
Betreff: Aw: Re: [Rcpp-devel] Input on code design implementing a MCMC 
algorithm in R and improving performance with RcppArmadillo
Hi Eva,

Thanks for resending in text, and providing all the detail.

On 10 April 2020 at 11:21, eva lehner wrote:
| I want to „translate“ a MCMC algorithm written in matlab to R to eventually 
release it in a package. In the matlab version model input, state of the chain 
and chain output is stored in structs. Updating different parameters of the 
underlying model is performed with different functions. I already have a very 
raw R version where these structs are implemented as S4 classes and updates are 
performed with generic functions taking both classes as input. I would like to 
increase performance with RcppArmadillo. However, I struggle to choose a good 
approach since I am not very experienced in any of the above subjects and have 
difficulties putting together the different concepts. It would be very helpful 
if you could give me your thoughts on what I think my main options are, tell me 
in case I misunderstood some concepts or point me into a better direction.
|  
| Option 1: mainly R
| Keeping R S4 classes (or rewrite them as Reference Classes since I think they 
would have been a better choice for defining a class storing states of a model) 
and the algorithm written in R and write updating functions in RcppArmadillo. 
These functions can be exposed to R with Rcpp modules or attributes.
| + possibly the easiest version since I only have to rewrite the functions in 
Rcpp
| - Still a lot of R which could come with performance costs
|  
| Option2: mainly Rcpp
| Define one Cpp class which stores input and updated parameters as member 
variables and the updating functions as member functions. Then I could write 
the main MCMC algorithm as a separate function that creates an instance of this 
class, calls the updating functions and returns a list containing the output. I 
could only expose this main algorithm (with Rcpp modules or attributes) to R 
and avoid exposing the internal classes. (Currently I don’t think exposure of 
the classes is necessary and seems more difficult than exposing functions).The 
part involving the class is taken from the Kalman filter example 
(RcppArmadillo-intro vignette) and I imagine the structure of the algorithm 
function and returned list similar to the gallery entry on the EM Algorithm for 
Probit Regressions 
(https://gallery.rcpp.org/articles/EM-algorithm-example/#attempt-1[https://deref-gmx.net/mail/client/Q6JdcN6Rp84/dereferrer/?redirectUrl=https%3A%2F%2Fgallery.rcpp.org%2Farticles%2FEM-algorithm-example%2F%23attempt-1[https://deref-gmx.net/mail/client/Q6JdcN6Rp84/dereferrer/?redirectUrl=https%3A%2F%2Fgallery.rcpp.org%2Farticles%2FEM-algorithm-example%2F%23attempt-1]]).
| + probably faster
| + possibly cleaner
| - more difficult and I have very little Cpp experience (especially with 
classes)
|  
| Personally, I aim for a compromise between performance, implementation 
difficulty and the best approach since this is for my master’s thesis and I 
also have to consider the deadline. I think this compromise would be to stay 
with the main R version but rewrite the S4 classes as Reference Class.
| I'm looking forward to hearing your feedback, thoughts and experiences,

There is a lot to unpack here, and it is difficult to recommend something in
generalities. You also seem already pretty familiar with R, Rcpp and
RcppArmadillo which is good :)

As for Matlab to Armadillo, I have had pretty good experiences translating
Matlab code to RcppArmadillo on more than occassion. What you describe above
all sounds reasonable (also in terms of deadline).

Maybe a pragmatic approach is simplest: do some timings, find some
bottlenecks and replace this with Rcpp? Or easier still (but less
measurement-guided) start with the innermost loops and rethink / reimplement
those?

It is a little tricky to recommend anything more concrete at this stage.

Cheers, Dirk

--
http://dirk.eddelbuettel.com[http://dirk.eddelbuettel.com] | @eddelbuettel | 
e...@debian.org
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Reply via email to