By the way, you were correct in your original response that all I really need is input_transform but it is certainly a worthwhile exercise to see how traits, etc. are used and I appreciate your kind explanations.
On Wed, Jun 16, 2010 at 10:58 AM, Romain Francois <[email protected]> wrote: > Le 16/06/10 17:44, Douglas Bates a écrit : >> >> Thank you very much, Romain. This certainly helps my understanding of >> unary_function, traits, etc. These are the sorts of things I can read >> about but until I see a non-trivial usage it is hard to keep these >> ideas straight. >> >> Two minor questions occur to me at this point. In my code I assigned >> the names to be the names of the input list but I think now that I >> should have assigned a clone of the object names instead. > > I have been lazy (the wrong kind of lazyness) when implementing names<-, and > just used a callback to the R function names<- . The relevant chunk is in > Vector.h in the NamesProxy class: > > void set(SEXP x) const { > SEXP new_vec = PROTECT( internal::try_catch( > Rf_lcons( Rf_install("names<-"), > Rf_cons( parent, Rf_cons( x , > R_NilValue) )))) ; > /* names<- makes a new vector, so we have to change > the SEXP of the parent of this proxy, it might be > worth to work directly with the names attribute > instead > of using the names<- R function, but then we need > to > take care of coercion, recycling, etc ... we > cannot just > brutally assign the names attribute */ > const_cast<Vector&>(parent).setSEXP( new_vec ) ; > UNPROTECT(1) ; /* new_vec */ > } > > So with this implementation, you don't need to worry about copying names. > But it might not be the best implementation. > >> To be safe >> I should have cloned before assigning, right? I also checked the >> length before assigning but you didn't. Am I reflecting too much of >> my C programming background here? In other words, is there some magic >> in the Rcpp names assignment that does the checking? > > The magic comes from R. > >> x <- 1:10 >> names(x) <- c("a", "b") >> x > a b <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> > 1 2 3 4 5 6 7 8 9 10 > > In your example ll.names() is either NULL or a vector of the appropriate > size. I think (not sure though) that names<- is restrictive about the rhs: > >> names(x) <- rnorm > Erreur dans as.character(function (n, mean = 0, sd = 1) : > cannot coerce type 'closure' to vector of type 'character' > > Romain > >> Thanks again. >> >> On Wed, Jun 16, 2010 at 9:55 AM, Romain Francois >> <[email protected]> wrote: >>> >>> Small update incorporating the use of all (and using namespace Rcpp to >>> make >>> things slightly less cluterred). >>> >>> require( Rcpp ) >>> require( inline ) >>> >>> inc<- ' >>> >>> using namespace Rcpp ; >>> >>> template<typename OBJECT, typename FUN> >>> Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype> >>> sapply(const OBJECT& object, FUN fun) { >>> >>> const int RTYPE = traits::r_sexptype_traits< typename >>> FUN::result_type>::rtype ; >>> >>> Vector<RTYPE> ans = Vector<RTYPE>::import_transform( >>> object.begin(), object.end(), fun >>> ) ; >>> return ans ; >>> } >>> >>> template<int RT, typename FUN> >>> Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype> >>> sapply( const Vector<RT>& object, FUN fun){ >>> >>> const int RTYPE = traits::r_sexptype_traits< typename >>> FUN::result_type>::rtype ; >>> >>> Vector<RTYPE> ans = Vector<RTYPE>::import_transform( >>> object.begin(), object.end(), fun >>> ) ; >>> ans.names() = object.names() ; >>> return ans ; >>> } >>> >>> template<typename T> >>> struct square : std::unary_function<T,T> { >>> inline T operator()( const T& t ){ return t*t ; } >>> } ; >>> >>> ' >>> >>> code<- ' >>> NumericVector xx( x) ; >>> return all( sapply( xx, square<double>() )< 15 ) ; >>> >>> ' >>> >>> fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc ) >>> fx( seq(1,10, length = 5 ) ) >>> >>> >>> This needs rev 1555 of Rcpp for the "< 15" part. I'll try to come up >>> with a >>> lazy version of this for the Rcpp sugar. >>> >>> I'll probably recycle some of this thread into a blog post (with nicer >>> formatting). >>> >>> >>> Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call and >>> Rcpp::binary_call that wrap up an R function in a strong type illusion >>> (powered by the twins Rcpp::as and Rcpp::wrap). >>> >>> So for example (this will be less efficient for obvious reasons): >>> >>> code<- ' >>> NumericVector xx( x) ; >>> return all( sapply( xx, unary_call<double,double>( >>> Function("square") >>> ) )< 15 ) ; >>> >>> ' >>> >>> square<- function(x) x^2 >>> fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc ) >>> fx( seq(1,10, length = 5 ) ) >>> >>> Romain >>> >>> PS : More examples of these in the runit.Language.R file. >>> >>> Le 16/06/10 16:18, Romain Francois a écrit : >>>> >>>> >>>> Le 16/06/10 14:58, Douglas Bates a écrit : >>>>> >>>>> Ever since I read Phil Spector's book on S I have been a fan of >>>>> functional programming in S and R. When Jose Pinheiro and I were >>>>> working on the nlme package there was a joke between us that you could >>>>> tell which of us wrote which parts of the code because his parts >>>>> always had an object named "aux" and my parts always had >>>>> "unlist(lapply(list(...)))". >>>>> >>>>> Even within C++ I like to use the std:: algorithms like >>>>> std::transform, but, of course, there are differences between a >>>>> strongly typed language like C++ and a dynamically typed language like >>>>> S. Templates can get around these differences to some extent but I am >>>>> still a bit of a novice regarding templates. >>>>> >>>>> Currently I want to apply some functions to lists but entirely within >>>>> the C++ code (i.e. I don't want to create Rcpp Function objects and >>>>> call back to R). For the sake of argument, consider a function that >>>>> extracts the lengths of the components of the lists. >>>>> >>>>> >>>>> library(Rcpp) >>>>> library(inline) >>>>> inc<- ' >>>>> class length { >>>>> public: >>>>> R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));} >>>>> }; >>>>> ' >>>>> code<- ' >>>>> List lst(ll); >>>>> IntegerVector ans(lst.size()); >>>>> std::transform(lst.begin(), lst.end(), ans.begin(), length()); >>>>> return ans; >>>>> ' >>>>> ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc) >>>>> ll<- list(a = numeric(0), b = LETTERS[6:9], c = c) >>>>> ltst(ll) >>>>> sapply(ll, length) >>>>> >>>>> I would like to create a templated sapply function like >>>>> >>>>> template<int RTYPE> >>>>> Vector<RTYPE> sapply(List ll, ??) { >>>>> Vector<RTYPE> ans(ll.size()); >>>>> CharacterVector nms = ll.names(); >>>>> if (nms.size() == ll.size()) ans.names() = nms; >>>>> std::transform(ll.begin(), ll.end(), ans.begin(), ??); >>>>> return ans; >>>>> } >>>>> >>>>> but I don't know how to specify the second argument that is a function >>>>> that returns the atomic element type of a Vector<RTYPE> (is this as >>>>> simple as Vector<RTYPE>::value_type?) and has a single argument which >>>>> probably should be an RObject (although might be an SEXP, if that was >>>>> more convenient). Can someone (probably Romain) provide some >>>>> guidance? >>>> >>>> I recently added Vector::import_transform which might be all you need. >>>> >>>> NumericVector x = NumericVector::import_transform( >>>> y.begin(), y.end(), f) ; >>>> >>>> >>>> This will work as long as f is of some type acceptable by >>>> std::transform. >>>> >>>> >>>> But let's keep going anyway. One way (I'll write later why this is not >>>> fully satisfactory) is here: >>>> >>>> require( Rcpp ) >>>> require( inline ) >>>> >>>> inc<- ' >>>> >>>> template<typename OBJECT, typename FUN> >>>> SEXP sapply(const OBJECT& object, FUN fun) { >>>> >>>> const int RTYPE = Rcpp::traits::r_sexptype_traits< typename >>>> FUN::result_type>::rtype ; >>>> >>>> Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform( >>>> object.begin(), object.end(), fun >>>> ) ; >>>> return ans ; >>>> } >>>> >>>> template<int RT, typename FUN> >>>> SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){ >>>> >>>> const int RTYPE = Rcpp::traits::r_sexptype_traits< typename >>>> FUN::result_type>::rtype ; >>>> >>>> Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform( >>>> object.begin(), object.end(), fun >>>> ) ; >>>> ans.names() = object.names() ; >>>> return ans ; >>>> } >>>> >>>> template<typename T> >>>> struct square : std::unary_function<T,T> { >>>> inline T operator()( T& t ){ return t*t ; } >>>> } ; >>>> >>>> ' >>>> >>>> code<- ' >>>> NumericVector xx( x ); >>>> >>>> return sapply( xx, square<double>() ) ; >>>> >>>> ' >>>> >>>> fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc ) >>>> >>>> >>>> > fx( seq(1,10, length = 5 ) ) >>>> [1] 1.0000 10.5625 30.2500 60.0625 100.0000 >>>> >>>> I'll try to explain bit by bit. Please let me know if something is too >>>> cryptic. The good thing about templates is that as long as they work >>>> they don't complain. The bad thing is that when they start complaining, >>>> you need large screens to accomodate all the warnings/erros you get... >>>> >>>> >>>> >>>> Starting from the last thing, the square struct. >>>> >>>> template<typename T> >>>> struct square : std::unary_function<T,T> { >>>> inline T operator()( const T& t ){ return t*t ; } >>>> } ; >>>> >>>> we need the function that goes into sapply to help, i.e we need it to be >>>> aware of the result type. inheriting from std::unary_function makes this >>>> easy. It essentially just adds some typedef. See >>>> http://www.cplusplus.com/reference/std/functional/unary_function/ >>>> >>>> So when we do square<double>(), the function is self aware that it >>>> returns a double. We need that to decide which kind of R vectors we want >>>> to create. Which is the job of the Rcpp::traits::r_sexptype_traits >>>> trait. So: >>>> >>>> const int RTYPE = Rcpp::traits::r_sexptype_traits< typename >>>> FUN::result_type>::rtype ; >>>> >>>> it is very important that this is const, because we are using this as a >>>> template argument later and this needs to be compile time constant. >>>> >>>> Then, we just delegate to std::transform through >>>> Vector::import_transform: >>>> >>>> Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform( >>>> object.begin(), object.end(), fun >>>> ) ; >>>> >>>> >>>> >>>> For now the sapply just return a SEXP because that is easy, but we could >>>> have the result type deduced as well (fasten your seatbelt): >>>> >>>> template<typename OBJECT, typename FUN> >>>> Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename >>>> FUN::result_type>::rtype> >>>> sapply(const OBJECT& object, FUN fun) { >>>> >>>> >>>> Anyway, why is there two sapply ? because the first one is more generic, >>>> you could sapply on a std::vector<double> for example, the only thing >>>> we >>>> do with the object is to call begin() and end(), so as long as they >>>> exist and they produce something that makes sense for std::transform, we >>>> are fine (this is why templates are just so great). >>>> >>>> The second version is specific to Rcpp vectors, on which we can call >>>> names(). >>>> >>>> >>>> >>>> Now why is not satisfactory ? This is not lazy enough for me. With a >>>> little bit more work, instead of returning a Rcpp::Vector which >>>> allocates all of its memory right now (not lazy), we could have sapply >>>> returning some sort of proxy object. The proxy object would expose a >>>> similar interface : operator[] and iterator, but the result will only be >>>> calculated when truly necessary. >>>> >>>> Why should we care about that ? Imagine, instead of just calculating the >>>> square, we want to find out if all the squared values are below 15, ie. >>>> the equivalent of the R code "all( x^2< 15 )" ? We only need to >>>> calculate three values to make our decision, so why bother calculating >>>> the 2 last ones. (lazy) >>>> >>>> This is the kind of thing I want to bring to Rcpp with the "sugar" piece >>>> of the puzzle. >>>> >>>> We would then have this expression: >>>> >>>> code<- ' >>>> Rcpp::NumericVector xx( x) ; >>>> >>>> return Rcpp::all( sapply( xx, square<double>() )< 15 ) ; >>>> >>>> ' >>>> >>>> It almost work right now (I need to add some code so that<15 makes >>>> sense, I only added Vector< Vector so far), but currently requires to >>>> compute all the 5 values before handling the data to all. >>>> >>>> >>>> Please come back with questions if some gaps need to be filled. >>>> >>>> Romain > > -- > Romain Francois > Professional R Enthusiast > +33(0) 6 28 91 30 30 > http://romainfrancois.blog.free.fr > |- http://bit.ly/98Uf7u : Rcpp 0.8.1 > |- http://bit.ly/c6YnCi : graph gallery collage > `- http://bit.ly/bZ7ltC : inline 0.3.5 > > _______________________________________________ Rcpp-devel mailing list [email protected] https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
