Thank you very much, Romain. This certainly helps my understanding of unary_function, traits, etc. These are the sorts of things I can read about but until I see a non-trivial usage it is hard to keep these ideas straight.
Two minor questions occur to me at this point. In my code I assigned the
names of the input list directly to the result, but I now think I should
have assigned a clone of those names instead. To be safe, I should have
cloned before assigning, right?

I also checked the length before assigning, but you didn't. Am I
reflecting too much of my C programming background here? In other words,
is there some magic in the Rcpp names assignment that does the checking?

Thanks again.

On Wed, Jun 16, 2010 at 9:55 AM, Romain Francois wrote:
>
> Small update incorporating the use of all (and using namespace Rcpp to make
> things slightly less cluttered).
>
> require( Rcpp )
> require( inline )
>
> inc <- '
>
> using namespace Rcpp ;
>
> template <typename OBJECT, typename FUN>
> Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
> sapply(const OBJECT& object, FUN fun) {
>
>     const int RTYPE = traits::r_sexptype_traits< typename FUN::result_type>::rtype ;
>
>     Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
>         object.begin(), object.end(), fun
>     ) ;
>     return ans ;
> }
>
> template <int RT, typename FUN>
> Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
> sapply( const Vector<RT>& object, FUN fun){
>
>     const int RTYPE = traits::r_sexptype_traits< typename FUN::result_type>::rtype ;
>
>     Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
>         object.begin(), object.end(), fun
>     ) ;
>     ans.names() = object.names() ;
>     return ans ;
> }
>
> template <typename T>
> struct square : std::unary_function<T,T> {
>     inline T operator()( const T& t ){ return t*t ; }
> } ;
>
> '
>
> code <- '
> NumericVector xx( x) ;
> return all( sapply( xx, square<double>() ) < 15 ) ;
>
> '
>
> fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
> fx( seq(1,10, length = 5 ) )
>
>
> This needs rev 1555 of Rcpp for the "< 15" part. I'll try to come up with a
> lazy version of this for the Rcpp sugar.
>
> I'll probably recycle some of this thread into a blog post (with nicer
> formatting).
>
>
> Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call and
> Rcpp::binary_call that wrap up an R function in a strong type illusion
> (powered by the twins Rcpp::as and Rcpp::wrap).
>
> So for example (this will be less efficient for obvious reasons):
>
> code <- '
> NumericVector xx( x) ;
> return all( sapply( xx, unary_call<double,double>( Function("square") ) ) < 15 ) ;
>
> '
>
> square <- function(x) x^2
> fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
> fx( seq(1,10, length = 5 ) )
>
> Romain
>
> PS : More examples of these in the runit.Language.R file.
>
> On 16/06/10 16:18, Romain Francois wrote:
>>
>>
>> On 16/06/10 14:58, Douglas Bates wrote:
>>>
>>> Ever since I read Phil Spector's book on S I have been a fan of
>>> functional programming in S and R. When Jose Pinheiro and I were
>>> working on the nlme package there was a joke between us that you could
>>> tell which of us wrote which parts of the code because his parts
>>> always had an object named "aux" and my parts always had
>>> "unlist(lapply(list(...)))".
>>>
>>> Even within C++ I like to use the std:: algorithms like
>>> std::transform, but, of course, there are differences between a
>>> strongly typed language like C++ and a dynamically typed language like
>>> S. Templates can get around these differences to some extent but I am
>>> still a bit of a novice regarding templates.
>>>
>>> Currently I want to apply some functions to lists but entirely within
>>> the C++ code (i.e. I don't want to create Rcpp Function objects and
>>> call back to R). For the sake of argument, consider a function that
>>> extracts the lengths of the components of the lists.
>>>
>>>
>>> library(Rcpp)
>>> library(inline)
>>> inc <- '
>>> class length {
>>> public:
>>>     R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
>>> };
>>> '
>>> code <- '
>>> List lst(ll);
>>> IntegerVector ans(lst.size());
>>> std::transform(lst.begin(), lst.end(), ans.begin(), length());
>>> return ans;
>>> '
>>> ltst <- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
>>> ll <- list(a = numeric(0), b = LETTERS[6:9], c = c)
>>> ltst(ll)
>>> sapply(ll, length)
>>>
>>> I would like to create a templated sapply function like
>>>
>>> template<int RTYPE>
>>> Vector<RTYPE> sapply(List ll, ??) {
>>>     Vector<RTYPE> ans(ll.size());
>>>     CharacterVector nms = ll.names();
>>>     if (nms.size() == ll.size()) ans.names() = nms;
>>>     std::transform(ll.begin(), ll.end(), ans.begin(), ??);
>>>     return ans;
>>> }
>>>
>>> but I don't know how to specify the second argument that is a function
>>> that returns the atomic element type of a Vector<RTYPE> (is this as
>>> simple as Vector<RTYPE>::value_type?) and has a single argument which
>>> probably should be an RObject (although might be an SEXP, if that was
>>> more convenient). Can someone (probably Romain) provide some
>>> guidance?
>>
>> I recently added Vector::import_transform which might be all you need.
>>
>> NumericVector x = NumericVector::import_transform(
>>     y.begin(), y.end(), f) ;
>>
>>
>> This will work as long as f is of some type acceptable by std::transform.
>>
>>
>> But let's keep going anyway.
>> One way (I'll write later why this is not fully satisfactory) is here:
>>
>> require( Rcpp )
>> require( inline )
>>
>> inc <- '
>>
>> template <typename OBJECT, typename FUN>
>> SEXP sapply(const OBJECT& object, FUN fun) {
>>
>>     const int RTYPE = Rcpp::traits::r_sexptype_traits< typename FUN::result_type>::rtype ;
>>
>>     Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
>>         object.begin(), object.end(), fun
>>     ) ;
>>     return ans ;
>> }
>>
>> template <int RT, typename FUN>
>> SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){
>>
>>     const int RTYPE = Rcpp::traits::r_sexptype_traits< typename FUN::result_type>::rtype ;
>>
>>     Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
>>         object.begin(), object.end(), fun
>>     ) ;
>>     ans.names() = object.names() ;
>>     return ans ;
>> }
>>
>> template <typename T>
>> struct square : std::unary_function<T,T> {
>>     inline T operator()( const T& t ){ return t*t ; }
>> } ;
>>
>> '
>>
>> code <- '
>> NumericVector xx( x );
>>
>> return sapply( xx, square<double>() ) ;
>>
>> '
>>
>> fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
>>
>>
>> > fx( seq(1,10, length = 5 ) )
>> [1]   1.0000  10.5625  30.2500  60.0625 100.0000
>>
>> I'll try to explain bit by bit. Please let me know if something is too
>> cryptic. The good thing about templates is that as long as they work
>> they don't complain. The bad thing is that when they start complaining,
>> you need large screens to accommodate all the warnings/errors you get...
>>
>>
>>
>> Starting from the last thing, the square struct.
>>
>> template <typename T>
>> struct square : std::unary_function<T,T> {
>>     inline T operator()( const T& t ){ return t*t ; }
>> } ;
>>
>> We need the function that goes into sapply to help, i.e. we need it to be
>> aware of the result type. Inheriting from std::unary_function makes this
>> easy. It essentially just adds some typedefs.
>> See http://www.cplusplus.com/reference/std/functional/unary_function/
>>
>> So when we do square<double>(), the function is self aware that it
>> returns a double. We need that to decide which kind of R vector we want
>> to create, which is the job of the Rcpp::traits::r_sexptype_traits
>> trait. So:
>>
>> const int RTYPE = Rcpp::traits::r_sexptype_traits< typename FUN::result_type>::rtype ;
>>
>> It is very important that this is const, because we are using it as a
>> template argument later and it needs to be a compile-time constant.
>>
>> Then, we just delegate to std::transform through Vector::import_transform:
>>
>> Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
>>     object.begin(), object.end(), fun
>> ) ;
>>
>>
>>
>> For now sapply just returns a SEXP because that is easy, but we could
>> have the result type deduced as well (fasten your seatbelt):
>>
>> template <typename OBJECT, typename FUN>
>> Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename FUN::result_type>::rtype >
>> sapply(const OBJECT& object, FUN fun) {
>>
>>
>> Anyway, why are there two sapply functions? Because the first one is
>> more generic: you could sapply on a std::vector<double> for example.
>> The only thing we do with the object is to call begin() and end(), so
>> as long as they exist and they produce something that makes sense for
>> std::transform, we are fine (this is why templates are just so great).
>>
>> The second version is specific to Rcpp vectors, on which we can call
>> names().
>>
>>
>>
>> Now why is this not satisfactory? It is not lazy enough for me. With a
>> little bit more work, instead of returning an Rcpp::Vector which
>> allocates all of its memory right now (not lazy), we could have sapply
>> return some sort of proxy object. The proxy object would expose a
>> similar interface (operator[] and iterators), but the result would only
>> be calculated when truly necessary.
>>
>> Why should we care about that?
>> Imagine, instead of just calculating the square, we want to find out
>> if all the squared values are below 15, i.e. the equivalent of the R
>> code "all( x^2 < 15 )". We only need to calculate three values to make
>> our decision, so why bother calculating the last two? (lazy)
>>
>> This is the kind of thing I want to bring to Rcpp with the "sugar" piece
>> of the puzzle.
>>
>> We would then have this expression:
>>
>> code <- '
>> Rcpp::NumericVector xx( x) ;
>>
>> return Rcpp::all( sapply( xx, square<double>() ) < 15 ) ;
>>
>> '
>>
>> It almost works right now (I need to add some code so that "< 15" makes
>> sense; I have only added Vector < Vector so far), but it currently
>> requires computing all 5 values before handing the data to all.
>>
>>
>> Please come back with questions if some gaps need to be filled.
>>
>> Romain
>>
>>
>
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://bit.ly/98Uf7u : Rcpp 0.8.1
> |- http://bit.ly/c6YnCi : graph gallery collage
> `- http://bit.ly/bZ7ltC : inline 0.3.5
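
For anyone who wants to play with the lazy idea outside of R, here is a minimal sketch in plain standard C++ (no Rcpp), under a few assumptions. The names lazy_sapply, sapply_lazy, all_below, counting_square and evaluations_needed are hypothetical and are not part of Rcpp or its sugar. The functors declare result_type with a typedef by hand, which is what inheriting from std::unary_function provided (that class was deprecated in C++11 and removed in C++17). The proxy stores the iterators and the functor and only computes an element when operator[] is called, so a short-circuiting test can stop early.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Functor that advertises its result type through a typedef,
// as std::unary_function<T,T> used to do.
template <typename T>
struct square {
    typedef T argument_type;
    typedef T result_type;
    T operator()(const T& t) const { return t * t; }
};

// Hypothetical variant of square that counts how often it is called,
// so the laziness can be observed.
struct counting_square {
    typedef double result_type;
    int* calls;
    explicit counting_square(int* c) : calls(c) {}
    double operator()(const double& t) const { ++*calls; return t * t; }
};

// Hypothetical lazy proxy: nothing is computed at construction;
// each element is computed only when it is asked for.
template <typename ITER, typename FUN>
class lazy_sapply {
public:
    typedef typename FUN::result_type value_type;
    lazy_sapply(ITER first, ITER last, FUN fun)
        : first_(first), last_(last), fun_(fun) {}
    std::size_t size() const { return static_cast<std::size_t>(last_ - first_); }
    value_type operator[](std::size_t i) const { return fun_(first_[i]); }
private:
    ITER first_;
    ITER last_;
    FUN fun_;
};

// Builds the proxy from any container with random-access const iterators.
template <typename OBJECT, typename FUN>
lazy_sapply<typename OBJECT::const_iterator, FUN>
sapply_lazy(const OBJECT& object, FUN fun) {
    return lazy_sapply<typename OBJECT::const_iterator, FUN>(
        object.begin(), object.end(), fun);
}

// Short-circuiting check: stops at the first element that is not below bound.
template <typename LAZY>
bool all_below(const LAZY& x, typename LAZY::value_type bound) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (!(x[i] < bound)) return false;
    }
    return true;
}

// Returns how many squares were actually computed to decide all(x^2 < bound).
int evaluations_needed(const std::vector<double>& x, double bound) {
    int calls = 0;
    all_below(sapply_lazy(x, counting_square(&calls)), bound);
    return calls;
}
```

With the input used in this thread, seq(1, 10, length = 5), that is 1, 3.25, 5.5, 7.75, 10, evaluations_needed(x, 15.0) gives 3: the third square (30.25) already settles the answer, so the last two are never computed. A real sugar implementation would also have to deal with names, NA handling and R vector types, none of which this sketch attempts.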
