Hello,
Storing the data frame as a vector<tuple<...>> feels very inefficient:
in essence you are copying all the data into another structure, which is
not much easier to use anyway. The implementation of fun feels like boilerplate:
int fun(MyRow row) {
    return boost::get<0>(row) + 2*boost::get<1>(row)
         + 3*boost::get<2>(row) + 4*boost::get<3>(row);
}
The version I proposed is not restricted to 4 columns and will be more
efficient since it does not need to copy all the data. It just stores
one line at a time and processes it.
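
To make the "one row at a time" idea concrete without depending on Rcpp, here is a minimal sketch. It is not the code from my earlier message: plain std::vector<int> columns stand in for the data frame columns, and the names Column, weighted_sum and apply_by_row are made up for the illustration. A single row buffer is reused for every row, so the full data set is never copied into a second structure, and nothing is tied to 4 columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for one data frame column (assumption: all columns are int).
typedef std::vector<int> Column;

// Hypothetical row function: 1*x0 + 2*x1 + 3*x2 + ... for any row width.
int weighted_sum(const std::vector<int>& row) {
    int total = 0;
    for (std::size_t j = 0; j < row.size(); ++j)
        total += static_cast<int>(j + 1) * row[j];
    return total;
}

// Apply fun to each row, holding only one row in memory at a time.
template <typename Fun>
std::vector<int> apply_by_row(const std::vector<Column>& columns, Fun fun) {
    const std::size_t ncol = columns.size();
    const std::size_t nrow = ncol == 0 ? 0 : columns[0].size();
    for (std::size_t j = 0; j < ncol; ++j)
        assert(columns[j].size() == nrow);  // all columns must be the same length

    std::vector<int> row(ncol);   // the single reusable row buffer
    std::vector<int> out(nrow);
    for (std::size_t i = 0; i < nrow; ++i) {
        for (std::size_t j = 0; j < ncol; ++j)
            row[j] = columns[j][i];
        out[i] = fun(row);
    }
    return out;
}
```

With four columns {1, 10}, {2, 20}, {3, 30}, {4, 40}, calling apply_by_row(columns, weighted_sum) gives one result per row, using the same weights as fun above.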
As for variadic templates: yes, they can definitely help. I use them
extensively in Rcpp11, and they allowed me to reduce the code size
dramatically (Rcpp11 is about 40% of the size of Rcpp).
See for example :
https://github.com/romainfrancois/Rcpp11/blob/master/inst/include/Rcpp/sugar/functions/replicate.h
This is used to implement this feature:
double fun( double x, double y, int z ){
return x + y + z ;
}
NumericVector x = replicate( 10, call( fun, 1.0, 2.0, 3 ) ) ;
Another example is this 75-line file:
https://github.com/romainfrancois/Rcpp11/blob/master/inst/include/Rcpp/module/FunctionInvoker.h
which replaces a file that weighs in at about 14666 lines in Rcpp.
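
As a rough illustration of why variadic templates remove the need to enumerate one overload per number of columns, here is a minimal C++11 sketch. It is not taken from Rcpp11: map_rows and add3 are hypothetical names, and std::vector columns stand in for data frame columns. One template handles any number of columns, each with its own type, and passes the i-th element of every column straight to the function without building tuples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Call fun(first[i], rest[i]...) for every row i and collect the results.
// Works for any number of columns, each with its own element type.
template <typename Fun, typename T, typename... Ts>
auto map_rows(Fun fun, const std::vector<T>& first, const std::vector<Ts>&... rest)
    -> std::vector<decltype(fun(first[0], rest[0]...))>
{
    const std::size_t n = first.size();
    // Expand the pack into an array so every column length can be checked.
    const std::size_t sizes[] = { first.size(), rest.size()... };
    for (std::size_t s : sizes)
        assert(s == n);  // all columns must be the same length

    std::vector<decltype(fun(first[0], rest[0]...))> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(fun(first[i], rest[i]...));
    return out;
}

// Hypothetical row function with mixed argument types.
double add3(double x, double y, int z) {
    return x + y + z;
}
```

For example, map_rows(add3, xs, ys, zs) with two double columns and one int column returns a std::vector<double> holding one sum per row; adding a fifth or sixth column only requires a function that accepts that many arguments.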
Romain
On 27/09/13 16:12, Mark Clements wrote:
This can be done more generally.
Following an earlier suggestion from Romain, we can use boost::tuple from the
BH package - for a row of fixed size with general types. Then we can use a
template to read in the data-frame and work with the set of rows.
Variadic templates would be nice here, rather than needing to enumerate for
tuples of different lengths.
Out of interest, is this poor style for Rcpp?
Sincerely, Mark.
require(inline)
testReadDf <-
rcpp(signature(df="data.frame"),
includes="
#include <boost/tuple/tuple.hpp>
#include <vector>
#include <algorithm>
// general function to read a data-frame
template <class T1, class T2, class T3, class T4>
std::vector<boost::tuple<T1,T2,T3,T4> > read_df( DataFrame df ){
  typedef boost::tuple<T1,T2,T3,T4> Row;
  int n = df.nrows() ;
  std::vector<Row> rows(n) ;
  Vector<traits::r_sexptype_traits<T1>::rtype> df0 = df[0];
  Vector<traits::r_sexptype_traits<T2>::rtype> df1 = df[1];
  Vector<traits::r_sexptype_traits<T3>::rtype> df2 = df[2];
  Vector<traits::r_sexptype_traits<T4>::rtype> df3 = df[3];
  for( int i=0; i<n; i++)
    rows[i] = Row(df0[i],df1[i],df2[i],df3[i]);
  return rows ;
}
// example function
typedef boost::tuple<int,int,int,int> MyRow;
int fun(MyRow row) {
  return boost::get<0>(row) + 2*boost::get<1>(row)
       + 3*boost::get<2>(row) + 4*boost::get<3>(row);
}
",
body="
// read in the data-frame as a vector of rows
std::vector<MyRow> v = read_df<int,int,int,int>(df);
int n = v.size();
std::vector<int> out(n);
std::transform(v.begin(),v.end(),out.begin(),fun);
return wrap(out);
")
testReadDf(data.frame(1,2,3,4))
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30