Moving the call outside the main loop would be effective for some
scenarios (i.e., those where the data objects do not contain
NaNs). However, once an object does contain NaNs, we still want to
compute a distance based on its non-NaN values and "correct" for the
NaNs in some way, so skipping the entire object is not really an
option. Including a switch between the cases of objects with and
objects without NaNs is probably worthwhile (as is using more Rcpp
sugar).
Nevertheless, the question remains why the Rcpp isNaN call is so
much slower.
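To make the switch idea concrete, here is a rough sketch in plain C++ (std::isnan stands in for the R macros so it compiles without R). The names hasNaN, distClean, distWithNaN and switchedDistances are made up for illustration, and the "correction" (rescaling by numVars over the number of non-NaN values) is only an assumption about what such a correction could look like, not what the package does.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: true if any of the numVars values of one object is NaN.
bool hasNaN(const double* obj, std::size_t numVars) {
    for (std::size_t k = 0; k < numVars; ++k)
        if (std::isnan(obj[k])) return true;
    return false;
}

// Fast path: squared Euclidean distance with no NaN checks at all.
double distClean(const double* obj, const double* code, std::size_t numVars) {
    double dist = 0.0;
    for (std::size_t k = 0; k < numVars; ++k) {
        const double tmp = obj[k] - code[k];
        dist += tmp * tmp;
    }
    return dist;
}

// NaN-aware path: skip NaN entries of the object, then rescale.
// ASSUMPTION: the rescaling by numVars / used is illustrative only.
double distWithNaN(const double* obj, const double* code, std::size_t numVars) {
    double dist = 0.0;
    std::size_t used = 0;
    for (std::size_t k = 0; k < numVars; ++k) {
        if (!std::isnan(obj[k])) {
            const double tmp = obj[k] - code[k];
            dist += tmp * tmp;
            ++used;
        }
    }
    if (used == 0) return std::numeric_limits<double>::quiet_NaN();
    return dist * static_cast<double>(numVars) / static_cast<double>(used);
}

// Distances for all object/code pairs, using the flat layout of the
// quoted loop (data[i * numVars + k]). Assumes numVars > 0.
std::vector<double> switchedDistances(const std::vector<double>& data,
                                      const std::vector<double>& codes,
                                      std::size_t numVars) {
    const std::size_t numObjects = data.size() / numVars;
    const std::size_t numCodes = codes.size() / numVars;
    std::vector<double> out(numObjects * numCodes);
    for (std::size_t i = 0; i < numObjects; ++i) {
        const double* obj = data.data() + i * numVars;
        const bool dirty = hasNaN(obj, numVars);  // one scan per object
        for (std::size_t j = 0; j < numCodes; ++j) {
            const double* code = codes.data() + j * numVars;
            out[i * numCodes + j] = dirty ? distWithNaN(obj, code, numVars)
                                          : distClean(obj, code, numVars);
        }
    }
    return out;
}
```

With this layout the NaN scan costs one pass per object instead of one check per (object, code, variable) triple, and objects without NaNs never hit the check in the inner loop.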
On 12/13/2016 2:04 PM, xian at unm.edu (Christian Gunning) wrote:
| for (i = 0; i < numObjects; i++) {
|   for (j = 0; j < numCodes; j++) {
|     dist = 0;
|     for (k = 0; k < numVars; k++) {
|       if (!ISNAN(data[i * numVars + k])) {
|         tmp = data[i * numVars + k] - codes[j * numVars + k];
Why not drop data and codes and use sData1(i,k) - sData2(j,k)?
Or better yet, just use the original code with NumericMatrix:
sData1[i * numVars + k] does the right thing.
I don't get any timing difference based on this change.
Using Rcpp sugar
(https://cran.r-project.org/package=Rcpp/vignettes/Rcpp-sugar.pdf),
and moving the call outside the loop, appears to do the right thing.
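For readers without the repository at hand, the general idea of hoisting the NaN test out of the inner loops can be sketched roughly as follows. This is not the code from the linked repository: it is plain C++ with std::isnan in place of Rcpp sugar, and the function name hoistedDistances is made up for illustration. The NaN test runs once per object, and the resulting list of usable indices is reused for every code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: same flat layout as the quoted loop
// (data[i * numVars + k]); NaN entries of an object are skipped,
// with no further correction. Assumes numVars > 0.
std::vector<double> hoistedDistances(const std::vector<double>& data,
                                     const std::vector<double>& codes,
                                     std::size_t numVars) {
    const std::size_t numObjects = data.size() / numVars;
    const std::size_t numCodes = codes.size() / numVars;
    std::vector<double> out(numObjects * numCodes);

    std::vector<std::size_t> valid;  // indices of non-NaN variables
    valid.reserve(numVars);

    for (std::size_t i = 0; i < numObjects; ++i) {
        // NaN test done once per object, outside the loop over codes.
        valid.clear();
        for (std::size_t k = 0; k < numVars; ++k)
            if (!std::isnan(data[i * numVars + k])) valid.push_back(k);

        for (std::size_t j = 0; j < numCodes; ++j) {
            double dist = 0.0;
            for (std::size_t k : valid) {
                const double tmp = data[i * numVars + k] - codes[j * numVars + k];
                dist += tmp * tmp;
            }
            out[i * numCodes + j] = dist;
        }
    }
    return out;
}
```

The result is the same as with the check inside the innermost loop; only the number of NaN tests drops from numObjects * numCodes * numVars to numObjects * numVars.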
## modified example
## see edits here:
https://github.com/helmingstay/rcpp-timings/blob/master/diff/rcppdist.cpp#L24
git clone https://github.com/helmingstay/rcpp-timings
cd rcpp-timings/diff
R --vanilla < glue.R
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel