Hi there, I am new to RcppArmadillo and very excited about the sparse matrix functionality it offers. I have a large optim that I wish to break down into stochastic descent in Rcpp to speed things up.
To start small, I used a simple logisitic regression function that I have working in R - Norm <- function(w1, w2){ sqrt(sum((w2 - w1)^2)) } xentropy <- function(x, y, w) { - ((y * x) / (1 + exp(y * t(w) %*% x)) ) } stochDesc <- function(param, eta = 0.01, maxit = 10, reltol = .01, x, y) { # Initialize w = param iter = 0 rel_err = reltol + 1 x_dash = cbind(rep(1, nrow(x)), x) # Add bias while(iter < maxit & rel_err > reltol ) { iter = iter + 1 prev_w <- w for(curidx in sample(nrow(x))) { # Random shuffle grad <- xentropy(x_dash[curidx, ], y[curidx], w) w <- w - eta * grad # Update weights } rel_err = Norm(w, prev_w) # could also use: # rel_err = base::norm(matrix(prev_w - w, nrow = 1), type = "F") cat(paste("Epoch ", iter, " Relative cost:", rel_err, "\n")) } list(w = w, iter = iter) } This does every operation that my more complex gradient descent will do, but it is pretty straightforward so I started converting it to Rcpp - #include <RcppArmadillo.h> #include <iostream> using namespace std; using namespace Rcpp ; // [[Rcpp::depends(RcppArmadillo)]] double Norm(NumericVector w1, NumericVector w2) { // Compute the Forbius Norm double op = pow(sum(pow(w2 - w1, 2)), .5); return(op); } NumericVector xentropy(NumericVector x, NumericVector y, NumericVector w) { // Return the cross entropy // NumericVector op = - ((y * x) / (1 + exp(y * trans(w) %*% x)) ); // How to transpose? NumericVector op = w; // Not correct return (op); } // [[Rcpp::export]] NumericVector stochDescCpp(NumericVector param, NumericMatrix x, NumericVector y, double eta = 0.01, int maxit = 10, double reltol = .01) { // Run stochastic Descent to compute logistic regression // Initialize NumericVector w (param); NumericVector prev_w (w); int epoch = 0; NumericVector grad (param.length()); double rel_err = reltol + 1; // Don't know how to add bias, will do that in R before passing it here. //NumericMatrix x_dash(x.nrow(), x.ncol() + 1, rep(1, x.nrow()), x.begin()); // Loop through for each epoch while(epoch < maxit && rel_err > reltol) { epoch++; prev_w = w; for(int curidx = 0; curidx < x.nrow(); ++curidx) { // How to random shuffle? grad = xentropy(x(curidx, _), y[curidx], w); w = w - eta * grad; // Update weights } rel_err = Norm(w, prev_w); cout << "Epoch " << epoch << " Relative cost:" << rel_err << endl; } return(w); } I am running into many issues. Here they are ranked in order of importance) - 1) When I run this (incorrect) program - stochDescCpp(param = rep(0,3), x = X, y = Y, maxit = 100) I get - Error in .Primitive(".Call")(<pointer: 0x0000000069d42560>, param, x, : negative length vectors are not allowed All the inputs are valid and work fine with the R version. So I don't know where this is coming from? I put a bunch of couts and I know that it is happening inside the for loop but don't know why. In general, what is the best way to debug code written this way? I am using RStudio. 2) I don't know how to transpose the matrix in rCpp. I read that I could that using rCppArmadillo, but then I don't know how to convert from NumericMatrix to arma matrix. I don't want to copy as this operation will happen 1000x of times, so this has to be fast. 3) I don't know how to randomly shuffle the rows. I tried RcppArmadillo::sample but haven't gotten the output to work. E.g. do I access elements as x(rowOrd[curIdx],_)? 4) (minor issue) How do I add the bias unit? In other words, add a new column to the input matrix filled with all 1s. Do I have to create a new matrix, copy it column by column? this is not a dealbreaker since I can do this in R before I send in the inputs. Sorry if my questions are too basic and don't seem well researched. I am trying to go for a practical example. Any help will be greatly appreciated. Sincerely, Saurabh
_______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel