Hi,

Sorry to get back to this, but I can't figure out how to solve it. Do not hesitate to direct me to another forum if this one is not appropriate. I don't think there's anything wrong with Rcpp, but I might not proceed with the loop in Rcpp the right way.

So I tried various regressions in the Rcpp for loop (fastglm, Rfast::glm_poisson, even fastlm; see code at end of email) and they all output the same saved matrix, but a different one than without including a regression step. Compared to a pure R code, the R code with or without a regression output the same matrix as the Rcpp loop without a regression (and therefore different from the ones with a regression), i.e:

- R loop without regression, R loop with regression (glm or fastglm), Rcpp loop without regression = all.equal TRUE.

- Rcpp loop with regression (fastglm, Rfast, or fastlm) = all.equal TRUE but different from above.

I'm totally puzzled on why I've got these results. How can I get the same output with Rcpp + regression as with the R loop +/- regression?

Denis

NumericMatrix define_e(int iter, NumericMatrix obs_mat, NumericVector draws) {
  Rcpp::NumericVector d = obs_mat(_, 1);
  Rcpp::NumericVector pr = no_init(d.length());
  Rcpp::NumericMatrix e(d.length(), iter);
  for (int i = 0; i < iter; i++) {
    pr = obs_mat(_, 0) * draws(i);
    e(_, i) = cpprbinom(d.length(), 1, pr);
  }
  return e;
}

NumericMatrix define_ereg1(int iter, NumericMatrix obs_mat, NumericVector draws) {
  Environment pkg = Environment::namespace_env("fastglm");
  Function f = pkg["fastglm"];

  Rcpp::NumericVector d = obs_mat(_, 1);
  Rcpp::NumericVector pr = no_init(d.length());
  Rcpp::NumericMatrix e(d.length(), iter);
  Rcpp::NumericMatrix ematrix(d.length(), 2);
  Rcpp::NumericVector v(d.length(), 1);
  ematrix(_, 0) = v;
  Rcpp::List mod_pois;
  for (int i = 0; i < iter; i++) {
    pr = obs_mat(_, 0) * draws(i);
    e(_, i) = cpprbinom(d.length(), 1, pr);

    ematrix(_, 1) = e;
mod_pois = f(Named("x", ematrix), Named("y", d), Named("family", "poisson"));
  }
  return e;
}

NumericMatrix define_ereg2(int iter, NumericMatrix obs_mat, NumericVector draws) {
  Environment pkg = Environment::namespace_env("Rfast");
  Function f = pkg["glm_poisson"];

  Rcpp::NumericVector d = obs_mat(_, 1);
  Rcpp::NumericVector pr = no_init(d.length());
  Rcpp::NumericMatrix e(d.length(), iter);
  Rcpp::NumericMatrix ematrix(d.length(), 2);
  Rcpp::NumericVector v(d.length(), 1);
  ematrix(_, 0) = v;
  Rcpp::List mod_pois;
  for (int i = 0; i < iter; i++) {
    pr = obs_mat(_, 0) * draws(i);
    e(_, i) = cpprbinom(d.length(), 1, pr);

    ematrix(_, 1) = e;
    mod_pois = f(Named("x", ematrix(_, 1)), Named("y", d));
  }
  return e;
}

NumericMatrix define_ereg3(int iter, NumericMatrix obs_mat, NumericVector draws) {
  Environment pkg = Environment::namespace_env("RcppArmadillo");
  Function f = pkg["fastLm"];

  Rcpp::NumericVector d = obs_mat(_, 1);
  Rcpp::NumericVector pr = no_init(d.length());
  Rcpp::NumericMatrix e(d.length(), iter);
  Rcpp::NumericMatrix ematrix(d.length(), 2);
  Rcpp::NumericVector v(d.length(), 1);
  ematrix(_, 0) = v;
  Rcpp::List mod_pois;
  for (int i = 0; i < iter; i++) {
    pr = obs_mat(_, 0) * draws(i);
    e(_, i) = cpprbinom(d.length(), 1, pr);

    ematrix(_, 1) = e;
    mod_pois = f(Named("X", ematrix), Named("y", d));
  }
  return e;
}

draws <- rbeta(200, 100, 60)
obs_mat <- cbind(c(rep(1, 40), rep(0, 60)), c(rep(1, 20), rep(0, 80)))
iter <- length(draws)
set.seed(1234)
t0 <- define_e(iter, obs_mat, draws)

e <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)
e_obs <- obs_mat[, 1]
p <- rep(NA, nrow(obs_mat))
set.seed(1234)
for (i in 1:iter) {
      p <- e_obs * draws[i]
      e[, i] <- rbinom(nrow(obs_mat), 1, p)
}
all.equal(t0, e)  # TRUE

e1 <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)
e_obs <- obs_mat[, 1]
p <- rep(NA, nrow(obs_mat))
set.seed(1234)
for (i in 1:iter) {
     p <- e_obs * draws[i]
     e1[, i] <- rbinom(nrow(obs_mat), 1, p)
     X <- cbind(rep(1, nrow(obs_mat)), e1[, i])
     d <- obs_mat[, 2]
     mod_pois <- glm(d ~ X, family = poisson(link = "log"))
}
all.equal(e, e1)  # TRUE
all.equal(e1, t0)  # TRUE

e2 <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)
e_obs <- obs_mat[, 1]
p <- rep(NA, nrow(obs_mat))
set.seed(1234)
for (i in 1:iter) {
     p <- e_obs * draws[i]
     e2[, i] <- rbinom(nrow(obs_mat), 1, p)
     X <- cbind(rep(1, nrow(obs_mat)), e2[, i])
     d <- obs_mat[, 2]
     mod_pois <- fastglm::fastglm(X, d, family = poisson(link = "log"))
}
all.equal(e, e2)  # TRUE
all.equal(e2, t0)  # TRUE

set.seed(1234)
t1 <- define_ereg1(iter, obs_mat, draws)
all.equal(t1, e)  # [1] "Mean relative difference: 1.788299"

set.seed(1234)
t2 <- define_ereg2(iter, obs_mat, draws)
all.equal(t2, e)  # [1] "Mean relative difference: 1.788299"
all.equal(t2, t1)  # TRUE

set.seed(1234)
t3 <- define_ereg3(iter, obs_mat, draws)
all.equal(t3, e)  # "Mean relative difference: 1.788299"
all.equal(t3, t1)  # TRUE
all.equal(t3, t2)  # TRUE

Le 2024-10-01 13:19, Dirk Eddelbuettel a écrit :

On 1 October 2024 at 11:57, Denis Haine wrote:
| But adding the regression to define_e1() (as define_e2() in previous email) | does not give the same. The first iteration in the loop is identical in both, | but not the subsequent ones. That's why I was thinking it could be linked to | the seed, but (sorry for my limited knowledge of C++) it would be strange to | manually "re-seed" at each iteration. Moreover, I believe the seed set in R is | transferred to Rcpp and so does not need to be taken specifically care of.

Nice. So it seems to me that you identified fastglm as the source of your problem. When we call it, your results differ and there is a side-effect.

But when we don't it is still true. I added

Environment pkg2 = Environment::namespace_env("base");
Function f2 = pkg2["dim"];

// ...

RObject ro = f2(ematrix);

while keeping the lines relevant to fastglm calls and result storage
commented out.  Now we get TRUE from all.equal(t0,t1).

That seems to rule out Rcpp::Function, and may indicate you need to look into
what fastglm may be doing.

Cheers, Dirk
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Reply via email to