And I now have a MRE: #include <Rcpp.h> using namespace Rcpp;
// [[Rcpp::export]] IntegerVector ouch(IntegerVector x, IntegerVector y) { IntegerVector result; result = x[y - 1]; return result; } /*** R x <- 1:1E6 y <- 1:1E6 gctorture(TRUE) x <- ouch(x, y) gctorture(FALSE) */ This fails pretty reliably for me. Perhaps we can add this to our unit tests. On Sat, Jan 30, 2016 at 1:44 PM, Kevin Ushey <kevinus...@gmail.com> wrote: > I think I know what's going on now. Effectively, we're 'evaluating' > the sugar proxy vector, getting a pointer to a (temporary, > unprotected) R data structure, and then storing that pointer. Later, > when we attempt to allocate the output vector (with `no_init`) the GC > runs and cleans up our pointer. > > See: > > > https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L147 > > https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L197 > > The 'best' fix here is to only re-use the indices for an actual > IntegerVector, and generate the indices from sugar proxies, but I'm > not exactly sure how to best handle that. It's also possible that if > we just allocated the output vector earlier (ie, before getting a > pointer to our subset proxy), that we'd avoid the R gc cleaning up our > vector. > > FWIW, we should be able to reproduce this reliably when running under > `gctorture()`... > > Kevin > > On Sat, Jan 30, 2016 at 7:07 AM, Qiang Kou <q...@umail.iu.edu> wrote: >> Another thing confused me is the segfault only happens when the >> IntegerVector is significantly long. >> >> Best, >> >> KK >> >> On Sat, Jan 30, 2016 at 2:57 AM, Kevin Ushey <kevinus...@gmail.com> wrote: >>> >>> Hmm, I have one thought: we try to re-use the indices from an >>> IntegerVector, but the type here is actually a sugar type: >>> >>> Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true, >>> Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13, >>> Rcpp::PreserveStorage> > >::get_vec (this=<optimized out>) >>> >>> Ie, we access the indices with `INTEGER(x)`, but perhaps those indices >>> have not actually been properly materialized when we attempt to >>> perform the subset? >>> >>> But then, a simple test case with `x[y - 1]` with `x` and `y` both >>> being integer vectors seems to work just fine. So I am a bit confused. >>> >>> On Fri, Jan 29, 2016 at 3:16 PM, Dirk Eddelbuettel <e...@debian.org> wrote: >>> > >>> > On 29 January 2016 at 17:55, Qiang Kou wrote: >>> > | Hi, Paul, can you try my fork of Rcpp? You can install it by the line >>> > below: >>> > | >>> > | devtools::install_github("thirdwing/Rcpp", ref = "subsetter") >>> > | >>> > | This fixed the segfault on my Ubuntu machine. >>> > >>> > Yay. Nice work! >>> > >>> > | The difference can be found from [1]. >>> > >>> > Nice and concise. >>> > >>> > | In subsetter, if an IntegerVector passed in, we will try to reuse it. >>> > This led >>> > | to a segfault in this case, which I don't know why. >>> > | >>> > | Dirk and Kevin, do you have any thoughts on it? >>> > >>> > Not really, but happy to give this the full reverse-dependency check >>> > treatment so that we can merge it. >>> > >>> > Dirk >>> > >>> > >>> > | Best wishes, >>> > | >>> > | KK >>> > | >>> > | [1] https://github.com/thirdwing/Rcpp/commit/ >>> > | 216c5220bcb84778a656b3496d0f1803b973ef61 >>> > | >>> > | >>> > | On Fri, Jan 29, 2016 at 3:00 PM, Qiang Kou <q...@umail.iu.edu> wrote: >>> > | >>> > | >>> > | Hi, Kevin, I was also trying to track this down yesterday. >>> > | >>> > | From the debugging info below, indices_n is not equal to length of >>> > indices, >>> > | which I don't quite understand. >>> > | >>> > | Program received signal SIGSEGV, Segmentation fault. >>> > | >>> > | 0x00007ffff2ed5c4e in Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, >>> > 13, >>> > | true, Rcpp::sugar::Minus_Vector_Primitive<13, true, >>> > Rcpp::Vector<13, >>> > | Rcpp::PreserveStorage> > >::get_vec >>> > (this=this@entry=0x7fffffff79a0) >>> > | >>> > | at /usr/local/lib/R/site-library/Rcpp/include/Rcpp/vector/ >>> > | Subsetter.h:200 >>> > | >>> > | 199 output[i] = lhs[ indices[i] ]; >>> > | >>> > | (gdb) p i >>> > | >>> > | $1 = 33622 >>> > | >>> > | (gdb) p indices[i] >>> > | >>> > | Cannot access memory at address 0x34c6e000 >>> > | >>> > | (gdb) p indices_n >>> > | >>> > | $2 = 9594546 >>> > | >>> > | >>> > | On Fri, Jan 29, 2016 at 2:29 PM, Dirk Eddelbuettel >>> > <e...@debian.org> wrote: >>> > | >>> > | >>> > | On 29 January 2016 at 11:27, Kevin Ushey wrote: >>> > | | When I add some debug printing to the associated >>> > subscripting line >>> > | | (https://github.com/awalker89/openxlsx/blob/ >>> > | >>> > b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608), >>> > | | I see: >>> > | | >>> > | | colNumbers.size(): 98,03,150 >>> > | | charCols.size(): 95,94,546 >>> > | | >>> > | | It looks to me like the package is erroneously attempting to >>> > subset >>> > | | vectors of different sizes, causing out-of-bounds reads. >>> > | >>> > | Nice work. >>> > | >>> > | | Unfortunately, Rcpp is not detecting or warning about >>> > this... >>> > | | >>> > | | Either way, I believe this is a bug in the openxlsx package, >>> > but Rcpp >>> > | | should be checking / reporting this. >>> > | >>> > | With (Rcpp)Armadillo you do have an option of turning this >>> > on/off. With >>> > | Rcpp >>> > | alone not quite. >>> > | >>> > | Dirk >>> > | >>> > | -- >>> > | http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org >>> > | _______________________________________________ >>> > | Rcpp-devel mailing list >>> > | Rcpp-devel@lists.r-forge.r-project.org >>> > | >>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel >>> > | >>> > | >>> > | >>> > | >>> > | -- >>> > | Qiang Kou >>> > | q...@umail.iu.edu >>> > | School of Informatics and Computing, Indiana University >>> > | >>> > | >>> > | >>> > | >>> > | >>> > | -- >>> > | Qiang Kou >>> > | q...@umail.iu.edu >>> > | School of Informatics and Computing, Indiana University >>> > | >>> > >>> > -- >>> > http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org >> >> >> >> >> -- >> Qiang Kou >> q...@umail.iu.edu >> School of Informatics and Computing, Indiana University >> _______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel