Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
On 28 January 2016 at 21:47, Paul Johnson wrote:
| Thanks.
|
| On Thu, Jan 28, 2016 at 2:42 PM, Dirk Eddelbuettel wrote:
| >
| > Paul,
| >
| > I can reproduce the segfault on Ubuntu 15.10, "everything current".
| > Definitely a valid bug report, though a 30mb xlsx may not qualify as
| > minimal.
| >
| I'm glad I did not attach it to an email, then :)
|
| The data xlsx provided by the client is about 2 times as big, I had a
| GRA whittle it down for your entertainment.
|
| If we whittle the xlsx file down to a few lines, it does not seg fault, apparently.
|
| I'm going crazy trying to downgrade R in Ubuntu to see where that leads.
| I was using 3.2.2 until a couple of days ago and I never saw a hint of
| trouble from Rcpp or openxlsx.

Sorry that is so frustrating, I know. Maybe using clang++ can help as it did
for KK on OS X. (Though clang++ remains frustrating on Ubuntu as you have to
fiddle with -I... switches.)

Dirk

| > I won't have time to look at this for a while though so if you find that
| > downgrading helps that may be your best bet.
| >
| > Thanks for the report. I am so used to simple segfaults from ABI mixings
| > (g++-5.* will do that for you...) that I called this wrongly at first.
| >
| > Dirk
| >
| > /tmp/pj/openxlsx_failure$ Rscript Reproducible_openxlsx_failure.R
| > R version 3.2.3 (2015-12-10)
| > Platform: x86_64-pc-linux-gnu (64-bit)
| > Running under: Ubuntu 15.10
| >
| > locale:
| >  [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
| >  LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
| >  LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
| >  [8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
| >  LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
| >
| > attached base packages:
| > [1] stats graphics grDevices utils datasets base
| >
| > other attached packages:
| > [1] openxlsx_3.0.0
| >
| > loaded via a namespace (and not attached):
| > [1] Rcpp_0.12.3.1 methods_3.2.3
| >
| > *** caught segfault ***
| > address 0x7fd5b83d8038, cause 'memory not mapped'
| >
| > Traceback:
| > 1: .Call("openxlsx_readWorkbook", v, r, string_refs, isDate, nRows,
| > colNames, skipEmptyRows, origin, clean_names, PACKAGE = "openxlsx")
| > 2: read.xlsx.default("Failure_to_Import.xlsx", colNames = TRUE)
| > 3: read.xlsx("Failure_to_Import.xlsx", colNames = TRUE)
| > aborting ...
| >
| > --
| > http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
|
| --
| Paul E. Johnson
| Professor, Political Science      Director
| 1541 Lilac Lane, Room 504         Center for Research Methods
| University of Kansas              University of Kansas
| http://pj.freefaculty.org         http://crmda.ku.edu

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
When I add some debug printing to the associated subscripting line
(https://github.com/awalker89/openxlsx/blob/b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
I see:

    colNumbers.size(): 98,03,150
    charCols.size(): 95,94,546

It looks to me like the package is erroneously attempting to subset
vectors of different sizes, causing out-of-bounds reads.

Unfortunately, Rcpp is not detecting or warning about this...

Either way, I believe this is a bug in the openxlsx package, but Rcpp
should be checking / reporting this.

Kevin
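A minimal sketch of the kind of check being asked for here, in plain C++ with no Rcpp dependency. The helper name `checked_subset` is hypothetical and is not part of Rcpp or openxlsx. It only illustrates validating every index against the size of the vector being read before the read happens, so that a bad index surfaces as an exception rather than as an out-of-bounds read.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not Rcpp's or openxlsx's code): returns
// values[idx[k]] for every k, but refuses to read outside `values`.
std::vector<int> checked_subset(const std::vector<int>& values,
                                const std::vector<int>& idx) {
    std::vector<int> out;
    out.reserve(idx.size());
    for (std::size_t k = 0; k < idx.size(); ++k) {
        const int i = idx[k];
        // Reject negative indices and indices at or past the end.
        if (i < 0 || static_cast<std::size_t>(i) >= values.size()) {
            throw std::out_of_range(
                "index " + std::to_string(i) + " at position " +
                std::to_string(k) + " is outside [0, " +
                std::to_string(values.size()) + ")");
        }
        out.push_back(values[static_cast<std::size_t>(i)]);
    }
    // Postcondition: one output element per index.
    assert(out.size() == idx.size());
    return out;
}
```

The cost is one comparison per element, which is presumably why unchecked access is the default in performance-oriented code; the trade-off is that a mistake in the caller shows up as a crash far from its cause.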
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
On 29 January 2016 at 11:27, Kevin Ushey wrote:
| When I add some debug printing to the associated subscripting line
| (https://github.com/awalker89/openxlsx/blob/b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
| I see:
|
|     colNumbers.size(): 98,03,150
|     charCols.size(): 95,94,546
|
| It looks to me like the package is erroneously attempting to subset
| vectors of different sizes, causing out-of-bounds reads.

Nice work.

| Unfortunately, Rcpp is not detecting or warning about this...
|
| Either way, I believe this is a bug in the openxlsx package, but Rcpp
| should be checking / reporting this.

With (Rcpp)Armadillo you do have an option of turning this on/off. With Rcpp
alone not quite.

Dirk
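To make the "on/off" idea concrete, here is a small sketch of a compile-time switch for bounds checking in plain C++. The class `CheckedVec` and the macro `SKETCH_NO_BOUNDS_CHECK` are made up for this illustration; the pattern is only similar in spirit to Armadillo's `ARMA_NO_DEBUG` macro, which, as far as I know, disables its run-time checks including bounds checks. Nothing here is Rcpp or Armadillo code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical wrapper. Compile with -DSKETCH_NO_BOUNDS_CHECK
// (an invented macro name) to drop the check from operator[].
template <typename T>
class CheckedVec {
public:
    explicit CheckedVec(std::vector<T> data) : data_(std::move(data)) {}

    const T& operator[](std::size_t i) const {
#ifndef SKETCH_NO_BOUNDS_CHECK
        // Checked build: fail loudly instead of reading past the end.
        if (i >= data_.size()) {
            throw std::out_of_range("CheckedVec: index out of range");
        }
#endif
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

The appeal of this design is that the check costs nothing once it is compiled out, so it can stay enabled during development and testing and be disabled only for a tuned build.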
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
Hi, Kevin,

I was also trying to track this down yesterday.

From the debugging info below, *indices_n* is not equal to length of
*indices*, which I don't quite understand.

Program received signal SIGSEGV, Segmentation fault.
0x72ed5c4e in Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true, Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13, Rcpp::PreserveStorage> > >::get_vec (this=this@entry=0x7fff79a0)
    at /usr/local/lib/R/site-library/Rcpp/include/Rcpp/vector/Subsetter.h:200
199         output[i] = lhs[ indices[i] ];
(gdb) p i
$1 = 33622
(gdb) p indices[i]
Cannot access memory at address 0x34c6e000
(gdb) p indices_n
$2 = 9594546

--
Qiang Kou
q...@umail.iu.edu
School of Informatics and Computing, Indiana University
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
Hi, Paul,

Can you try my fork of Rcpp? You can install it with the line below:

devtools::install_github("thirdwing/Rcpp", ref = "subsetter")

This fixed the segfault on my Ubuntu machine.

The difference can be found in [1].

In *subsetter*, if an IntegerVector is passed in, we try to reuse it. This led
to a segfault in this case, and I don't know why.

Dirk and Kevin, do you have any thoughts on it?

Best wishes,

KK

[1] https://github.com/thirdwing/Rcpp/commit/216c5220bcb84778a656b3496d0f1803b973ef61
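The root cause is left open at this point in the thread, and the linked commit is not reproduced here. Purely as an illustration of one defensive pattern, the sketch below shows a subset proxy that copies the indices it is given into storage it owns, instead of reusing the caller's storage. The class name `OwningSubset` is hypothetical, it uses `std::vector` rather than Rcpp types, and it should not be read as a description of what the commit actually changes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical proxy, not Rcpp's SubsetProxy. It copies the indices,
// so it never depends on the lifetime of the caller's index vector,
// which may be a temporary.
class OwningSubset {
public:
    OwningSubset(const std::vector<double>& lhs, const std::vector<int>& idx)
        : lhs_(lhs), indices_(idx) {}

    // Builds the subset; at() throws std::out_of_range on a bad index.
    std::vector<double> get() const {
        std::vector<double> out(indices_.size());
        for (std::size_t i = 0; i < indices_.size(); ++i) {
            out[i] = lhs_.at(static_cast<std::size_t>(indices_[i]));
        }
        // Postcondition: one output element per stored index.
        assert(out.size() == indices_.size());
        return out;
    }

private:
    const std::vector<double>& lhs_;  // borrowed: caller keeps lhs alive
    std::vector<int> indices_;        // owned copy of the indices
};
```

Copying costs memory and time proportional to the number of indices, so reusing the caller's vector is a reasonable optimization when its lifetime and layout are guaranteed; the copy is the conservative choice when they are not.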
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
On 29 January 2016 at 17:55, Qiang Kou wrote:
| Hi, Paul, can you try my fork of Rcpp? You can install it by the line below:
|
| devtools::install_github("thirdwing/Rcpp", ref = "subsetter")
|
| This fixed the segfault on my Ubuntu machine.

Yay. Nice work!

| The difference can be found from [1].

Nice and concise.

| In subsetter, if an IntegerVector passed in, we will try to reuse it. This led
| to a segfault in this case, which I don't know why.
|
| Dirk and Kevin, do you have any thoughts on it?

Not really, but happy to give this the full reverse-dependency check
treatment so that we can merge it.

Dirk

| Best wishes,
|
| KK
|
| [1] https://github.com/thirdwing/Rcpp/commit/216c5220bcb84778a656b3496d0f1803b973ef61
Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
Hmm, I have one thought: we try to re-use the indices from an
IntegerVector, but the type here is actually a sugar type:

    Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true,
    Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13,
    Rcpp::PreserveStorage> > >::get_vec (this=)

I.e., we access the indices with `INTEGER(x)`, but perhaps those indices
have not actually been properly materialized when we attempt to
perform the subset?

But then, a simple test case with `x[y - 1]`, with `x` and `y` both
being integer vectors, seems to work just fine. So I am a bit confused.