Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Dirk Eddelbuettel

On 28 January 2016 at 21:47, Paul Johnson wrote:
| Thanks.
| 
| On Thu, Jan 28, 2016 at 2:42 PM, Dirk Eddelbuettel  wrote:
| >
| > Paul,
| >
| > I can reproduce the segfault on Ubuntu 15.10, "everything current".
| > Definitely a valid bug report, though a 30mb xlsx may not qualify as
| > minimal.
| >
| I'm glad I did not attach it to an email, then :)
| 
| The xlsx provided by the client is about twice as big; I had a
| GRA whittle it down for your entertainment.
| 
| If we whittle the xlsx file down to a few lines, it apparently does
| not segfault.
| 
| I'm going crazy trying to downgrade R on Ubuntu to see where that leads.
| I was using 3.2.2 until a couple of days ago and I never saw a hint of
| trouble from Rcpp or openxlsx.

Sorry that is so frustrating, I know.  Maybe using clang++ can help as it did
for KK on OS X.  (Though clang++ remains frustrating on Ubuntu as you have to
fiddle with -I... switches.)

Dirk
| 
| > I won't have time to look at this for a while though so if you find that
| > downgrading helps that may be your best bet.
| >
| > Thanks for the report.  I am so used to simple segfaults from ABI mixings
| > (g++-5.* will do that for you...) that I called this wrongly at first.
| >
| > Dirk
| >
| > /tmp/pj/openxlsx_failure$ Rscript Reproducible_openxlsx_failure.R
| > R version 3.2.3 (2015-12-10)
| > Platform: x86_64-pc-linux-gnu (64-bit)
| > Running under: Ubuntu 15.10
| >
| > locale:
| >  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
| >      LC_TIME=en_US.UTF-8    LC_COLLATE=en_US.UTF-8
| >      LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
| >  [8] LC_NAME=C  LC_ADDRESS=C   LC_TELEPHONE=C
| >      LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
| >
| > attached base packages:
| > [1] stats graphics  grDevices utils datasets  base
| >
| > other attached packages:
| > [1] openxlsx_3.0.0
| >
| > loaded via a namespace (and not attached):
| > [1] Rcpp_0.12.3.1 methods_3.2.3
| >
| >  *** caught segfault ***
| >  address 0x7fd5b83d8038, cause 'memory not mapped'
| >
| > Traceback:
| >  1: .Call("openxlsx_readWorkbook", v, r, string_refs, isDate, nRows,
| >  colNames, skipEmptyRows, origin, clean_names, PACKAGE = "openxlsx")
| >  2: read.xlsx.default("Failure_to_Import.xlsx", colNames = TRUE)
| >  3: read.xlsx("Failure_to_Import.xlsx", colNames = TRUE)
| > aborting ...
| >
| >
| > --
| > http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
| 
| 
| 
| -- 
| Paul E. Johnson
| Professor, Political Science    Director
| 1541 Lilac Lane, Room 504  Center for Research Methods
| University of Kansas University of Kansas
| http://pj.freefaculty.org  http://crmda.ku.edu

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Kevin Ushey
When I add some debug printing to the associated subscripting line
(https://github.com/awalker89/openxlsx/blob/b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
I see:

   colNumbers.size(): 98,03,150
   charCols.size(): 95,94,546

It looks to me like the package is erroneously attempting to subset
vectors of different sizes, causing out-of-bounds reads.
Unfortunately, Rcpp is not detecting or warning about this...

Either way, I believe this is a bug in the openxlsx package, but Rcpp
should be checking / reporting this.
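
A minimal sketch of the kind of check being asked for, in plain C++ with the standard library only. This is not Rcpp code: the helper name `checked_subset` is made up for illustration, and it assumes the failure mode is an index that falls outside the vector being subset.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): returns values[indices[i]] for
// every i, and throws instead of reading past the end of `values`.
template <typename T>
std::vector<T> checked_subset(const std::vector<T>& values,
                              const std::vector<int>& indices) {
    std::vector<T> out;
    out.reserve(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const int idx = indices[i];
        // Reject negative indices and indices at or beyond values.size().
        if (idx < 0 || static_cast<std::size_t>(idx) >= values.size()) {
            throw std::out_of_range(
                "index " + std::to_string(idx) + " at position " +
                std::to_string(i) + " is outside [0, " +
                std::to_string(values.size()) + ")");
        }
        out.push_back(values[static_cast<std::size_t>(idx)]);
    }
    return out;
}
```

With a check like this, a bad index would surface as an exception naming the offending position rather than as a segfault somewhere later in the read.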

Kevin



Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Dirk Eddelbuettel

On 29 January 2016 at 11:27, Kevin Ushey wrote:
| When I add some debug printing to the associated subscripting line
| (https://github.com/awalker89/openxlsx/blob/b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
| I see:
| 
|    colNumbers.size(): 98,03,150
|    charCols.size(): 95,94,546
| 
| It looks to me like the package is erroneously attempting to subset
| vectors of different sizes, causing out-of-bounds reads.

Nice work.

| Unfortunately, Rcpp is not detecting or warning about this...
| 
| Either way, I believe this is a bug in the openxlsx package, but Rcpp
| should be checking / reporting this.

With (Rcpp)Armadillo you do have the option of turning bounds checking on
or off. With Rcpp alone, not quite.
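
For illustration only: Armadillo's documentation describes a macro, `ARMA_NO_DEBUG`, that disables its run-time bounds checks. The same on/off idea can be sketched in plain C++. The macro `SKETCH_NO_BOUNDS_CHECK` and the helper `element` below are hypothetical names invented for this example, not anything Rcpp provides.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// SKETCH_NO_BOUNDS_CHECK is a hypothetical macro made up for this sketch.
// Leave it undefined to keep the check; define it before this point
// (e.g. with -DSKETCH_NO_BOUNDS_CHECK) to compile the check away.
template <typename T>
const T& element(const std::vector<T>& v, std::size_t i) {
#ifndef SKETCH_NO_BOUNDS_CHECK
    if (i >= v.size()) {
        throw std::out_of_range("element(): index out of range");
    }
#endif
    return v[i];
}
```

The trade-off is the usual one: the checked build costs one comparison per access, while the unchecked build leaves an out-of-range index as undefined behaviour.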

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Qiang Kou
Hi, Kevin, I was also trying to track this down yesterday.

From the debugging info below, *indices_n* is not equal to the length of
*indices*, which I don't quite understand.

Program received signal SIGSEGV, Segmentation fault.
0x72ed5c4e in Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13,
true, Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13,
Rcpp::PreserveStorage> > >::get_vec (this=this@entry=0x7fff79a0)
    at /usr/local/lib/R/site-library/Rcpp/include/Rcpp/vector/Subsetter.h:200
199             output[i] = lhs[ indices[i] ];
(gdb) p i
$1 = 33622
(gdb) p indices[i]
Cannot access memory at address 0x34c6e000
(gdb) p indices_n
$2 = 9594546


-- 
Qiang Kou
q...@umail.iu.edu
School of Informatics and Computing, Indiana University

Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Qiang Kou
Hi, Paul, can you try my fork of Rcpp? You can install it with the line below:

devtools::install_github("thirdwing/Rcpp", ref = "subsetter")

This fixed the segfault on my Ubuntu machine.

The diff can be found at [1].

In *subsetter*, if an IntegerVector is passed in, we try to reuse it.
This led to a segfault in this case, and I don't know why.

Dirk and Kevin, do you have any thoughts on it?

Best wishes,

KK

[1] https://github.com/thirdwing/Rcpp/commit/216c5220bcb84778a656b3496d0f1803b973ef61



-- 
Qiang Kou
q...@umail.iu.edu
School of Informatics and Computing, Indiana University

Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Dirk Eddelbuettel

On 29 January 2016 at 17:55, Qiang Kou wrote:
| Hi, Paul, can you try my fork of Rcpp? You can install it by the line below:
| 
| devtools::install_github("thirdwing/Rcpp", ref = "subsetter")
| 
| This fixed the segfault on my Ubuntu machine.

Yay. Nice work!
 
| The difference can be found from [1].

Nice and concise.
 
| In subsetter, if an IntegerVector passed in, we will try to reuse it. This led
| to a segfault in this case, which I don't know why.
| 
| Dirk and Kevin, do you have any thoughts on it?

Not really, but happy to give this the full reverse-dependency check
treatment so that we can merge it.

Dirk

 
| Best wishes,
| 
| KK
| 
| [1] https://github.com/thirdwing/Rcpp/commit/216c5220bcb84778a656b3496d0f1803b973ef61

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


Re: [Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

2016-01-29 Thread Kevin Ushey
Hmm, I have one thought: we try to re-use the indices from an
IntegerVector, but the type here is actually a sugar type:

Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true,
Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13,
Rcpp::PreserveStorage> > >::get_vec (this=)

I.e., we access the indices with `INTEGER(x)`, but perhaps those indices
have not actually been materialized when we attempt to perform the
subset?

But then, a simple test case of `x[y - 1]`, with `x` and `y` both
being integer vectors, seems to work just fine. So I am a bit confused.
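
To make the "materialized" question concrete, here is a rough sketch in plain C++ of the difference between a lazy expression and its evaluated form. It is not Rcpp's implementation: `MinusScalar`, `materialize` and `subset` are hypothetical stand-ins, and the sketch assumes the indices are zero-based after the subtraction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sugar expression such as `y - 1`: it keeps a
// reference to `y` and computes each element on demand. It owns no storage
// of its own, so there is no buffer of results to point into.
struct MinusScalar {
    const std::vector<int>& lhs;
    int rhs;
    std::size_t size() const { return lhs.size(); }
    int operator[](std::size_t i) const { return lhs[i] - rhs; }
};

// Evaluate the expression into real storage before using it as indices.
inline std::vector<int> materialize(const MinusScalar& expr) {
    std::vector<int> out(expr.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = expr[i];
    }
    return out;
}

// Subset `x` by a lazy index expression. The indices are materialized first,
// so the loop bound and the index storage come from the same object.
inline std::vector<int> subset(const std::vector<int>& x,
                               const MinusScalar& expr) {
    const std::vector<int> indices = materialize(expr);
    std::vector<int> out;
    out.reserve(indices.size());
    for (int idx : indices) {
        // at() throws std::out_of_range on a bad index; a negative idx
        // wraps to a huge size_t and is rejected the same way.
        out.push_back(x.at(static_cast<std::size_t>(idx)));
    }
    return out;
}
```

If the length were taken from one object and the data pointer from another, the two could disagree, which is one way to end up with an `indices_n` that does not match the readable extent of `indices`.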
