Re: [R] readLines without skipNul=TRUE causes crash

2017-07-18 Thread Martin Maechler
> Anthony Damico on Sun, 16 Jul 2017 06:40:38 -0400 writes:
> hi, the text file that prompts the segfault is 4gb but only 80,937 lines
> > file.info( "S:/temp/crash.txt")
> size isdir mode mtime ctime atime exe

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-17 Thread Anthony Damico
awesome, thank you! looks like folks on bugzilla have also reproduced and submitted a patch, so i am happy. thanks all

On Mon, Jul 17, 2017 at 11:36 AM, William Dunlap wrote:
> The original file had a lot of trailing null bytes so I tried making a
> similar file with:
>
> tf

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-17 Thread William Dunlap via R-help
The original file had a lot of trailing null bytes so I tried making a similar file with:

  tf <- tempfile(); file <- file(tf, "wb")
  for(i in 1:(2^15-1))writeBin(rep(as.raw(32:127), len=2^16), file)
  for(i in 1:(2^15-1))writeBin(rep(as.raw(0L), len=2^16), file)
  close(file)
  log2(file.size(tf))
  #[1]
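For readers who want to see the effect of skipNul without writing a multi-gigabyte file, here is a minimal small-scale sketch of the same idea: some text followed by a run of trailing nul bytes. The sizes and contents are illustrative assumptions, not the ones from the thread, and a file this small is not expected to reproduce the segfault.

```r
# Small-scale analogue of the file described above: ordinary text
# followed by a run of trailing nul bytes. Sizes are illustrative only.
tf <- tempfile()
con <- file(tf, "wb")
writeBin(charToRaw("first line\nsecond line\n"), con)  # 23 bytes of text
writeBin(raw(1024), con)                               # 1024 nul bytes
close(con)

size <- file.size(tf)

# skipNul = TRUE tells readLines to skip nul bytes rather than
# treating them as string terminators
x <- readLines(tf, skipNul = TRUE, warn = FALSE)
print(size)
print(x)
```

Reading the same file with the default skipNul = FALSE normally produces warnings about embedded nuls; on the R versions discussed in this thread, the crash was only reported for very large inputs.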

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-17 Thread Jeff Newmiller
I'll pass. Just because some non-CRAN "archive" package has bugs or your disk storage is flaky does not mean that any of dozens or hundreds of other compression tools (e.g. the built-in Windows "Send to compressed folder" pop-up menu) won't get it right, and we would know if it did fail because

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-17 Thread Anthony Damico
hi, thanks again for taking the time. since corrupted compression prompted the segfault for me in the first place, i've just posted the text file as-is. it's a 2.4GB file so to be avoided on a metered internet connection. i've updated the bugzilla report at

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Jeff Newmiller
I am stuck. The archive package won't compile for me on Ubuntu, and the CRANextra repo seems to be down so I cannot install packages on Windows right now. Perhaps you can zip the corrupt text file and put it online somewhere? Don't use the archive package to pack it since there seem to be

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Anthony Damico
hi, yep, there are two problems -- but i think only the segfault is within the scope of a base R issue? i need to look closer at the corrupted decompression and figure out whether i should talk to the brazilian government agency that creates that .rar file or open an issue with the archive

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Jeff Newmiller
So you are saying there are two problems... one that produces a corrupt file from a valid compressed file, and one that segfaults when presented with that corrupt file? Can you please confirm the file name and run md5sum on it and share the result so we can tell when the file problem has been
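A checksum like the one requested here can also be computed from within R. The following is a rough sketch using tools::md5sum from the base distribution; describe_file is a hypothetical helper name, and the temporary file stands in for the real text file.

```r
# Sketch: report size and MD5 checksum for a file so that two people can
# confirm they hold the same bytes. `describe_file` is a hypothetical helper.
describe_file <- function(path) {
  stopifnot(file.exists(path))
  data.frame(
    file = basename(path),
    size = file.size(path),
    md5  = unname(tools::md5sum(path)),
    stringsAsFactors = FALSE
  )
}

# Demonstration on a small temporary file (a stand-in for the real file)
demo <- tempfile(fileext = ".txt")
writeBin(charToRaw("hello\n"), demo)
info <- describe_file(demo)
print(info)
```

If the size or checksum differs between two machines, the files are not byte-identical, which would point at the extraction step rather than at readLines.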

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Anthony Damico
hi, the text file that prompts the segfault is 4gb but only 80,937 lines

> file.info( "S:/temp/crash.txt")
                        size isdir mode               mtime               ctime               atime exe
S:/temp/crash.txt 4078192743 FALSE  666 2017-07-15 17:24:35 2017-07-15 17:19:47 2017-07-15 17:19:47

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Duncan Murdoch
On 16/07/2017 6:17 AM, Anthony Damico wrote: thank you for taking the time to write this. i set it running last night and it's still going -- if it doesn't finish by tomorrow, i will try to find a site to host the problem file and add that link to the bug report so the archive package can be

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Anthony Damico
sorry, typo, 80937 not 809367

On Sun, Jul 16, 2017 at 6:21 AM, Anthony Damico wrote:
> hi, thank you for attempting this. it looks like your unix machine
> unzipped the txt file without corruption -- if you copied over the same txt
> file to windows 7, i don't think that

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Anthony Damico
hi, thank you for attempting this. it looks like your unix machine unzipped the txt file without corruption -- if you copied over the same txt file to windows 7, i don't think that would reproduce the problem? i think it needs to be the corrupted text file where R.utils::countLines( txtfile )

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-16 Thread Anthony Damico
thank you for taking the time to write this. i set it running last night and it's still going -- if it doesn't finish by tomorrow, i will try to find a site to host the problem file and add that link to the bug report so the archive package can be avoided at least. i'm sorry for the bother On

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread William Dunlap via R-help
I see the problem on Windows 10, R-3.4.0, R.exe. It is not compiled for debugging but gdb gives some information when I attach the debugger after the 'R..has stopped working' popup appears. I don't know how reliable it is:

(gdb) info threads
  Id   Target Id   Frame
* 4    Thread

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Jeff Newmiller
I am not able to reproduce your segfault on a Windows 7 platform either:

##
fn1 <- "d:/DADOS_ENEM_2009.txt"
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
##
## Matrix

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Duncan Murdoch
On 15/07/2017 11:33 AM, Anthony Damico wrote: hi, i realized that the segfault happens on the text file in a new R session. so, creating the segfault-generating text file requires a contributed package, but prompting the actual segfault does not -- pretty sure that means this is a base R bug?

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Jeff Newmiller
I am not able to reproduce this on a Linux platform:

###3
fn1 <- "/home/jdnewmil/Downloads/Microdados ENEM 2009/Dados Enem 2009/DADOS_ENEM_2009.txt"
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.5 LTS

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Duncan Murdoch
On 15/07/2017 11:33 AM, Anthony Damico wrote: hi, i realized that the segfault happens on the text file in a new R session. so, creating the segfault-generating text file requires a contributed package, but prompting the actual segfault does not -- pretty sure that means this is a base R bug?

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Anthony Damico
hi, i realized that the segfault happens on the text file in a new R session. so, creating the segfault-generating text file requires a contributed package, but prompting the actual segfault does not -- pretty sure that means this is a base R bug? submitted here:

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Anthony Damico
hi, thanks Dr. Murdoch i'd appreciate if anyone on r-help could help me narrow this down? i believe the segfault occurs because there's a single line with 4GB and also embedded nuls, but i am not sure how to artificially construct that? the lodown package can be removed from my example.. it
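One way to check the suspicion about embedded nuls and very long lines without calling readLines at all is to scan the file as raw bytes. The sketch below is an assumption about how one might do this, not something posted in the thread; count_bytes and its chunk_size argument are hypothetical names.

```r
# Count nul bytes and newline bytes by reading fixed-size raw chunks,
# so that no single line ever has to be held as one character string.
# `count_bytes` is a hypothetical helper, not part of base R.
count_bytes <- function(path, chunk_size = 2^20) {
  con <- file(path, "rb")
  on.exit(close(con))
  nuls <- 0       # doubles, to avoid integer overflow on multi-GB files
  newlines <- 0
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break          # end of file
    nuls     <- nuls     + sum(chunk == as.raw(0L))
    newlines <- newlines + sum(chunk == as.raw(10L))
  }
  c(nul = nuls, newline = newlines)
}
```

Calling count_bytes on the problem file and comparing the newline count with the file size gives a rough idea of how long the longest line could be, and a large nul count would support the embedded-nul explanation.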

Re: [R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Duncan Murdoch
On 15/07/2017 7:35 AM, Anthony Damico wrote:
> hello, the last line of the code below causes a segfault for me on 3.4.1. i think i should submit to https://bugs.r-project.org/ unless others have advice? thanks

Segfaults are usually worth reporting as bugs. Try to come up with a

[R] readLines without skipNul=TRUE causes crash

2017-07-15 Thread Anthony Damico
hello, the last line of the code below causes a segfault for me on 3.4.1. i think i should submit to https://bugs.r-project.org/ unless others have advice? thanks

install.packages( "devtools" )
devtools::install_github("ajdamico/lodown")
devtools::install_github("jimhester/archive")