Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
Hi Hervé,

On 10/13/21 12:43 PM, Hervé Pagès wrote:
> On 12/10/2021 15:43, Pariksheet Nanda wrote:
>> The function in question is:
>>
>> replace_unstranded <- function (gr) {
>>     idx <- strand(gr) == "*"
>>     if (length(idx) == 0L)
>            ^
> Not related to the "internal logical NA value has been modified" error
> but shouldn't you be doing '!any(idx)' instead of 'length(idx) == 0L'
> here?

Indeed. Although in a roundabout way the result somehow satisfied the
unit tests, idx is a poor choice of name because it's really a mask, and
your suggestion of OR-ing the mask values with any() is more intuitive.
The name is_unstranded might be less cryptic than mask.

Applying your suggestion of the correct condition uncovered a bug where
return(gr) was returning the unsorted value, which is inconsistent with
the final statement, which returns a sorted value. So I changed it to
return(sort(gr)) for a consistent contract.

Fixed in f6892ea

> Best,
> H.
>
>>         return(gr)
>>     sort(c(
>>         gr[! idx],
>>         `strand<-`(gr[idx], value = "+"),
>>         `strand<-`(gr[idx], value = "-")))
>> }

Pariksheet

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
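The "consistent contract" point can be illustrated without Bioconductor installed. The sketch below is only an analogy: it assumes a plain data.frame with pos and strand columns as a stand-in for a GRanges object, and the names sort_ranges and replace_unstranded_df are hypothetical, not taken from the tsshmm package.

```r
# Stand-in for a GRanges object: a data.frame with `pos` and `strand`
# columns (hypothetical; the real function works on GRanges).
sort_ranges <- function(df) {
    # Order by position, then by strand in the order "+", "-", "*"
    ord <- order(df$pos, match(df$strand, c("+", "-", "*")))
    out <- df[ord, , drop = FALSE]
    rownames(out) <- NULL
    out
}

replace_unstranded_df <- function(df) {
    is_unstranded <- df$strand == "*"
    if (!any(is_unstranded))
        return(sort_ranges(df))   # the early exit returns sorted output too
    # Duplicate each unstranded row, once per strand
    plus <- minus <- df[is_unstranded, , drop = FALSE]
    plus$strand <- "+"
    minus$strand <- "-"
    sort_ranges(rbind(df[!is_unstranded, , drop = FALSE], plus, minus))
}

stranded <- data.frame(pos = c(200, 100), strand = c("+", "-"),
                       stringsAsFactors = FALSE)
mixed <- data.frame(pos = c(200, 100), strand = c("+", "*"),
                    stringsAsFactors = FALSE)
res_stranded <- replace_unstranded_df(stranded)
res_mixed <- replace_unstranded_df(mixed)
```

With both exits going through the same sorting helper, callers get the same ordering whether or not any unstranded rows were present.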
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
Hi Pariksheet,

On 12/10/2021 15:43, Pariksheet Nanda wrote:
> The function in question is:
>
> replace_unstranded <- function (gr) {
>     idx <- strand(gr) == "*"
>     if (length(idx) == 0L)
          ^
Not related to the "internal logical NA value has been modified" error
but shouldn't you be doing '!any(idx)' instead of 'length(idx) == 0L'
here?

Best,
H.

>         return(gr)
>     sort(c(
>         gr[! idx],
>         `strand<-`(gr[idx], value = "+"),
>         `strand<-`(gr[idx], value = "-")))
> }

--
Hervé Pagès
Bioconductor Core Team
hpages.on.git...@gmail.com
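For readers following along, the difference between the two conditions can be seen with plain logical vectors. This is a minimal base-R sketch; the vectors are made-up stand-ins for the result of strand(gr) == "*" and no GRanges object is involved.

```r
# Stand-in masks (hypothetical values, not from the package's tests)
no_unstranded   <- c(FALSE, FALSE)  # e.g. strands "+" and "-"
some_unstranded <- c(TRUE, FALSE)   # e.g. strands "*" and "+"
empty           <- logical(0)       # zero-length input

# length() only detects the zero-length case ...
length_says_skip <- c(length(no_unstranded) == 0L,
                      length(some_unstranded) == 0L,
                      length(empty) == 0L)

# ... while !any() detects "nothing to replace", empty input included
any_says_skip <- c(!any(no_unstranded),
                   !any(some_unstranded),
                   !any(empty))

print(length_says_skip)  # [1] FALSE FALSE  TRUE
print(any_says_skip)     # [1]  TRUE FALSE  TRUE
```

So the mask has one element per range regardless of strand, and its length says nothing about whether any range is unstranded.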
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
The problem with using gdb is you'd find yourself in the garbage
collector, but perhaps quite removed from where the corruption occurred,
e.g., gc() might / will likely be triggered after you've returned to the
top-level evaluation loop, and the part of your code that did the
corruption might be off the stack.

The problem with devtools::check() (and R CMD check) is that running the
unit tests occurs in a separate process, so things like setting a global
option (and even system variable from within R) may not be visible in
the process doing the check.

Conversely, for the same reasons, it seems like the problem can be
tickled by running the tests alone. So

R -f /tests/testthat.R

would seem to be a good enough starting point. Actually, I liked
Henrik's UBSAN suggestion, which requires the least amount of work. I
think I'd then try

R -d valgrind -f /tests/testthat.R

and then further into the weeds... actually from the section of R-exts
you mention

R_C_BOUNDS_CHECK=yes R -f /tests/testthat.R

might also be promising.

Martin

On 10/12/21, 10:30 PM, "Bioc-devel on behalf of Pariksheet Nanda" wrote:

> Hi all,
>
> On 10/12/21 6:43 PM, Pariksheet Nanda wrote:
>> Error in `...`: internal logical NA value has been modified
>
> In the R source code, this error is in src/main/memory.c so I was
> thinking one way of investigating might be to run `R --debugger gdb`,
> then running R to load the symbols and either:
>
> 1) set a breakpoint for when it reaches that particular line in
>    memory.c:R_gc_internal and then walk up the stack,
>
> 2) or set a watch point on memory.c:R_gc_internal:R_LogicalNAValue
>    (somehow; having trouble getting gdb to reach that context).
>
> 3) Then I thought, maybe this is getting far into the weeds and
>    instead I could check the most common C related error by enabling
>    bounds checking of my C arrays per section 4.4 of the R-exts manual:
>
> $ R -q
> > options(CBoundsCheck = TRUE)
> > Sys.setenv(R_C_BOUNDS_CHECK = "yes") # Try both ways *shrug*
> > devtools::test()
> ...
> # All tests still pass.
> > devtools::check()
> ...
> # No change :(
>
> Maybe I'm not using that option correctly? Or the option is ignored in
> devtools::check(). Or indeed, the error is not from overrunning C
> array boundaries.
>
> It turns out that using the precompiled debug symbols[1] isn't all
> that useful here because I don't get line numbers in gdb without the
> source files and many symbols are optimized out, so it looks like I
> would need to compile R from source with -ggdb first instead of using
> the Debian packages. Hopefully this is still the right approach?
>
> Pariksheet
>
> [1] After installing r-base-core-dbg on Debian for the debug symbols.
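The "separate process" point can be checked directly from an R session. The sketch below is a rough illustration, assuming a Unix-like system where the Rscript binary sits in R.home("bin"); the helper name child_bounds_check is hypothetical. It starts a fresh R process with system2() and asks it for its own value of the CBoundsCheck option, which R initialises at startup from the R_C_BOUNDS_CHECK environment variable. Whether devtools::check() passes the variable on to its own subprocesses is not something this sketch establishes.

```r
# Path to the Rscript binary of the running R installation
rscript <- file.path(R.home("bin"), "Rscript")

# Ask a fresh R process for its value of the CBoundsCheck option
child_bounds_check <- function() {
    system2(rscript,
            c("--vanilla", "-e",
              shQuote('cat(isTRUE(getOption("CBoundsCheck")))')),
            stdout = TRUE)
}

# 1. Set only the option in this session; the child starts from scratch
Sys.unsetenv("R_C_BOUNDS_CHECK")
old <- options(CBoundsCheck = TRUE)
option_only <- child_bounds_check()

# 2. Set the environment variable; a plain child process inherits it
Sys.setenv(R_C_BOUNDS_CHECK = "yes")
with_env_var <- child_bounds_check()

# Restore the session to how it was found
Sys.unsetenv("R_C_BOUNDS_CHECK")
options(old)

print(c(option_only = option_only, with_env_var = with_env_var))
```

This is the reason the command-line form R_C_BOUNDS_CHECK=yes R -f ... is attractive: the variable is in place before the R process that runs the tests starts.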
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
Thanks, Martin and Henrik!

My previous confusing reply from a few minutes ago was due to my
university GMail hiding your replies in All Mail.

I'll consider both your suggestions carefully and thank you again for
the quick and thoughtful replies.

Pariksheet

On 10/12/21 8:03 PM, Henrik Bengtsson wrote:
> In addition to checking with Valgrind, the ASan/UBsan and rchk
> platforms on R-Hub (https://builder.r-hub.io/) can probably also be
> useful;
>
> rhub::check(platform = "linux-x86_64-rocker-gcc-san")
> rhub::check(platform = "ubuntu-rchk")
>
> /Henrik
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
Hi all,

On 10/12/21 6:43 PM, Pariksheet Nanda wrote:
> Error in `...`: internal logical NA value has been modified

In the R source code, this error is in src/main/memory.c so I was
thinking one way of investigating might be to run `R --debugger gdb`,
then running R to load the symbols and either:

1) set a breakpoint for when it reaches that particular line in
   memory.c:R_gc_internal and then walk up the stack,

2) or set a watch point on memory.c:R_gc_internal:R_LogicalNAValue
   (somehow; having trouble getting gdb to reach that context).

3) Then I thought, maybe this is getting far into the weeds and instead
   I could check the most common C related error by enabling bounds
   checking of my C arrays per section 4.4 of the R-exts manual:

$ R -q
> options(CBoundsCheck = TRUE)
> Sys.setenv(R_C_BOUNDS_CHECK = "yes") # Try both ways *shrug*
> devtools::test()
...
# All tests still pass.
> devtools::check()
...
# No change :(

Maybe I'm not using that option correctly? Or the option is ignored in
devtools::check(). Or indeed, the error is not from overrunning C array
boundaries.

It turns out that using the precompiled debug symbols[1] isn't all that
useful here because I don't get line numbers in gdb without the source
files and many symbols are optimized out, so it looks like I would need
to compile R from source with -ggdb first instead of using the Debian
packages. Hopefully this is still the right approach?

Pariksheet

[1] After installing r-base-core-dbg on Debian for the debug symbols.
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
In addition to checking with Valgrind, the ASan/UBsan and rchk platforms
on R-Hub (https://builder.r-hub.io/) can probably also be useful;

> rhub::check(platform = "linux-x86_64-rocker-gcc-san")
> rhub::check(platform = "ubuntu-rchk")

/Henrik

On Tue, Oct 12, 2021 at 4:54 PM Martin Morgan wrote:
>
> It is from base R
>
> https://github.com/wch/r-source/blob/a984cc29b9b8d8821f8eb2a1081d9e0d1d4df56e/src/main/memory.c#L3214
>
> and likely indicates memory corruption, not necessarily in the code
> that triggers the error (this is when the garbage collector is
> triggered...). Probably in *your* C code :) since it's the least
> tested. Probably writing out of bounds.
>
> This could be quite tricky to debug. I'd try to get something close to
> a minimal reproducible example.
>
> I'd try to take devtools out of the picture, maybe running the
> tests/testthat.R script from the command line using Rscript, or worst
> case creating a shell package that adds minimal code and can be
> checked with R CMD build --no-build-vignettes / R CMD check.
>
> You could try inserting gc() before / after the unit test; it might
> make it clear that the unit test isn't the problem. You could also try
> gctorture(TRUE); this will make your code run extremely painfully
> slowly, which puts a big premium on having a minimal reproducible
> example; you could put this near the code chunks that are causing
> problems.
>
> You might have success running under valgrind, something like
> R -d valgrind -f minimal_script.R.
>
> Hope those suggestions help!
>
> Martin
Re: [Bioc-devel] Strange "internal logical NA value has been modified" error
It is from base R

https://github.com/wch/r-source/blob/a984cc29b9b8d8821f8eb2a1081d9e0d1d4df56e/src/main/memory.c#L3214

and likely indicates memory corruption, not necessarily in the code that
triggers the error (this is when the garbage collector is triggered...).
Probably in *your* C code :) since it's the least tested. Probably
writing out of bounds.

This could be quite tricky to debug. I'd try to get something close to a
minimal reproducible example.

I'd try to take devtools out of the picture, maybe running the
tests/testthat.R script from the command line using Rscript, or worst
case creating a shell package that adds minimal code and can be checked
with R CMD build --no-build-vignettes / R CMD check.

You could try inserting gc() before / after the unit test; it might make
it clear that the unit test isn't the problem. You could also try
gctorture(TRUE); this will make your code run extremely painfully
slowly, which puts a big premium on having a minimal reproducible
example; you could put this near the code chunks that are causing
problems.

You might have success running under valgrind, something like
R -d valgrind -f minimal_script.R.

Hope those suggestions help!

Martin

On 10/12/21, 6:43 PM, "Bioc-devel on behalf of Pariksheet Nanda" wrote:

> Hi folks,
>
> I've been told to ask some of my more fun questions on this mailing
> list instead of Slack. I'm climbing the ladder of submitting my first
> Bioconductor package (https://gitlab.com/coregenomics/tsshmm) and feel
> like there are gremlins that keep adding rungs to the top of the
> ladder. The latest head scratcher from running devtools::check() is a
> unit test for a trivial 2 line function failing with this gem of an
> error:
>
> > test_check("tsshmm")
> ══ Failed tests
> ── Error (test-tss.R:11:5): replace_unstranded splits unstranded into + and - ──
> Error in `tryCatchOne(expr, names, parentenv, handlers[[1L]])`: internal logical NA value has been modified
> Backtrace:
>  █
>  1. ├─testthat::expect_equal(...) test-tss.R:11:4
>  2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
>  3. │ └─rlang::eval_bare(expr, quo_get_env(quo))
>  4. └─GenomicRanges::GRanges(c("chr:100:+", "chr:100:-"))
>  5. └─methods::as(seqnames, "GRanges")
>  6. └─GenomicRanges:::asMethod(object)
>  7. └─GenomicRanges::GRanges(ans_seqnames, ans_ranges, ans_strand)
>  8. └─GenomicRanges:::new_GRanges(...)
>  9. └─S4Vectors:::normarg_mcols(mcols, Class, ans_len)
> 10. └─S4Vectors::make_zero_col_DFrame(x_len)
> 11. └─S4Vectors::new2("DFrame", nrows = nrow, check = FALSE)
> 12. └─methods::new(...)
> 13. ├─methods::initialize(value, ...)
> 14. └─methods::initialize(value, ...)
> 15. └─methods::validObject(.Object)
> 16. └─base::try(...)
> 17. └─base::tryCatch(...)
> 18. └─base:::tryCatchList(expr, classes, parentenv, handlers)
> 19. └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
> [ FAIL 1 | WARN 0 | SKIP 0 | PASS 109 ]
>
> The full continuous integration log is here:
> https://gitlab.com/coregenomics/tsshmm/-/jobs/1673603868
>
> The function in question is:
>
> replace_unstranded <- function (gr) {
>     idx <- strand(gr) == "*"
>     if (length(idx) == 0L)
>         return(gr)
>     sort(c(
>         gr[! idx],
>         `strand<-`(gr[idx], value = "+"),
>         `strand<-`(gr[idx], value = "-")))
> }
>
> Also online here:
> https://gitlab.com/coregenomics/tsshmm/-/blob/ef5e19a0e2f68fca93665bc417afbcfb6d437189/R/hmm.R#L170-178
>
> ... and the unit test is:
>
> test_that("replace_unstranded splits unstranded into + and -", {
>     expect_equal(replace_unstranded(GRanges("chr:100")),
>                  GRanges(c("chr:100:+", "chr:100:-")))
>     expect_equal(replace_unstranded(GRanges(c("chr:100", "chr:200:+"))),
>                  sort(GRanges(c("chr:100:+", "chr:100:-", "chr:200:+"))))
> })
>
> Also online here:
> https://gitlab.com/coregenomics/tsshmm/-/blob/ef5e19a0e2f68fca93665bc417afbcfb6d437189/tests/testthat/test-tss.R#L11-L12
>
> What's interesting is this is *not* reproducible by running
> devtools::test() but only devtools::check() so as far as I know there
> isn't a way to interactively debug this while devtools::check() is
> going on?
>
> Every few days I've seen that "internal ... value has been modified"
> error, which prevents me from running nearly any R commands.
> Originally I would restart R, but then I found I could clear that
> error by running gc(). No idea what causes it. Maybe
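Martin's gc() and gctorture() suggestions can be tried with a small wrapper. This is a sketch under assumptions: the helper name with_gctorture is hypothetical, and sum(sqrt(1:10)) is only a placeholder for whichever call into the package's compiled code is under suspicion.

```r
# Evaluate `expr` with gctorture() switched on, restoring the previous
# setting afterwards (the helper name is hypothetical).
with_gctorture <- function(expr) {
    old <- gctorture(TRUE)              # returns the previous setting
    on.exit(gctorture(old), add = TRUE)
    expr  # lazy evaluation: `expr` is only evaluated here, under torture
}

gc()  # collect before the suspect code ...
result <- with_gctorture(sum(sqrt(1:10)))  # placeholder for the real call
gc()  # ... and again after it
```

Because gctorture() forces a collection at nearly every allocation, keeping the wrapped expression as small as possible matters, which is the premium on a minimal reproducible example mentioned above.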
[Bioc-devel] Strange "internal logical NA value has been modified" error
Hi folks,

I've been told to ask some of my more fun questions on this mailing list
instead of Slack. I'm climbing the ladder of submitting my first
Bioconductor package (https://gitlab.com/coregenomics/tsshmm) and feel
like there are gremlins that keep adding rungs to the top of the ladder.
The latest head scratcher from running devtools::check() is a unit test
for a trivial 2 line function failing with this gem of an error:

> test_check("tsshmm")
══ Failed tests
── Error (test-tss.R:11:5): replace_unstranded splits unstranded into + and - ──
Error in `tryCatchOne(expr, names, parentenv, handlers[[1L]])`: internal logical NA value has been modified
Backtrace:
 █
 1. ├─testthat::expect_equal(...) test-tss.R:11:4
 2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg = "expected")
 3. │ └─rlang::eval_bare(expr, quo_get_env(quo))
 4. └─GenomicRanges::GRanges(c("chr:100:+", "chr:100:-"))
 5. └─methods::as(seqnames, "GRanges")
 6. └─GenomicRanges:::asMethod(object)
 7. └─GenomicRanges::GRanges(ans_seqnames, ans_ranges, ans_strand)
 8. └─GenomicRanges:::new_GRanges(...)
 9. └─S4Vectors:::normarg_mcols(mcols, Class, ans_len)
10. └─S4Vectors::make_zero_col_DFrame(x_len)
11. └─S4Vectors::new2("DFrame", nrows = nrow, check = FALSE)
12. └─methods::new(...)
13. ├─methods::initialize(value, ...)
14. └─methods::initialize(value, ...)
15. └─methods::validObject(.Object)
16. └─base::try(...)
17. └─base::tryCatch(...)
18. └─base:::tryCatchList(expr, classes, parentenv, handlers)
19. └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 109 ]

The full continuous integration log is here:
https://gitlab.com/coregenomics/tsshmm/-/jobs/1673603868

The function in question is:

replace_unstranded <- function (gr) {
    idx <- strand(gr) == "*"
    if (length(idx) == 0L)
        return(gr)
    sort(c(
        gr[! idx],
        `strand<-`(gr[idx], value = "+"),
        `strand<-`(gr[idx], value = "-")))
}

Also online here:
https://gitlab.com/coregenomics/tsshmm/-/blob/ef5e19a0e2f68fca93665bc417afbcfb6d437189/R/hmm.R#L170-178

... and the unit test is:

test_that("replace_unstranded splits unstranded into + and -", {
    expect_equal(replace_unstranded(GRanges("chr:100")),
                 GRanges(c("chr:100:+", "chr:100:-")))
    expect_equal(replace_unstranded(GRanges(c("chr:100", "chr:200:+"))),
                 sort(GRanges(c("chr:100:+", "chr:100:-", "chr:200:+"))))
})

Also online here:
https://gitlab.com/coregenomics/tsshmm/-/blob/ef5e19a0e2f68fca93665bc417afbcfb6d437189/tests/testthat/test-tss.R#L11-L12

What's interesting is this is *not* reproducible by running
devtools::test() but only devtools::check() so as far as I know there
isn't a way to interactively debug this while devtools::check() is going
on?

Every few days I've seen that "internal ... value has been modified"
error, which prevents me from running nearly any R commands. Originally
I would restart R, but then I found I could clear that error by running
gc(). No idea what causes it. Maybe some S4 magic?

Yes, I have downloaded the mailing lists for bioc-devel, r-devel,
r-help, and r-package-devel and see no mention of "value has been
modified" [1].

Any help appreciated.

Pariksheet

[1] Mailing lists downloader:

#!/bin/bash -x
# Mirror the monthly *.txt.gz archives of each list into its own directory.
for url in https://stat.ethz.ch/pipermail/{bioc-devel,r-{devel,help,package-devel}}/
do
    dir=$(basename "$url")
    wget \
        --timestamping \
        --no-remove-listing \
        --recursive \
        --level 1 \
        --no-directories \
        --no-host-directories \
        --cut-dirs 2 \
        --directory-prefix "$dir" \
        --accept '*.txt.gz' \
        --relative \
        --no-parent \
        "$url"
done
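Once the archives are downloaded, the search for the phrase could be done from R itself. The following is a minimal sketch rather than what was actually run for this post: the function name search_archives, its arguments and the demo file name are all hypothetical, and it builds a tiny archive in a temporary directory so that it runs without downloading anything.

```r
# Return every line in the *.txt.gz files under `dir` that contains
# `phrase` (function and argument names are hypothetical).
search_archives <- function(dir, phrase) {
    files <- list.files(dir, pattern = "\\.txt\\.gz$",
                        full.names = TRUE, recursive = TRUE)
    hits <- lapply(files, function(f) {
        con <- gzfile(f, open = "rt")
        on.exit(close(con))
        lines <- readLines(con, warn = FALSE)
        # useBytes avoids errors on mail with mixed or invalid encodings
        found <- grep(phrase, lines, fixed = TRUE, useBytes = TRUE)
        data.frame(file = rep(basename(f), length(found)),
                   line = found,
                   text = lines[found],
                   stringsAsFactors = FALSE)
    })
    do.call(rbind, hits)
}

# Tiny self-made archive so the sketch runs without any download
demo_dir <- tempfile("archives")
dir.create(demo_dir)
con <- gzfile(file.path(demo_dir, "2021-October.txt.gz"), open = "wt")
writeLines(c("Subject: something else",
             "internal logical NA value has been modified"), con)
close(con)

hits <- search_archives(demo_dir, "value has been modified")
print(hits)
```

Pointing the function at one of the directories created by the downloader script, for example "bioc-devel", would search the real archives.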