Re: [Rd] [EXTERNAL] Re: Re: all.equal failure
On 05/04/2019 11:33 a.m., Martin Maechler wrote:
> Duncan Murdoch on Fri, 5 Apr 2019 11:12:48 -0400 writes:
>
>> On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
>>> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
>>>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
>>>>> Duncan,
>>>>> I should have included it in my original note, but
>>>>>
>>>>>     all.equal(unclass(t0x), unclass(t1x))
>>>>>
>>>>> returns TRUE as well. I had tried that as well. But a further look at
>>>>> all.equal.default shows the following line right near the top:
>>>>>
>>>>>     if (is.language(target) || is.function(target))
>>>>>         return(all.equal.language(target, current, ...))
>>>>>
>>>>> and that path explicitly ignores attributes.
>>>>
>>>> Which R version are you using? I see deparse(target) and deparse(current) in
>>>> all.equal.language(), and those should not be ignoring attributes according to the
>>>> documentation.
>
> But the problem is that indeed "of course" all.equal.formula()
> and not all.equal.language() is called for the terms since as
> you yourself remarked, their class is c("terms", "formula"),
> and so what Terry reported is indeed correct *and* a bug
> and in "all versions" of R (I did not look far back, but these things
> haven't changed much).
>
> The cleanest would probably be to define an all.equal.terms()
> method, as I think there may be more code relying on the
> behavior of all.equal.formula() to only look at the formulas
> themselves and not their attributes...
> but you (Duncan) and others may have a different opinion.

I don't know if that would be easy -- it seems to me there is a bug in
deparse(), which won't show attributes on language objects even if you
ask it to:

# This is fine:
deparse(structure(1, attrib=2))
# [1] "structure(1, attrib = 2)"

# This doesn't show the attributes
deparse(structure(quote(f(1)), attrib=2))
# [1] "f(1)"

But as you mention, if this isn't a new bug fixing it will likely cause
problems for people who assume it is intentional...

Duncan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
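The deparse() behaviour described above can be cross-checked against identical(), which does compare attributes on language objects. What follows is a small sketch and not code from the thread: the names plain, tagged, same_object, stored_attr and same_text are illustrative, and what deparse() prints for the second object may depend on the R version, so no output is claimed for it.

```r
# Two calls with the same code, one carrying an extra attribute.
plain  <- quote(f(1))
tagged <- structure(quote(f(1)), attrib = 2)

# identical() compares attributes, so it tells the two objects apart.
same_object <- identical(plain, tagged)

# The attribute is stored on the call, whatever deparse() chooses to show.
stored_attr <- attr(tagged, "attrib")

# A deparse-based comparison only sees what deparse() prints; the result
# depends on whether attributes of language objects are shown.
same_text <- identical(deparse(plain), deparse(tagged))
```

If same_text comes out TRUE while same_object is FALSE, any comparison built on deparse() alone will miss the attribute difference, which is the situation discussed in this thread.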
Re: [Rd] [EXTERNAL] Re: Re: all.equal failure
Martin Maechler on Fri, 5 Apr 2019 17:33:54 +0200 writes:

> Duncan Murdoch on Fri, 5 Apr 2019 11:12:48 -0400 writes:
>
>> On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
>>> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
>>>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
>>>>> Duncan,
>>>>> I should have included it in my original note, but
>>>>>
>>>>>     all.equal(unclass(t0x), unclass(t1x))
>>>>>
>>>>> returns TRUE as well. I had tried that as well. But a further look at
>>>>> all.equal.default shows the following line right near the top:
>>>>>
>>>>>     if (is.language(target) || is.function(target))
>>>>>         return(all.equal.language(target, current, ...))
>>>>>
>>>>> and that path explicitly ignores attributes.
>>>>
>>>> Which R version are you using? I see deparse(target) and deparse(current) in
>>>> all.equal.language(), and those should not be ignoring attributes according to the
>>>> documentation.
>
> But the problem is that indeed "of course" all.equal.formula()
> and not all.equal.language() is called for the terms since as
> you yourself remarked, their class is c("terms", "formula"),
> and so what Terry reported is indeed correct *and* a bug
> and in "all versions" of R (I did not look far back, but these things
> haven't changed much).
>
> The cleanest would probably be to define an all.equal.terms()
> method, as I think there may be more code relying on the
> behavior of all.equal.formula() to only look at the formulas
> themselves and not their attributes...
> but you (Duncan) and others may have a different opinion.

and I do agree with Duncan even more now that indeed it's very
unsatisfactory that deparse() {and dput(), dump() ..} of a terms object
would only reproduce the formula and nothing else; and yes that's all in
the C code:

  --> src/main/deparse.c
  --> in function deparse2buff()
  --> inside the (350 lines large) 'case LANGSXP'.

Martin
Re: [Rd] [EXTERNAL] Re: Re: all.equal failure
Duncan Murdoch on Fri, 5 Apr 2019 11:12:48 -0400 writes:

> On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
>> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
>>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
>>>> Duncan,
>>>> I should have included it in my original note, but
>>>>
>>>>     all.equal(unclass(t0x), unclass(t1x))
>>>>
>>>> returns TRUE as well. I had tried that as well. But a further look at
>>>> all.equal.default shows the following line right near the top:
>>>>
>>>>     if (is.language(target) || is.function(target))
>>>>         return(all.equal.language(target, current, ...))
>>>>
>>>> and that path explicitly ignores attributes.
>>>
>>> Which R version are you using? I see deparse(target) and deparse(current) in
>>> all.equal.language(), and those should not be ignoring attributes according to the
>>> documentation.

But the problem is that indeed "of course" all.equal.formula()
and not all.equal.language() is called for the terms since as
you yourself remarked, their class is c("terms", "formula"),
and so what Terry reported is indeed correct *and* a bug
and in "all versions" of R (I did not look far back, but these things
haven't changed much).

The cleanest would probably be to define an all.equal.terms()
method, as I think there may be more code relying on the
behavior of all.equal.formula() to only look at the formulas
themselves and not their attributes...
but you (Duncan) and others may have a different opinion.

Martin Maechler
ETH Zurich and R Core Team
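To make the all.equal.terms() suggestion concrete, here is a minimal sketch of what such a method might look like. It is hypothetical and is not the method in base R or a proposed patch: the choice to ignore the ".Environment" attribute, the message wording, and the variable names are all assumptions made for illustration.

```r
# Hypothetical sketch of an all.equal() method for "terms" objects.
# It is NOT the method in base R; it only illustrates the idea above.
all.equal.terms <- function(target, current, ...) {
    if (!inherits(current, "terms"))
        return("current is not a 'terms' object")
    msg <- NULL

    # Formula part: strip attributes, then compare the deparsed calls.
    tf <- target;  attributes(tf) <- NULL
    cf <- current; attributes(cf) <- NULL
    if (!identical(deparse(tf), deparse(cf)))
        msg <- c(msg, "formulas differ in contents")

    # Attribute part: ignore the environment (an assumption of this
    # sketch), sort by name, and compare as ordinary lists.
    ta <- attributes(target);  ta[[".Environment"]] <- NULL
    ca <- attributes(current); ca[[".Environment"]] <- NULL
    ae <- all.equal(ta[order(names(ta))], ca[order(names(ca))], ...)
    if (!isTRUE(ae))
        msg <- c(msg, paste("attributes:", ae))

    if (is.null(msg)) TRUE else msg
}

# Usage: a terms object equals itself; a changed attribute is reported.
tt  <- terms(y ~ x)
tt2 <- tt
attr(tt2, "intercept") <- 0L
res_same <- all.equal.terms(tt, tt)
res_diff <- all.equal.terms(tt, tt2)
```

Keeping this separate from all.equal.formula() leaves the formula method's existing behaviour alone, which is the concern raised above about code that relies on it comparing only the formulas.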
Re: [Rd] [EXTERNAL] Re: Re: all.equal failure
On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
>>> Duncan,
>>> I should have included it in my original note, but
>>>
>>>     all.equal(unclass(t0x), unclass(t1x))
>>>
>>> returns TRUE as well. I had tried that as well. But a further look at
>>> all.equal.default shows the following line right near the top:
>>>
>>>     if (is.language(target) || is.function(target))
>>>         return(all.equal.language(target, current, ...))
>>>
>>> and that path explicitly ignores attributes.
>>
>> Which R version are you using? I see deparse(target) and deparse(current) in
>> all.equal.language(), and those should not be ignoring attributes according to the
>> documentation.
>
> I'm using today's version of R-devel on Ubuntu. (svn up this AM)
> But I agree, both target and current appear.

That's not what I said. I said that the attributes should not be
ignored in that function. I don't see anything in the R-devel version
of it that ignores attributes:

> all.equal.language
function (target, current, ...)
{
    mt <- mode(target)
    mc <- mode(current)
    if (mt == "expression" && mc == "expression")
        return(all.equal.list(target, current, ...))
    ttxt <- paste(deparse(target), collapse = "\n")
    ctxt <- paste(deparse(current), collapse = "\n")
    msg <- c(if (mt != mc)
        paste0("Modes of target, current: ", mt, ", ", mc),
        if (ttxt != ctxt) {
            if (pmatch(ttxt, ctxt, 0L))
                "target is a subset of current"
            else if (pmatch(ctxt, ttxt, 0L))
                "current is a subset of target"
            else "target, current do not match when deparsed"
        })
    if (is.null(msg))
        TRUE
    else msg
}

Duncan Murdoch

>> Duncan Murdoch
>>
>>> I'll change my original original title to "all.equal was not a good
>>> tool for testing certain code issues".
>>> Thanks for the pointer,
>>> Terry
>>>
>>> On 4/5/19 9:00 AM, Duncan Murdoch wrote:
>>>> On 05/04/2019 9:03 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:
>>>>> This arose in testing [.terms and has me confused.
>>>>>
>>>>> data(esoph)   # use a standard data set
>>>>> t0x <- terms(model.frame( ~ tobgp, data=esoph))
>>>>> t1  <- terms(model.frame(ncases ~ agegp + tobgp, data=esoph))
>>>>> t1x <- (delete.response(t1))[-1]
>>>>>
>>>>> > all.equal(t0x, t1x)
>>>>> [1] TRUE
>>>>>
>>>>> # the above is wrong, because they actually are not the same
>>>>> > all.equal(attr(t0x, 'dataClasses'), attr(t1x, 'dataClasses'))
>>>>> [1] "Names: 1 string mismatch"
>>>>> [2] "Lengths (1, 2) differ (string compare on first 1)"
>>>>
>>>> As documented, all.equal() is generic, with methods for different
>>>> classes. The classes of both t0x and t1x are c("terms","formula") with
>>>> no all.equal.terms method, so all.equal.formula is called. That method
>>>> isn't specifically documented, but you can see its definition as
>>>>
>>>> function (target, current, ...)
>>>> {
>>>>     if (length(target) != length(current))
>>>>         return(paste0("target, current differ in having response: ",
>>>>             length(target) == 3L, ", ", length(current) == 3L))
>>>>     if (!identical(deparse(target), deparse(current)))
>>>>         "formulas differ in contents"
>>>>     else TRUE
>>>> }
>>>>
>>>> So the issue is that deparse(t0x) and deparse(t1x) give the same
>>>> strings with no attributes shown, even though "showAttributes" is set
>>>> by default. I haven't traced through the C code to see where things are
>>>> going wrong.
>>>>
>>>> Duncan Murdoch
>>>>
>>>>> > sessionInfo()
>>>>> R Under development (unstable) (2019-04-05 r76323)
>>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>> Running under: Ubuntu 18.04.2 LTS
>>>>>
>>>>> Matrix products: default
>>>>> BLAS:   /usr/local/src/R-devel/lib/libRblas.so
>>>>> LAPACK: /usr/local/src/R-devel/lib/libRlapack.so
>>>>>
>>>>> locale:
>>>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
>>>>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] compiler_3.7.0 tools_3.7.0
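The dispatch explanation quoted above (class c("terms","formula"), no method for "terms", so the formula method is used) can be checked by walking the class vector. This is an illustrative sketch rather than code from the thread: tt, cls, has_method and dispatched are made-up names, and the result for "terms" depends on which methods the running R version or loaded packages define.

```r
# Walk the class vector of a terms object and record, for each class,
# whether an all.equal() method is visible.
tt  <- terms(y ~ x)
cls <- class(tt)

has_method <- vapply(
    cls,
    function(cl) !is.null(getS3method("all.equal", cl, optional = TRUE)),
    logical(1)
)

# The first class with a method is the one all.equal() dispatches to;
# if none has one, the default method is used.
dispatched <- if (any(has_method)) cls[which(has_method)[1]] else "default"
```

Printing has_method and dispatched shows directly which method a call such as all.equal(t0x, t1x) would reach in the session at hand.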
Re: [Rd] [EXTERNAL] Re: Re: all.equal failure
On 4/5/19 9:39 AM, Duncan Murdoch wrote:
> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
>> Duncan,
>> I should have included it in my original note, but
>>
>>     all.equal(unclass(t0x), unclass(t1x))
>>
>> returns TRUE as well. I had tried that as well. But a further look at
>> all.equal.default shows the following line right near the top:
>>
>>     if (is.language(target) || is.function(target))
>>         return(all.equal.language(target, current, ...))
>>
>> and that path explicitly ignores attributes.
>
> Which R version are you using? I see deparse(target) and deparse(current) in
> all.equal.language(), and those should not be ignoring attributes according to the
> documentation.

I'm using today's version of R-devel on Ubuntu. (svn up this AM)
But I agree, both target and current appear.

> Duncan Murdoch
>
>> I'll change my original original title to "all.equal was not a good
>> tool for testing certain code issues".
>> Thanks for the pointer,
>> Terry
>>
>> On 4/5/19 9:00 AM, Duncan Murdoch wrote:
>>> On 05/04/2019 9:03 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:
>>>> This arose in testing [.terms and has me confused.
>>>>
>>>> data(esoph)   # use a standard data set
>>>> t0x <- terms(model.frame( ~ tobgp, data=esoph))
>>>> t1  <- terms(model.frame(ncases ~ agegp + tobgp, data=esoph))
>>>> t1x <- (delete.response(t1))[-1]
>>>>
>>>> > all.equal(t0x, t1x)
>>>> [1] TRUE
>>>>
>>>> # the above is wrong, because they actually are not the same
>>>> > all.equal(attr(t0x, 'dataClasses'), attr(t1x, 'dataClasses'))
>>>> [1] "Names: 1 string mismatch"
>>>> [2] "Lengths (1, 2) differ (string compare on first 1)"
>>>
>>> As documented, all.equal() is generic, with methods for different
>>> classes. The classes of both t0x and t1x are c("terms","formula") with
>>> no all.equal.terms method, so all.equal.formula is called. That method
>>> isn't specifically documented, but you can see its definition as
>>>
>>> function (target, current, ...)
>>> {
>>>     if (length(target) != length(current))
>>>         return(paste0("target, current differ in having response: ",
>>>             length(target) == 3L, ", ", length(current) == 3L))
>>>     if (!identical(deparse(target), deparse(current)))
>>>         "formulas differ in contents"
>>>     else TRUE
>>> }
>>>
>>> So the issue is that deparse(t0x) and deparse(t1x) give the same
>>> strings with no attributes shown, even though "showAttributes" is set
>>> by default. I haven't traced through the C code to see where things are
>>> going wrong.
>>>
>>> Duncan Murdoch
>>>
>>>> > sessionInfo()
>>>> R Under development (unstable) (2019-04-05 r76323)
>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> Running under: Ubuntu 18.04.2 LTS
>>>>
>>>> Matrix products: default
>>>> BLAS:   /usr/local/src/R-devel/lib/libRblas.so
>>>> LAPACK: /usr/local/src/R-devel/lib/libRlapack.so
>>>>
>>>> locale:
>>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
>>>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] compiler_3.7.0 tools_3.7.0