Re: [Rd] [EXTERNAL] Re: Re: all.equal failure

2019-04-05 Thread Duncan Murdoch

On 05/04/2019 11:33 a.m., Martin Maechler wrote:

Duncan Murdoch
 on Fri, 5 Apr 2019 11:12:48 -0400 writes:


 > On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
 >>
 >>
 >> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
 >>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
  Duncan,
     I should have included it in my original note, but
 
        all.equal(unclass(t0x), unclass(t1x))
 
  returns TRUE as well.  I had tried that as well.   But a further look 
at
  all.equal.default shows the following line right near the top:
   if (is.language(target) || is.function(target))
   return(all.equal.language(target, current, ...))
 
  and that path explicitly ignores attributes.
 >>>
 >>> Which R version are you using?  I see deparse(target) and 
deparse(current) in
 >>> all.equal.language(), and those should not be ignoring attributes 
according to the
 >>> documentation.

But the problem is that indeed  "of course"  all.equal.formula()
and not all.equal.language() is called for the terms since as
you yourself remarked, their class is  c("terms", "formula"),

and so what Terry reported is indeed correct *and* a bug
and in "all versions" of R (I did not look far back, but these things
haven't changed much).

The cleanest would probably be to define an  all.equal.terms()
method, as I think there may be more code relying on the
behavior of  all.equal.formula() to only look at the formulas
themselves and not their attributes...
but you (Duncan) and others may have a different opinion.


I don't know if that would be easy -- it seems to me there is a bug in 
deparse(), which won't show attributes on language objects even if you 
ask it to:


# This is fine:
deparse(structure(1, attrib=2))
# [1] "structure(1, attrib = 2)"

# This doesn't show the attributes
deparse(structure(quote(f(1)), attrib=2))
# [1] "f(1)"

But as you mention, if this isn't a new bug fixing it will likely cause 
problems for people who assume it is intentional...


Duncan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: Re: all.equal failure

2019-04-05 Thread Martin Maechler
> Martin Maechler 
> on Fri, 5 Apr 2019 17:33:54 +0200 writes:

> Duncan Murdoch 
> on Fri, 5 Apr 2019 11:12:48 -0400 writes:

>> On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
>>> 
>>> 
>>> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
 On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
> Duncan,
>    I should have included it in my original note, but
> 
>       all.equal(unclass(t0x), unclass(t1x))
> 
> returns TRUE as well.  I had tried that as well.   But a further look 
at
> all.equal.default shows the following line right near the top:
>  if (is.language(target) || is.function(target))
>  return(all.equal.language(target, current, ...))
> 
> and that path explicitly ignores attributes.
 
 Which R version are you using?  I see deparse(target) and 
deparse(current) in
 all.equal.language(), and those should not be ignoring attributes 
according to the
 documentation.

> But the problem is that indeed  "of course"  all.equal.formula()
> and not all.equal.language() is called for the terms since as
> you yourself remarked, their class is  c("terms", "formula"),

> and so what Terry reported is indeed correct *and* a bug
> and in "all versions" of R (I did not look far back, but these things
> haven't changed much).

> The cleanest would probably be to define an  all.equal.terms()
> method, as I think there may be more code relying on the
> behavior of  all.equal.formula() to only look at the formulas
> themselves and not their attributes...
> but you (Duncan) and others may have a different opinion.

and I do agree with Duncan even more now that indeed it's very
unsatisfactory that deparse() {and dput(), dump() ..} of a terms
object would only reproduce the formula and nothing else;
and yes that's all in the C code:
 --> src/main/deparse.c
--> in function deparse2buff()
   -->  inside the (350 lines large)  'case LANGSXP'.

Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: Re: all.equal failure

2019-04-05 Thread Martin Maechler
> Duncan Murdoch 
> on Fri, 5 Apr 2019 11:12:48 -0400 writes:

> On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:
>> 
>> 
>> On 4/5/19 9:39 AM, Duncan Murdoch wrote:
>>> On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:
 Duncan,
    I should have included it in my original note, but
 
       all.equal(unclass(t0x), unclass(t1x))
 
 returns TRUE as well.  I had tried that as well.   But a further look 
at
 all.equal.default shows the following line right near the top:
  if (is.language(target) || is.function(target))
  return(all.equal.language(target, current, ...))
 
 and that path explicitly ignores attributes.
>>> 
>>> Which R version are you using?  I see deparse(target) and 
deparse(current) in
>>> all.equal.language(), and those should not be ignoring attributes 
according to the
>>> documentation.

But the problem is that indeed  "of course"  all.equal.formula()
and not all.equal.language() is called for the terms since as
you yourself remarked, their class is  c("terms", "formula"),

and so what Terry reported is indeed correct *and* a bug
and in "all versions" of R (I did not look far back, but these things
haven't changed much).

The cleanest would probably be to define an  all.equal.terms()
method, as I think there may be more code relying on the
behavior of  all.equal.formula() to only look at the formulas
themselves and not their attributes...
but you (Duncan) and others may have a different opinion.

Martin Maechler
ETH Zurich and R Core Team

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: Re: all.equal failure

2019-04-05 Thread Duncan Murdoch

On 05/04/2019 10:46 a.m., Therneau, Terry M., Ph.D. wrote:



On 4/5/19 9:39 AM, Duncan Murdoch wrote:

On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:

Duncan,
    I should have included it in my original note, but

       all.equal(unclass(t0x), unclass(t1x))

returns TRUE as well.  I had tried that as well.   But a further look at
all.equal.default shows the following line right near the top:
  if (is.language(target) || is.function(target))
  return(all.equal.language(target, current, ...))

and that path explicitly ignores attributes.


Which R version are you using?  I see deparse(target) and deparse(current) in
all.equal.language(), and those should not be ignoring attributes according to 
the
documentation.


I'm using today's version of R-devel on Ubuntu.  (svn up this AM)
But I agree, both target and current appear.


That's not what I said.  I said that the attributes should not be 
ignored in that function.  I don't see anything in the R-devel version 
of it that ignores attributes:


> all.equal.language
function (target, current, ...)
{
mt <- mode(target)
mc <- mode(current)
if (mt == "expression" && mc == "expression")
return(all.equal.list(target, current, ...))
ttxt <- paste(deparse(target), collapse = "\n")
ctxt <- paste(deparse(current), collapse = "\n")
msg <- c(if (mt != mc) paste0("Modes of target, current: ",
mt, ", ", mc), if (ttxt != ctxt) {
if (pmatch(ttxt, ctxt, 0L)) "target is a subset of current" 
else if (pmatch(ctxt,
ttxt, 0L)) "current is a subset of target" else "target, 
current do not match when deparsed"

})
if (is.null(msg))
TRUE
else msg
}




Duncan Murdoch






Duncan Murdoch



I'll change my original original title to "all.equal was not a good tool for 
testing
certain code issues".

Thanks for the pointer,

Terry



On 4/5/19 9:00 AM, Duncan Murdoch wrote:

On 05/04/2019 9:03 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:

This arose in testing [.terms and has me confused.

data(esoph)   # use a standard data set

t0x <- terms(model.frame( ~ tobgp, data=esoph))
t1 <-  terms(model.frame(ncases ~ agegp + tobgp, data=esoph))
t1x <- (delete.response(t1))[-1]

   > all.equal(t0x, t1x)
[1] TRUE

# the above is wrong, because they actually are not the same

   > all.equal(attr(t0x, 'dataClasses'), attr(t1x, 'dataClasses'))
[1] "Names: 1 string mismatch"
[2] "Lengths (1, 2) differ (string compare on first 1)"


As documented, all.equal() is generic, with methods for different classes.  The
classes of both t0x and t1x are

  c("terms","formula")

with no all.equal.terms method, so all.equal.formula is called. That method 
isn't
specifically documented, but you can see its definition as

function (target, current, ...)
{
     if (length(target) != length(current))
     return(paste0("target, current differ in having response: ",
     length(target) == 3L, ", ", length(current) == 3L))
     if (!identical(deparse(target), deparse(current)))
     "formulas differ in contents"
     else TRUE
}

So the issue is that deparse(t0x) and deparse(t1x) give the same strings with no
attributes shown, even though "showAttributes" is set by default.   I haven't 
traced
through the C code to see where things are going wrong.

Duncan Murdoch



   > sessionInfo()
R Under development (unstable) (2019-04-05 r76323)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/local/src/R-devel/lib/libRblas.so
LAPACK: /usr/local/src/R-devel/lib/libRlapack.so

locale:
    [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
    [3] LC_TIME=en_US.UTF-8    LC_COLLATE=C
    [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
    [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
    [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.7.0 tools_3.7.0


 [[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel











__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: Re: all.equal failure

2019-04-05 Thread Therneau, Terry M., Ph.D. via R-devel




On 4/5/19 9:39 AM, Duncan Murdoch wrote:

On 05/04/2019 10:19 a.m., Therneau, Terry M., Ph.D. wrote:

Duncan,
   I should have included it in my original note, but

      all.equal(unclass(t0x), unclass(t1x))

returns TRUE as well.  I had tried that as well.   But a further look at 
all.equal.default shows the following line right near the top:

 if (is.language(target) || is.function(target))
 return(all.equal.language(target, current, ...))

and that path explicitly ignores attributes.


Which R version are you using?  I see deparse(target) and deparse(current) in 
all.equal.language(), and those should not be ignoring attributes according to the 
documentation.



I'm using today's version of R-devel on Ubuntu.  (svn up this AM)
But I agree, both target and current appear.


Duncan Murdoch



I'll change my original original title to "all.equal was not a good tool for testing 
certain code issues".


Thanks for the pointer,

Terry



On 4/5/19 9:00 AM, Duncan Murdoch wrote:

On 05/04/2019 9:03 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:

This arose in testing [.terms and has me confused.

data(esoph)   # use a standard data set

t0x <- terms(model.frame( ~ tobgp, data=esoph))
t1 <-  terms(model.frame(ncases ~ agegp + tobgp, data=esoph))
t1x <- (delete.response(t1))[-1]

  > all.equal(t0x, t1x)
[1] TRUE

# the above is wrong, because they actually are not the same

  > all.equal(attr(t0x, 'dataClasses'), attr(t1x, 'dataClasses'))
[1] "Names: 1 string mismatch"
[2] "Lengths (1, 2) differ (string compare on first 1)"


As documented, all.equal() is generic, with methods for different classes.  The 
classes of both t0x and t1x are


 c("terms","formula")

with no all.equal.terms method, so all.equal.formula is called. That method isn't 
specifically documented, but you can see its definition as


function (target, current, ...)
{
    if (length(target) != length(current))
    return(paste0("target, current differ in having response: ",
    length(target) == 3L, ", ", length(current) == 3L))
    if (!identical(deparse(target), deparse(current)))
    "formulas differ in contents"
    else TRUE
}

So the issue is that deparse(t0x) and deparse(t1x) give the same strings with no 
attributes shown, even though "showAttributes" is set by default.   I haven't traced 
through the C code to see where things are going wrong.


Duncan Murdoch



  > sessionInfo()
R Under development (unstable) (2019-04-05 r76323)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/local/src/R-devel/lib/libRblas.so
LAPACK: /usr/local/src/R-devel/lib/libRlapack.so

locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8    LC_COLLATE=C
   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
   [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.7.0 tools_3.7.0


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel









__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel