Re: [Rd] [Bioc-devel] For integer vectors, `as(x, "numeric")` has no effect.

2015-12-26 Thread John Chambers
Re: coerce() methods.

Important to realize that as() does not call selectMethod() in the standard 
way, but restricts inheritance to the first argument:
   asMethod <- selectMethod("coerce", sig, optional = TRUE, 
  c(from = TRUE, to = FALSE), fdef = coerceFun, 
A valid comparison would have to take account of this.

Once the method has been _correctly_ selected, it is stored in the internal 
table and therefore  will be returned by .findMethodInTable without repeating a 
search.
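
A minimal sketch of the kind of lookup described above, for anyone who wants to compare results before and after as() has run. It assumes the logical vector in the as() source is passed as selectMethod()'s useInherited argument (the argument name is not visible in the excerpt), and the names sig, lookup, before and after are only illustrative. What is found depends on the R version and on session state, so no particular output is claimed.

```r
library(methods)

sig <- c(from = "integer", to = "numeric")

# Look up the coerce() method with inheritance allowed on 'from' only.
# Assumption: the c(from = TRUE, to = FALSE) vector is the 'useInherited' argument.
lookup <- function() {
  selectMethod("coerce", sig, optional = TRUE,
               useInherited = c(from = TRUE, to = FALSE))
}

before <- lookup()            # NULL if no method is found (optional = TRUE)
invisible(as(1L, "numeric"))  # lets as() do its own selection (and caching)
after <- lookup()

cat("method found before as():", !is.null(before), "\n")
cat("method found after as(): ", !is.null(after), "\n")
```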

John

On Dec 25, 2015, at 11:51 PM, Hervé Pagès wrote:

> Or maybe the "right" concept is that "numeric" is a virtual class
> with 3 subclasses: "complex", "double", and "integer". Anyway it's
> probably too late for implementing the "right" concept so it doesn't
> really matter.
> 
> Thanks Martin for offering to fix the as(1L, "numeric") bug. Very
> much appreciated. I guess that means fixing the class(x) <- "numeric"
> bug on integer vectors:
> 
>  > x <- 1L
>  > class(x) <- "numeric"
>  > class(x)
>  [1] "integer"
> 
> My wish for 2016: that selectMethod() always tells the truth. For
> example selectMethod("coerce", c("integer", "numeric")) doesn't
> in a fresh session, only after you call as(1L, "numeric"). Full
> story here:
> 
>  https://stat.ethz.ch/pipermail/r-devel/2010-April/057098.html
> 
> Thanks,
> H.
> 
> 
> On 12/19/2015 10:09 AM, John Chambers wrote:
>> As I tried to say on Dec. 11, there are two levels of "fix":
>> 
>> 1.  The fix to the complaint in the OP's subject heading is to conform to 
>> the default third argument, strict=TRUE: as(1L, "numeric") == 1.0
>> 
>> This generates some incompatibilities, as for classes that extend "numeric". 
>> But still leaves class(1.0) "numeric" and typeof(1.0) "double".
>> 
>> The workaround for class definitions that really need NOT to coerce integers 
>> to double is to define a class union, say
>>   setClassUnion("Number", c("numeric", "integer"))
>> and use that for the slot.
>> 
>> 2.  The "right" concept is arguably that "numeric" is a virtual class with 
>> two subclasses, "double" and "integer".  Given a time machine back to < 
>> 1998, that would be my choice.  But already in the 1998 S4 book, "numeric" 
>> was equated with "double".
>> 
>> so, there it is, IMO.  This is what you get with a successful open-source 
>> language:  Much hassle to do the "right thing" after the fact and the more 
>> change, the more hassle.
>> 
>> Fix 1. seems to me an actual bug fix, so my inclination would be to go with 
>> that (on r-devel), advertising that it may change the effective definition 
>> of some classes.
>> 
>> But I can sympathize with choosing 1, 2 or neither.
>> 
>> John
>> 
>> PS:  Until Jan. 4, I may be even poorer at replying than usual, while 
>> getting the current book off to the publisher.
>> 
>> On Dec 19, 2015, at 3:32 AM, Martin Maechler wrote:
>>
>>> Martin Maechler
>>>     on Sat, 12 Dec 2015 10:32:51 +0100 writes:
>>>
>>> John Chambers
>>>     on Fri, 11 Dec 2015 10:11:05 -0800 writes:
>>>
>>>> Somehow, the most obvious fixes are always back-incompatible these days.
>>>> The example intrigued me, so I looked into it a bit (should have been
>>>> doing something else, but )
>>>>
>>>> You're right that this is the proverbial thin-edge-of-the-wedge.
>>>>
>>>> The problem is in setDataPart(), which will be called whenever a class
>>>> extends one of the vector types.
>>>>
>>>> It does
>>>>     as(value, dataClass)
>>>> The key point is the third argument to as(), strict=TRUE by default.
>>>> So, yes, the change will cause all integer vectors to become double when
>>>> the class extends "numeric".  Generally, strict=TRUE makes sense here and
>>>> of course changing THAT would open up yet more incompatibilities.
>>>>
>>>> For back compatibility, one would have to have some special code in
>>>> setDataPart() for the case of integer/numeric.
>>>>
>>>> John
>>>>
>>>> (Historically, the original sin was probably not making a distinction
>>>> between "numeric" as a virtual class and "double" as a type/class.)
>>>
>>> Yes, indeed.  In the meantime, I've seen more cases where
>>> "the change will cause all integer vectors to become double when the class
>>> extends 'numeric'" seems detrimental.
>>>
>>> OTOH, I still think we could go in the right direction ---
>>> hopefully along the wishes of Bioconductor S4 development, see
>>> Martin Morgan's e-mail:
>>>
>>> [This is all S4 - only; should not much affect base R / S3]
>>> Currently,   "integer" is a subclass of "numeric"  and so the
>>> "integer become double" part seems unwanted to me.
>>> OTOH,  it would really make sense to more formally
>>> have the basic subclasses of  "numeric" to be "integer" and "double",
>>> and to let as(*, "double") become different from as(*, "numeric")
>>> [Again, this is

Re: [Rd] [Bioc-devel] For integer vectors, `as(x, "numeric")` has no effect.

2015-12-25 Thread Hervé Pagès

Or maybe the "right" concept is that "numeric" is a virtual class
with 3 subclasses: "complex", "double", and "integer". Anyway it's
probably too late for implementing the "right" concept so it doesn't
really matter.

Thanks Martin for offering to fix the as(1L, "numeric") bug. Very
much appreciated. I guess that means fixing the class(x) <- "numeric"
bug on integer vectors:

  > x <- 1L
  > class(x) <- "numeric"
  > class(x)
  [1] "integer"

My wish for 2016: that selectMethod() always tells the truth. For
example selectMethod("coerce", c("integer", "numeric")) doesn't
in a fresh session, only after you call as(1L, "numeric"). Full
story here:

  https://stat.ethz.ch/pipermail/r-devel/2010-April/057098.html

Thanks,
H.


On 12/19/2015 10:09 AM, John Chambers wrote:

As I tried to say on Dec. 11, there are two levels of "fix":

1.  The fix to the complaint in the OP's subject heading is to conform to the default 
third argument, strict=TRUE: as(1L, "numeric") == 1.0

This generates some incompatibilities, as for classes that extend "numeric". But still leaves 
class(1.0) "numeric" and typeof(1.0) "double".

The workaround for class definitions that really need NOT to coerce integers to 
double is to define a class union, say
   setClassUnion("Number", c("numeric", "integer"))
and use that for the slot.
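
A small sketch of that workaround, assuming a made-up class "Reading" with a single slot "value" (both names are hypothetical, only "Number" comes from the text above). Since the slot is declared as the union, an integer value is a valid member as it stands and is stored without conversion.

```r
library(methods)

# The class union suggested above
setClassUnion("Number", c("numeric", "integer"))

# Hypothetical class whose slot should accept integers without coercing them
setClass("Reading", representation(value = "Number"))

r_int <- new("Reading", value = 1L)
r_dbl <- new("Reading", value = 1.5)

typeof(r_int@value)  # "integer": the value is kept as given
typeof(r_dbl@value)  # "double"
```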

2.  The "right" concept is arguably that "numeric" is a virtual class with two subclasses, "double" and 
"integer".  Given a time machine back to < 1998, that would be my choice.  But already in the 1998 S4 book, "numeric" 
was equated with "double".

so, there it is, IMO.  This is what you get with a successful open-source language:  Much 
hassle to do the "right thing" after the fact and the more change, the more 
hassle.

Fix 1. seems to me an actual bug fix, so my inclination would be to go with 
that (on r-devel), advertising that it may change the effective definition of 
some classes.

But I can sympathize with choosing 1, 2 or neither.

John

PS:  Until Jan. 4, I may be even poorer at replying than usual, while getting 
the current book off to the publisher.

On Dec 19, 2015, at 3:32 AM, Martin Maechler wrote:

Martin Maechler
on Sat, 12 Dec 2015 10:32:51 +0100 writes:

John Chambers
on Fri, 11 Dec 2015 10:11:05 -0800 writes:

Somehow, the most obvious fixes are always back-incompatible these days.
The example intrigued me, so I looked into it a bit (should have been doing 
something else, but )

You're right that this is the proverbial thin-edge-of-the-wedge.

The problem is in setDataPart(), which will be called whenever a class extends 
one of the vector types.

It does
    as(value, dataClass)
The key point is the third argument to as(), strict=TRUE by default.  So, yes, the 
change will cause all integer vectors to become double when the class extends 
"numeric".  Generally, strict=TRUE makes sense here and of course changing THAT 
would open up yet more incompatibilities.

For back compatibility, one would have to have some special code in 
setDataPart() for the case of integer/numeric.
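
To see what this means on a given R build, here is a rough sketch using a made-up class "Score" that extends "numeric" (the class name is hypothetical). Whether the data part comes back as integer or double is exactly the behaviour under discussion and depends on the R version, so the code only reports it and claims no particular result.

```r
library(methods)

# Hypothetical class whose data part is declared as "numeric"
setClass("Score", contains = "numeric")

# Supplying an integer vector as the data part goes through setDataPart()
s <- new("Score", 1L)

cat("typeof data part:            ", typeof(s@.Data), "\n")
cat("as(1L, 'numeric'):           ", typeof(as(1L, "numeric")), "\n")
cat("as(1L, 'numeric', strict=F): ", typeof(as(1L, "numeric", strict = FALSE)), "\n")
```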



John

(Historically, the original sin was probably not making a distinction between "numeric" 
as a virtual class and "double" as a type/class.)

Yes, indeed.  In the meantime, I've seen more cases where
"the change will cause all integer vectors to become double when the class
extends 'numeric'" seems detrimental.

OTOH, I still think we could go in the right direction ---
hopefully along the wishes of Bioconductor S4 development, see
Martin Morgan's e-mail:

[This is all S4 - only; should not much affect base R / S3]
Currently,   "integer" is a subclass of "numeric"  and so the
"integer become double" part seems unwanted to me.
OTOH,  it would really make sense to more formally
have the basic subclasses of  "numeric" to be "integer" and "double",
and to let as(*, "double") become different from as(*, "numeric")
[Again, this is just for the S4 classes and as() coercions, *not* e.g.
for as.numeric() / as.double() !]
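
For reference, a short sketch of the current relationships being discussed: the class/type split for double and integer values, "integer" as an S4 subclass of "numeric", and the S3-style coercions, which are not part of the proposed change. It uses only base and methods functions; nothing here depends on the proposal.

```r
library(methods)

class(1.0)    # "numeric"
typeof(1.0)   # "double"
class(1L)     # "integer"
typeof(1L)    # "integer"

# In S4 terms an integer vector already counts as "numeric"
is(1L, "numeric")              # TRUE
extends("integer", "numeric")  # TRUE

# The S3-style coercions always produce a double and agree with each other
typeof(as.numeric(1L))                        # "double"
identical(as.numeric(1L), as.double(1L))      # TRUE
```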



In the DEPRECATED part of the NEWS for R 2.7.0 (April 2008) we
have had

  o   The S4 pseudo-classes "single" and "double" have been removed.
      (The S4 class for a REALSXP is "numeric": for back-compatibility
      as(x, "double") coerces to "numeric".)

I think the removal of "single" was fine, but in hindsight,
maybe the removal of "double" -- which was partly broken then --
possibly could rather have been a fixup of "double" along the
following

Current "thought experiment proposal" :

1) "numeric" := {"integer", "double"}   { class - subclasses }
2) as(1L, "numeric")  continues to return 1L .. since integer is
   one case of "numeric"
3) as(1L, "double")  newly returns 1.0   {and in fact would be
   "equivalent" to   as.double(1L)}



After the above change,  S4  as(*, "double") would correspond to S3 as.double
but  as(*, "numeric")  would continue to differ from