Re: [Rd] [R] NaN, Inf to NA

Duncan Murdoch Fri, 27 May 2011 13:33:32 -0700

Okay, I've now committed these changes to R-devel. We'll see over the next few days if CRAN packages were making use of the old permissiveness, and if necessary modify or revert the change.


Duncan Murdoch


On 27/05/2011 11:52 AM, Marc Schwartz wrote:

On May 27, 2011, at 10:33 AM, Duncan Murdoch wrote:

>  On 27/05/2011 11:11 AM, Martin Maechler wrote:
>>  >>>>>   Duncan Murdoch<murdoch.dun...@gmail.com>
>>  >>>>>       on Fri, 27 May 2011 08:23:14 -0400 writes:
>>
>>      >   On 11-05-27 4:27 AM, Albert-Jan Roskam wrote:
>>      >>   Aha! Thank you very much for that clarification! It would
>>      >>   be much more user friendly if R generated a
>>      >>   NotImplementedError or something similar. The 'garbage
>>      >>   results' are pretty misleading, esp. to a novice.
>>
>>      >   I think that's a good idea.  The default methods are
>>      >   documented to work on atomic vectors; dataframes are not
>>      >   atomic vectors, so it would be reasonable to generate an
>>      >   error.  (See ?is.atomic for a definition of atomic
>>      >   vectors.)
>>
>>      >   I'll see if this causes a lot of trouble...
>>
>>      >   Duncan Murdoch
>>
>>  Duncan,
>>  do you remember the issue of mean(), var(), median(),... etc
>>  that was the topic a few weeks ago ?
>>
>>  I strongly advocated that  mean.data.frame() should become
>>  *deprecated*, and I would propose the same for the functions
>>  mentioned here.
>
>  I think you may have misunderstood my proposal.  Currently is.nan, is.finite
>  and is.infinite have no data.frame methods, so the default method is used.  The
>  problem is that the default method is too permissive:  it operates on the
>  data.frame by treating it as a list; then it returns FALSE for each list element.
>  (If there is only one row, it applies the test to the singleton in the column.)
>  This is pretty strange default behaviour.
>
>  What I'm proposing is that the default method should trigger an error if you
>  try to send it anything that's not atomic.  This gives sensible behaviour in most
>  cases; the only one where it doesn't work is a list of singletons, which used to
>  be handled sensibly, but will now fail.
>
>  (There's still a question about what the answer should be for these
>  functions when applied to character or raw vectors, which are both atomic.  I'm
>  leaning towards returning FALSE for every element, which matches the current
>  behaviour, but perhaps those should also generate an error.)
>
>  I think this partially addresses Bill's objection, but not completely.
>  Someone could still put a class on an atomic vector, and that might not be handled
>  properly by the default method.
>
>>  People should  *apply (or *ply) on data frames, and not expect
>>  that all kind of functions have data.frame methods
>>  which are simply equivalent to basically  sapply(<df>,<function>)
>>
>>  {and yes -- all this belongs to R-devel rather than R-help}
>
>  Where I've moved it now.
>
>  Duncan Murdoch
>>  Martin
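
The suggestion above to *apply over a data frame, rather than rely on a data.frame method, could look roughly like the following. This is only a sketch: the helper name nonfinite_to_na and the example data are made up for illustration and are not part of base R or of the committed change.

```r
# Example data (hypothetical): one numeric column, one character column.
df <- data.frame(x = c(1, NaN, Inf),
                 y = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Test each column separately instead of passing the data frame itself.
# Non-numeric columns are reported as FALSE without calling is.nan on them.
flagged <- sapply(df, function(col) is.numeric(col) && any(is.nan(col)))

# Hypothetical helper: replace NaN/Inf/-Inf with NA, numeric columns only.
nonfinite_to_na <- function(d) {
  d[] <- lapply(d, function(col) {
    if (is.numeric(col)) col[!is.finite(col)] <- NA
    col
  })
  d
}

clean <- nonfinite_to_na(df)
```

The is.numeric() guard matters here: is.finite() returns FALSE for every element of a character vector, so without the guard the character column would be overwritten with NA.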


I snipped some of the older content and added Bill.

It seems to me that unless the 'x' argument is both atomic and numeric, these 
functions really don't have much utility, if you are going to implement 
standard default behavior and more rigorous error checking.

So I would support raising an error unless both conditions are met, rather than
returning an unpredictable result, which an unsuspecting useR might not catch.
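
As a rough illustration of that stricter check, a user-level wrapper might look like this. The name strict_is_nan is hypothetical, and this is not the change committed to R-devel, just a sketch of the "atomic and numeric, else error" rule.

```r
# Hypothetical wrapper, not part of base R: refuse anything that is not
# an atomic numeric vector instead of returning a misleading result.
strict_is_nan <- function(x) {
  if (!is.atomic(x) || !is.numeric(x)) {
    stop("'x' must be an atomic numeric vector")
  }
  is.nan(x)
}

res <- strict_is_nan(c(1, NaN, NA, Inf))

# Data frames and character vectors are rejected with an error.
df_result  <- try(strict_is_nan(data.frame(a = 1)), silent = TRUE)
chr_result <- try(strict_is_nan("a"), silent = TRUE)
```

Note that a factor also fails the is.numeric() test, so it would be rejected as well.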

I agree that the non-default methods should be deprecated.

Regards,

Marc


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
