On 02/19/2012 02:53 PM, Jordi Gutiérrez Hermoso wrote:
> 2012/2/19 Alois Schlögl<alois.schlo...@ist.ac.at>:
>> The users of the NaN-toolbox can choose the NaN-propagating function
>> sum() or the NaN-skipping function sumskipnan().
>
> Oh, this is nice. I'm sorry, I thought you also shadowed the core sum
> function.
>
> But why do you shadow the other functions?


There are several reasons:

-- In many cases, there is simply no need for a NaN-propagating
function. In statistics, the NaN-skipping behavior is the correct thing
to do. One always computes some expectation value, which is always some
average:
   E < f(x) >  = 1/N SUM_i { f(x_i) }

e.g. var, kurtosis, mean, meansq, iqr, median, std, skewness, moment, 
range, cov, cor, prctile, quantile.

Having two functions for every one of these is confusing and
complicated. It makes code unreadable, because one needs to think about
whether the "standard"/NaN-propagating function has been used on
purpose, or whether the author did not care about NaNs.
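
As a minimal sketch of the difference between the two behaviors for a
simple mean (core functions only; this is not the toolbox's actual
implementation, and the variable names are just for illustration):

```octave
% Illustrative data with one missing value (assumed example, not from the toolbox)
x = [1, NaN, 2, 6];

valid  = ~isnan(x);            % logical mask of the usable samples
n      = sum(valid);           % N counts only the non-NaN samples
m_skip = sum(x(valid)) / n;    % NaN-skipping mean: (1+2+6)/3 = 3
m_prop = sum(x) / numel(x);    % NaN-propagating mean: NaN
```

The NaN-skipping variant averages over the samples that are actually
available; the propagating variant loses the whole result to a single
missing value.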

-- In some other cases, it's the reason above in combination with some
extended functionality (like significance tests),
    e.g. corrcoef, spearman,

or NaN propagation of individual values instead of converting whole
rows/columns into NaNs because of a single NaN,
   e.g. ranks, zscore, center

    octave:1> center([1,NaN,2])
    ans =
       NaN   NaN   NaN
    octave:1> ranks([1,NaN,2])
    ans =
      1   3   2
That is not what you want.
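
What one would rather expect can be sketched with core functions only.
This is a hand-written illustration under the assumption of a vector
without ties, not the code used in the toolbox:

```octave
x     = [1, NaN, 2];
valid = ~isnan(x);

% Centering: subtract the NaN-skipping mean; only the NaN entry stays NaN.
c = x - mean(x(valid));              % [-0.5 NaN 0.5]

% Ranks: rank only the valid entries, keep NaN where the input is NaN.
% (Simplified: ties are not handled here.)
[~, idx] = sort(x(valid));
tmp      = zeros(1, numel(idx));
tmp(idx) = 1:numel(idx);
r        = nan(size(x));
r(valid) = tmp;                      % [1 NaN 2]
```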

-- In some cases, it's just some bug fixes or some extended
functionality of core functions (some might be fixed in the recent
releases), or functions that are missing in the Matlab core (nan and tsa
try to be compatible with both Matlab and Octave),
    e.g. tpdf, tinv, tcdf, normcdf, normpdf, norminv.

Running nantest.m without installing the NaN-toolbox will give you an 
idea about the limitations of some core functions.

-- Users who are inexperienced with the peculiarities of NaNs will
likely use the standard function name (without the NaN-prefix), with
a good chance that it does not do the right thing. When they try to
"fix" the problem, chances are that it gets worse (e.g. x==NaN is not
the same as isnan(x)), which (i) in the worst case gives incorrect
results, or (ii) at the least gives no result even though it would be
possible to get one. The NaN-toolbox provides a simple, straightforward
and proper way of dealing with data containing NaNs. For this reason,
I'd actually recommend the NaN-toolbox to beginners, because it's more
likely that they get the handling of NaNs right.
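
To illustrate the x==NaN pitfall, here is a small sketch using core
functions only (the data and variable names are made up for the
example):

```octave
x = [1, NaN, 2];

eq_test  = (x == NaN);    % all false: NaN never compares equal, not even to NaN
nan_test = isnan(x);      % true only at the NaN position

% A typical "fix" that silently does nothing:
x_bad = x;
x_bad(x_bad == NaN) = 0;  % no element is selected, the NaN is still there

% Selecting the valid samples the proper way:
x_ok = x(~isnan(x));      % [1 2]
```

The failed "fix" raises no error, so the NaN is carried along unnoticed
into whatever is computed next.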


    Alois



_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev
