-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Søren Hauberg wrote:
> Sat, 07 03 2009 at 09:42 -0500, James K. Lowden wrote:
>> Alois Schlögl wrote:
>>> Skipping NA/NaN is valid for the mean as well as for any other
>>> statistical estimate. 
>> That is not always so.  Suppose you intend to compute the mean of N values
>> but due to an error in your database query, 90% of those values are
>> missing.  Are you prepared to say that the mean of the other 10% is
>> representative?  
> 
> I would say that was the best estimate you could possibly get.


Exactly for this reason, NaN-skipping is the right thing to do.
Actually, NaN/sem.m gives the confidence interval on the mean, if you
really need it.
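
To make the idea concrete, here is a minimal sketch of a NaN-skipping mean
together with the standard error of that mean. It is written in plain Python
for illustration only and is not the code of NaN/sem.m; the function name
nan_mean_sem and its return convention are assumptions of this sketch.

```python
import math


def nan_mean_sem(values):
    """Return (mean, sem, n) computed over the non-NaN entries only.

    Hypothetical helper for illustration; not the NaN toolbox's sem.m.
    """
    kept = [v for v in values if not math.isnan(v)]
    n = len(kept)
    if n == 0:
        # Nothing left to average: no estimate is possible.
        return math.nan, math.nan, 0
    mean = sum(kept) / n
    if n < 2:
        # One sample gives a mean but no spread estimate.
        return mean, math.nan, n
    # Unbiased sample variance (divide by n - 1), then the standard error.
    var = sum((v - mean) ** 2 for v in kept) / (n - 1)
    return mean, math.sqrt(var / n), n


mean, sem, n = nan_mean_sem([1.0, math.nan, 3.0, math.nan, 5.0])
```

Returning n alongside the estimate lets the caller see how many samples
actually contributed, which is the information needed to judge whether the
remaining values are representative.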

I do not understand what advantage there is in distinguishing between NaN
and NA. In a database, some not-a-number values may be due to missing data
and others due to a division 0/0. In order to get the best estimate (or an
estimate at all), you need to ignore both NA's and NaN's.

Moreover, the distinction between NaN and NA complicates things again.
Should one set a sample to NA or to NaN if there is an overflow in my data
acquisition? Just having to think about it is pointless.


> 
>> NaNs convey meaning, as Søren said.  
> 
> Actually, what I said was that there was a difference between something
> being not-a-number, and something being missing. It makes perfect sense
> to skip missing values when computing the mean value (in the statistical
> sense). However, it does not make sense to ignore NaN's when they convey
> the meaning that something went wrong somewhere else in your program.
> Jaroslav explained this well.


In such a case, I strongly recommend an explicit handling of NaN's. The
code would emphasize that the NaN's "convey meaning".
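
As a rough sketch of what such explicit handling could look like, the
caller checks for NaN's before computing the mean and fails loudly instead
of skipping them. This is plain Python for illustration only; the name
strict_mean and the choice of ValueError are assumptions of this sketch,
not part of any existing toolbox.

```python
import math


def strict_mean(values):
    """Mean that treats any NaN as an upstream error (hypothetical helper)."""
    if not values:
        raise ValueError("cannot take the mean of an empty sequence")
    # Collect the positions of all NaN's so the error message points at them.
    bad = [i for i, v in enumerate(values) if math.isnan(v)]
    if bad:
        raise ValueError(f"NaN at positions {bad}; check the upstream computation")
    return sum(values) / len(values)


result = strict_mean([1.0, 2.0, 3.0])
```

With the check written out in the calling code, a reader can see that NaN's
are considered an error at this point rather than missing data.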



Alois

> 
> Søren
> 
> 
> ------------------------------------------------------------------------------
> Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
> -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
> -Strategies to boost innovation and cut costs with open source participation
> -Receive a $600 discount off the registration fee with the source code: SFAD
> http://p.sf.net/sfu/XcvMzF8H
> _______________________________________________
> Octave-dev mailing list
> Octave-dev@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/octave-dev

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkm0xW4ACgkQzSlbmAlvEIiY3gCfT19ceI6VQk+sIv8Yne9yNM8F
GQUAn2nGZmF0oUrDRamboAXJ5mSZ4gb8
=zYhj
-----END PGP SIGNATURE-----

