Re: [Numpy-discussion] An NA compromise idea -- many-NA

Skipper Seabold Fri, 01 Jul 2011 13:02:22 -0700

On Fri, Jul 1, 2011 at 3:46 PM, Dag Sverre Seljebotn
<[email protected]> wrote:
> I propose a simple idea *for the long term* for generalizing Mark's
> proposal, that I hope may perhaps put some people behind Mark's concrete
> proposal in the short term.
>
> A key feature missing in Mark's proposal is the ability to distinguish
> between different reasons for NA-ness: IGNORE vs. NA. However, one could
> conceive of wanting to track a whole host of reasons:
>
> homework_grades = np.asarray([2, 3, 1, EATEN_BY_DOG, 5, SICK, 2, TOO_LAZY])
>
> Wouldn't it be a shame to put a lot of work into NA, but then have users
> still keep a separate "shadow-array" for stuff like this?
>
> a) In this case the generality of Mark's proposal seems justified and
> less confusing to teach newcomers (?)
>
> b) Since Mark's proposal seems to generalize well to many NAs (there are 8
> bits in the mask, and millions of available NaN-s in floating point), if
> people agreed to this, one could leave it for later and just go on with
> the proposed idea.
>


I have not been following the discussion in much detail, so forgive me
if this has come up. But I think this approach is also similar to the
thinking behind missing values in SAS and "extended" missing values in
Stata. They are missing but preserve an order. This way you can pull
out values that are missing because they were eaten by a dog and see
if they are systematically different from the ones that are missing
because the student was too lazy. A use case that comes to mind: seeing
whether the various kinds of attrition in surveys or experiments vary
in a non-random way.

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000989180.htm
http://www.stata.com/help.cgi?missing
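
To make the use case concrete, here is a rough sketch in plain NumPy. It
is not Mark's proposed API (which is not settled); it only emulates
"many NAs" with a parallel uint8 array of reason codes, loosely analogous
to the 8-bit mask mentioned above. The code values, the `previous`
covariate, and both helper functions are made up for illustration.

```python
import numpy as np

# Hypothetical reason codes; 0 means "value present". The integer codes
# are ordered, much like Stata's extended missing values.
PRESENT, EATEN_BY_DOG, SICK, TOO_LAZY = 0, 1, 2, 3

# Grades for one assignment. Slots with a nonzero reason code hold a
# placeholder (0.0) that must never be used in computations.
grades = np.array([2.0, 3.0, 1.0, 0.0, 5.0, 0.0, 2.0, 0.0])
reason = np.array([PRESENT, PRESENT, PRESENT, EATEN_BY_DOG,
                   PRESENT, SICK, PRESENT, TOO_LAZY], dtype=np.uint8)

# A covariate observed for everyone, e.g. the previous assignment's grade.
previous = np.array([3.0, 3.0, 2.0, 4.0, 5.0, 4.0, 2.0, 1.0])


def mean_observed(values, reason):
    """Mean over the entries that are actually present."""
    return float(values[reason == PRESENT].mean())


def covariate_mean_by_reason(covariate, reason):
    """Mean of a fully observed covariate, grouped by missingness reason."""
    return {int(code): float(covariate[reason == code].mean())
            for code in np.unique(reason)}


print(mean_observed(grades, reason))
print(covariate_mean_by_reason(previous, reason))
```

If the per-reason means of the covariate differ a lot, that hints that
the missingness is not random, which is the attrition question above.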

Maybe this is neither here nor there; I just don't want to end up with
"the R way is the only way." That's why I prefer Python :)

Skipper
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
