It's taken a lot of changes to get the NA mask support to its current point, but the code is ready for some testing now. You can read the work-in-progress release notes here:

https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst

To try it out, check out the missingdata branch from my GitHub account, here, and build in the standard way:

https://github.com/m-paradox/numpy

The things most important to test are:

* Confirm that existing code still works correctly. I've tested against SciPy and matplotlib.
* Confirm that the performance of code not using NA masks is the same or better.
* Try to do computations with the NA values, find places they don't work yet, and nominate unimplemented functionality important to you to be next on the development list. The release notes have a preliminary list of implemented/unimplemented functions.
* Report any crashes, build problems, or unexpected behaviors.

In addition to adding the NA mask, I've also added features and made a few performance changes here and there, such as letting reductions like sum take a list of axes instead of only a single axis or all of them. These changes affect various bugs like http://projects.scipy.org/numpy/ticket/1143 and http://projects.scipy.org/numpy/ticket/533.

Thanks!
Mark

Here's a small example run using NAs:

>>> import numpy as np
>>> np.__version__
'2.0.0.dev-8a5e2a1'
>>> a = np.random.rand(3,3,3)
>>> a.flags.maskna = True
>>> a[np.random.rand(3,3,3) < 0.5] = np.NA
>>> a
array([[[NA, NA, 0.11511708],
        [ 0.46661454, 0.47565512, NA],
        [NA, NA, NA]],

       [[NA, 0.57860351, NA],
        [NA, NA, 0.72012669],
        [ 0.36582123, NA, 0.76289794]],

       [[ 0.65322748, 0.92794386, NA],
        [ 0.53745165, 0.97520989, 0.17515083],
        [ 0.71219688, 0.5184328 , 0.75802805]]])
>>> np.mean(a, axis=-1)
array([[NA, NA, NA],
       [NA, NA, NA],
       [NA, 0.56260412, 0.66288591]])
>>> np.std(a, axis=-1)
array([[NA, NA, NA],
       [NA, NA, NA],
       [NA, 0.32710662, 0.10384331]])
>>> np.mean(a, axis=-1, skipna=True)
/home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: RuntimeWarning: invalid value encountered in true_divide
  um.true_divide(ret, rcount, out=ret, casting='unsafe')
array([[ 0.11511708, 0.47113483, nan],
       [ 0.57860351, 0.72012669, 0.56435958],
       [ 0.79058567, 0.56260412, 0.66288591]])
>>> np.std(a, axis=-1, skipna=True)
/home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: RuntimeWarning: invalid value encountered in true_divide
  um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe')
/home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: RuntimeWarning: invalid value encountered in true_divide
  um.true_divide(ret, rcount, out=ret, casting='unsafe')
array([[ 0. , 0.00452029, nan],
       [ 0. , 0. , 0.19853835],
       [ 0.13735819, 0.32710662, 0.10384331]])
>>> np.std(a, axis=(1,2), skipna=True)
array([ 0.16786895, 0.15498008, 0.23811937])
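
For comparison on a NumPy build without the missingdata branch, here is a minimal sketch of the same two behaviors (NA propagation versus skipping) and of a reduction over several axes. It is only an analogue: it uses numpy.ma masked arrays rather than np.NA, flags.maskna, or skipna, which exist only on the branch. It also assumes NumPy 1.7 or later for the tuple form of axis. The fixed input data and the variable names are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# Fixed data instead of random values, so the results are reproducible.
data = np.arange(24, dtype=float).reshape(2, 3, 4)

# Mark a few entries as missing. In numpy.ma, True in the mask means "missing".
# NOTE: this is a stand-in for the NA mask, not the NA mask API itself.
mask = np.zeros(data.shape, dtype=bool)
mask[0, 0, 1] = True      # one missing value in the first row
mask[1, 2, :] = True      # the last row is missing entirely
a = ma.masked_array(data, mask=mask)

# Rough analogue of skipna=True: masked entries are left out of the mean.
# A row with no valid entries comes back masked.
mean_skip = a.mean(axis=-1)

# Rough analogue of the default NA behavior: any missing input makes the
# result missing. Emulated by masking each row that contains a masked entry.
any_missing = ma.getmaskarray(a).any(axis=-1)
mean_propagate = ma.masked_array(data.mean(axis=-1), mask=any_missing)

# Reduction over several axes at once, passed as a tuple of axes.
sum_12 = data.sum(axis=(1, 2))

print(mean_skip)
print(mean_propagate)
print(sum_12)
```

The propagate case masks the whole result for a row as soon as one input is missing, which is the conservative choice; the skip case only gives up when a row has no valid entries left.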
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion