Hi,

On Thu, May 10, 2012 at 2:43 AM, Nathaniel Smith <n...@pobox.com> wrote:
> Hi Matthew,
>
> On Thu, May 10, 2012 at 12:01 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
>>> The third proposal is certainly the best one from Cython's perspective;
>>> and I imagine for those writing C extensions against the C API too.
>>> Having PyType_Check fail for ndmasked is a very good way of having code
>>> fail that is not written to take masks into account.
>>
>> Mark, Nathaniel - can you comment how your chosen approaches would
>> interact with extension code?
>>
>> I'm guessing the bitpattern dtypes would be expected to cause
>> extension code to choke if the type is not supported?
>
> That's pretty much how I'm imagining it, yes. Right now if you have,
> say, a Cython function like
>
>   cdef f(np.ndarray[double] a):
>       ...
>
> and you do f(np.zeros(10, dtype=int)), then it will error out, because
> that function doesn't know how to handle ints, only doubles. The same
> would apply for, say, a NA-enabled integer. In general there are
> almost arbitrarily many dtypes that could get passed into any function
> (including user-defined ones, etc.), so C code already has to check
> dtypes for correctness.
>
> Second order issues:
> - There is certainly C code out there that just assumes that it will
> only be passed an array with certain dtype (and ndim, memory layout,
> etc...). If you write such C code then it's your job to make sure that
> you only pass it the kinds of arrays that it expects, just like now
> :-).
>
> - We may want to do some sort of special-casing of handling for
> floating point NA dtypes that use an NaN as the "magic" bitpattern,
> since many algorithms *will* work with these unchanged, and it might
> be frustrating to have to wait for every extension module to be
> updated just to allow for this case explicitly before using them. OTOH
> you can easily work around this. Like say my_qr is a legacy C function
> that will in fact propagate NaNs correctly, so float NA dtypes would
> Just Work -- except, it errors out at the start because it doesn't
> recognize the dtype. How annoying. We *could* have some special hack
> you can use to force it to work anyway (by like making the "is this
> the dtype I expect?" routine lie.) But you can also just do:
>
>   def my_qr_wrapper(arr):
>       if arr.dtype is a NA float dtype with NaN magic value:
>           result = my_qr(arr.view(arr.dtype.base_dtype))
>           return result.view(arr.dtype)
>       else:
>           return my_qr(arr)
>
> and hey presto, now it will correctly pass through NAs. So perhaps
> it's not worth bothering with special hacks.
>
> - Of course if your extension function does want to handle NAs
> generically, then there will be a simple C api for checking for them,
> setting them, etc. Numpy needs such an API internally anyway!
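
The view-based workaround quoted above can be tried against current NumPy with a stand-in dtype. This is only a rough sketch: NumPy has no NA dtype, so NA_FLOAT below is a hypothetical substitute (a one-field structured dtype with the same 8 bytes as float64), and legacy_double is a made-up placeholder for a routine like my_qr that rejects any dtype it does not recognize.

```python
import numpy as np

# Hypothetical stand-in for a float NA dtype: same 8 bytes as float64,
# but a dtype that legacy code does not recognize. Not a real NumPy NA dtype.
NA_FLOAT = np.dtype([("value", np.float64)])


def legacy_double(arr):
    """Made-up legacy routine that accepts only float64 arrays."""
    if arr.dtype != np.float64:
        raise TypeError("expected float64, got %s" % arr.dtype)
    return arr * 2.0  # NaN in, NaN out


def legacy_double_wrapper(arr):
    """View the NA-like array as plain float64, call, then view back."""
    if arr.dtype == NA_FLOAT:
        result = legacy_double(arr.view(np.float64))
        return result.view(NA_FLOAT)
    return legacy_double(arr)


# NaN plays the role of the "magic" NA bit pattern here.
data = np.array([1.0, np.nan, 3.0]).view(NA_FLOAT)
out = legacy_double_wrapper(data)
```

Calling legacy_double(data) directly raises TypeError, as does legacy_double(np.zeros(10, dtype=int)), which mirrors the dtype check described for the Cython function. Going through the wrapper returns an array that still has the stand-in dtype, with the NaN carried through unchanged.
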

Thanks for this.

Mark - in view of the discussions about Cython and extension code -
could you say what you see as disadvantages to the ndmasked subclass
proposal?

Cheers,

Matthew
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion