On Thu, May 10, 2012 at 10:28 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> Hi,
>
> On Thu, May 10, 2012 at 2:43 AM, Nathaniel Smith <n...@pobox.com> wrote:
> > Hi Matthew,
> >
> > On Thu, May 10, 2012 at 12:01 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
> >>> The third proposal is certainly the best one from Cython's perspective;
> >>> and I imagine for those writing C extensions against the C API too.
> >>> Having PyType_Check fail for ndmasked is a very good way of having code
> >>> fail that is not written to take masks into account.
> >>
> >> Mark, Nathaniel - can you comment how your chosen approaches would
> >> interact with extension code?
> >>
> >> I'm guessing the bitpattern dtypes would be expected to cause
> >> extension code to choke if the type is not supported?
> >
> > That's pretty much how I'm imagining it, yes. Right now if you have,
> > say, a Cython function like
> >
> > cdef f(np.ndarray[double] a):
> >     ...
> >
> > and you do f(np.zeros(10, dtype=int)), then it will error out, because
> > that function doesn't know how to handle ints, only doubles. The same
> > would apply for, say, a NA-enabled integer. In general there are
> > almost arbitrarily many dtypes that could get passed into any function
> > (including user-defined ones, etc.), so C code already has to check
> > dtypes for correctness.
> >
> > Second order issues:
> > - There is certainly C code out there that just assumes that it will
> > only be passed an array with certain dtype (and ndim, memory layout,
> > etc...). If you write such C code then it's your job to make sure that
> > you only pass it the kinds of arrays that it expects, just like now
> > :-).
> >
> > - We may want to do some sort of special-casing of handling for
> > floating point NA dtypes that use an NaN as the "magic" bitpattern,
> > since many algorithms *will* work with these unchanged, and it might
> > be frustrating to have to wait for every extension module to be
> > updated just to allow for this case explicitly before using them. OTOH
> > you can easily work around this. Like say my_qr is a legacy C function
> > that will in fact propagate NaNs correctly, so float NA dtypes would
> > Just Work -- except, it errors out at the start because it doesn't
> > recognize the dtype. How annoying. We *could* have some special hack
> > you can use to force it to work anyway (by like making the "is this
> > the dtype I expect?" routine lie.) But you can also just do:
> >
> > def my_qr_wrapper(arr):
> >     if arr.dtype is a NA float dtype with NaN magic value:
> >         result = my_qr(arr.view(arr.dtype.base_dtype))
> >         return result.view(arr.dtype)
> >     else:
> >         return my_qr(arr)
> >
> > and hey presto, now it will correctly pass through NAs. So perhaps
> > it's not worth bothering with special hacks.
> >
> > - Of course if your extension function does want to handle NAs
> > generically, then there will be a simple C api for checking for them,
> > setting them, etc. Numpy needs such an API internally anyway!
>
> Thanks for this.
>
> Mark - in view of the discussions about Cython and extension code -
> could you say what you see as disadvantages to the ndmasked subclass
> proposal?
>

The biggest difficulty looks to me like how to work with both of them
reasonably from the C API. The idea of ndarray and ndmasked having
different independent TypeObjects, but still working through the same API
calls feels a little disconcerting. Maybe this is a reasonable compromise,
though, it would be nice to see the idea fleshed out a bit more with some
examples of how the code would work from the C level.

Cheers,
Mark

>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
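
For what it's worth, the view-based wrapper quoted above can be tried out
with plain NumPy, at least as a rough sketch. NumPy has no NA dtype, so
everything NA-specific here is an assumption: NA_FLOAT is a hypothetical
stand-in (a one-field structured dtype with the same 8-byte layout as
float64), NaN plays the "magic" bitpattern, and legacy_double stands in
for a legacy extension function like my_qr that rejects any dtype it does
not recognize.

```python
import numpy as np

# Hypothetical stand-in for an "NA float dtype": a one-field structured dtype
# with the same 8-byte layout as float64. This is NOT a real NumPy NA dtype;
# NaN plays the role of the "magic" NA bitpattern.
NA_FLOAT = np.dtype([("value", np.float64)])


def legacy_double(arr):
    """Stand-in for legacy extension code that only accepts float64."""
    if arr.dtype != np.float64:
        raise TypeError("expected float64, got %s" % arr.dtype)
    return arr * 2.0  # ordinary float arithmetic, so NaN in gives NaN out


def legacy_double_wrapper(arr):
    """Pass NA_FLOAT arrays through the legacy function via views."""
    if arr.dtype == NA_FLOAT:
        # Reinterpret the same bytes as float64, call the legacy code,
        # then reinterpret the result as the stand-in NA dtype again.
        result = legacy_double(arr.view(np.float64))
        return result.view(NA_FLOAT)
    return legacy_double(arr)


data = np.array([1.0, np.nan, 3.0]).view(NA_FLOAT)
out = legacy_double_wrapper(data)
```

Calling legacy_double(data) directly raises TypeError because of the dtype
check, which is the "choke if the type is not supported" behaviour; the
wrapper keeps the legacy function unchanged and leaves the NaN slot in
place in the result.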