On Fri, Oct 19, 2018 at 7:50 PM Eric Wieser <wieser.eric+nu...@gmail.com> wrote:

> Subclasses such as MaskedArray and, yes, Quantity, are widely used, and if
> they cause problems perhaps that should be seen as a sign that ndarray
> subclassing should be made easier and clearer.
>
> Both maskedarray and quantity seem like something that would make more
> sense at the dtype level if our dtype system was easier to extend. It might
> be good to compile a list of subclassing applications, and split them into
> “this ought to be a dtype” and “this ought to be a different type of
> container”.
>

Wes McKinney has been benchmarking masks vs sentinel values for Arrow:
http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/. The (bit) masks are
faster. I'm not convinced dtypes are the way to go.

Chuck

> On Fri, 19 Oct 2018 at 18:24 Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:
>
>> Hi All,
>>
>> It seems there are two extreme possibilities for general functions:
>> 1. Put `asarray` everywhere. The main benefit that I can see is that even
>> if people put in lists instead of arrays, one is guaranteed to have shape,
>> dtype, etc. But it seems a bit like calling `int` on everything that might
>> get used as an index, instead of letting the actual indexing do the proper
>> thing and call `__index__`.
>> 2. Do not coerce at all, but rather write code assuming something is an
>> array already. This will often, but not always, just work for array mimics,
>> with coercion done only where necessary (e.g., in lower-lying C code such
>> as that of the ufuncs which has a smaller API surface and can be overridden
>> more easily).
>>
>> The current __array_function__ work may well provide us with a way to
>> combine both, if we (over time) move the coercion inside
>> `ndarray.__array_function__` so that the actual implementation *can* assume
>> it deals with pure ndarray - then, when relevant, calling that
>> implementation will be what subclasses/duck arrays can happily do (and it
>> is up to them to ensure this works).
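
To make the two extremes above concrete, here is a minimal sketch (not taken
from NumPy's source; `mean_coerced` and `mean_passthrough` are hypothetical
helper names used only for illustration) of how the choice between `asarray`
and `asanyarray` changes what a function does with a MaskedArray:

```python
import numpy as np


def mean_coerced(x):
    # "asarray everywhere": the input becomes a base ndarray,
    # so any subclass information (here, the mask) is dropped.
    return np.asarray(x).mean()


def mean_passthrough(x):
    # asanyarray converts lists etc. but lets ndarray subclasses
    # through unchanged, so the subclass's own mean() is used.
    return np.asanyarray(x).mean()


masked = np.ma.masked_array([1.0, 2.0, 100.0], mask=[False, False, True])

coerced = mean_coerced(masked)         # averages all three values
passed = mean_passthrough(masked)      # averages only the unmasked values
from_list = mean_passthrough([1, 2, 3])  # plain lists still work

print(type(np.asarray(masked)).__name__, coerced)
print(type(np.asanyarray(masked)).__name__, passed)
```

With `asarray` the mask is silently dropped and the masked 100.0 is averaged
in; with `asanyarray` the masked value is skipped.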
>>
>> Of course, the above does not really answer what to do in the meantime.
>> But perhaps it helps in thinking of what we are actually aiming for.
>>
>> One last thing: could we please stop bashing subclasses? One can subclass
>> essentially everything in python, often to great advantage. Subclasses such
>> as MaskedArray and, yes, Quantity, are widely used, and if they cause
>> problems perhaps that should be seen as a sign that ndarray subclassing
>> should be made easier and clearer.
>>
>> All the best,
>>
>> Marten
>>
>>
>> On Fri, Oct 19, 2018 at 7:02 PM Ralf Gommers <ralf.gomm...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gomm...@gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <
>>>> einstein.edi...@gmail.com> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <sho...@gmail.com>
>>>>> wrote:
>>>>> I don't think it makes much sense to change NumPy's existing usage of
>>>>> asarray() to asanyarray() unless we add subok=True arguments (which
>>>>> default to False). But this ends up cluttering NumPy's public API,
>>>>> which is also undesirable.
>>>>>
>>>>> Agreed so far.
>>>>>
>>>>
>>>> I'm not sure I agree. "subok" is very unpythonic; the average numpy
>>>> library function should work fine for a well-behaved subclass (i.e. most
>>>> things out there except np.matrix).
>>>>
>>>>>
>>>>> The preferred way to override NumPy functions going forward should be
>>>>> __array_function__.
>>>>>
>>>>>
>>>>> I think we should “soft support” i.e. allow but consider unsupported,
>>>>> the case where one of NumPy’s functions is implemented in terms of others
>>>>> and “passing through” an array results in the correct behaviour for that
>>>>> array.
>>>>>
>>>>
>>>> I don't think we have or want such a concept as "soft support". We
>>>> intend to not break anything that now has asanyarray, i.e.
>>>> it's supported and ideally we have regression tests for all such
>>>> functions. For anything we transition over from asarray to asanyarray,
>>>> PRs should come with new tests.
>>>>
>>>>
>>>>>
>>>>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>>>>> m.h.vankerkw...@gmail.com> wrote:
>>>>>
>>>>>> There are exceptions for `matrix` in quite a few places, and there
>>>>>> is now a warning for `matrix` - it might not be bad to use `asanyarray`
>>>>>> and add an exception for `matrix`. Indeed, I quite like the suggestion
>>>>>> by Eric Wieser to just add the exception to `asanyarray` itself - that
>>>>>> way when matrix is truly deprecated, it will be a very easy change.
>>>>>>
>>>> I don't quite understand this. Adding exceptions is not deprecation -
>>>> we then may as well just rip np.matrix out straight away.
>>>>
>>>> What I suggested in the call about this issue is that it's not very
>>>> effective to treat functions like percentile/quantile one by one without an
>>>> overarching strategy. A way forward could be for someone to write an
>>>> overview of which sets of functions now have asanyarray (and actually work
>>>> with subclasses), which ones we can and want to change now, and which ones
>>>> we can and want to change after np.matrix is gone. Also, some guidelines
>>>> for new functions that we add to numpy would be handy. I suspect we've been
>>>> adding new functions that use asarray rather than asanyarray, which is
>>>> probably undesired.
>>>>
>>>
>>> Thanks Nathaniel and Stephan. Your comments on my other two points are
>>> both clear and correct (and have been made a number of times before). I
>>> think the "write an overview so we can stop making ad-hoc decisions and
>>> having these discussions" is the most important point I was trying to make
>>> though.
>>> If we had such a doc and it concluded "hence we don't change
>>> anything, __array_function__ is the only way to go" then we can just close
>>> PRs like https://github.com/numpy/numpy/pull/11162 straight away.
>>>
>>> Cheers,
>>> Ralf
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
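
For reference, the `__array_function__` route that keeps coming up in this
thread looks roughly like the following. This is only a sketch, assuming
NumPy 1.17 or later (where the protocol is enabled by default); `Wrapped`,
`HANDLED` and `implements` are hypothetical names for illustration, not
NumPy API:

```python
import numpy as np

# Registry mapping NumPy functions to our own implementations (hypothetical).
HANDLED = {}


def implements(np_func):
    """Register an implementation of `np_func` for Wrapped objects."""
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator


class Wrapped:
    """A tiny duck array that is NOT an ndarray subclass."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Returning NotImplemented makes NumPy raise TypeError
        # for functions we have not explicitly overridden.
        if func not in HANDLED:
            return NotImplemented
        return HANDLED[func](*args, **kwargs)


@implements(np.sum)
def _sum(arr, *args, **kwargs):
    # Coercion happens here, inside the override, not in the caller.
    return Wrapped(np.sum(arr.data, *args, **kwargs))


w = Wrapped([1, 2, 3])
total = np.sum(w)  # dispatched to _sum, returns a Wrapped
```

Functions without a registered override (for example `np.mean(w)` here) fail
with a TypeError rather than silently coercing.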