On Thu, Oct 23, 2025 at 2:18 PM Marten van Kerkwijk via NumPy-Discussion < [email protected]> wrote:
> Hi Ralf,
>
> > So I think the relevant choices are:
> > 1. Change nothing to the current status quo (and possibly direct end
> >    users who need more than what we offer now to `marray`)
> > 2. Add a keyword to reductions
> > 3. Add a single factory function that turns regular reductions into
> >    nan-aware ones (as in
> >    https://github.com/data-apis/array-api/issues/621#issuecomment-1553481118)
> >
> > I think (1) is also a very reasonable outcome if we don't like any of
> > the alternatives.
>
> I am fine with (1), continue to dislike (2), and like (3).
>
> On (1) [status quo], you mentioned that nanptp was rejected earlier as a
> new addition to nanfunctions. If this was because we didn't want to
> expand the main numpy namespace (reasonable!),

Indeed. There perhaps also was a "this is too niche anyway" thought, but IIRC not polluting the main namespace was the primary consideration.

> might a sub-option be to
> allow expansion in nanfunctions for any regular function in the numpy
> namespace, but only expose them in nanfunctions itself? An advantage
> would be that, effectively, those who like to omit NaN could just do
> "import numpy.lib.nanfunctions as np".

I'll note that that is not a public namespace right now. It could be created of course, if there is energy.

> Of course, at that point perhaps
> one should just bite the bullet and move nanfunctions out to its own
> package...

Like https://github.com/pydata/bottleneck? It already has faster nan-functions as well as some extra ones (anynan, allnan, nanrankdata). Of course it's been on life support for a while, but it's in decent shape.

> On (2) [keyword argument], I continue to dislike the idea of adding new
> keyword arguments for the ufunc reductions -- ufuncs are one of the few
> bits of numpy API that are really nicely clean and consistent between
> many functions. We have been very careful about extending it, and
> keeping it light. They already allow `np.sum(data, where=~np.isnan(data))`,
> it is not obvious why we would add another option to do the same thing.
> Obviously, one could argue that np.sum != np.add.reduce, so their
> signatures can diverge, but I'd personally like to move in the opposite
> direction (if only for speed for small arrays).

Fair enough.

> On (3) [factory function], I think a side benefit is that it is the
> lightest possible way to make useful what is required anyway, creating
> wrappers/implementations for functions not yet covered by nanfunctions.

That "lightest possible way" is why I suggested it indeed - but it seems not many people shared my preference for that option.

> My suggestion of a nan-as-omit Array API compatible wrapper class would
> need them, and so would extending nanfunctions to cover more cases.
> Indeed, it would even help the keyword-argument case as it would provide
> working implementations.
>
> Let me also mention again another option, of a wrapper data type which
> translates floats with NaN to floats with NaN replaced by an
> appropriate constant (identity from reductions by default).

I think you can't determine an appropriate value without already doing the nan-omitting calculation? E.g. what replacement value would you use for `np.mean`?

> To opt in, one would do something like,
>
>     function(array.astype(NaNOmittingFloat), ...)
>
> But really one could initialize arrays like that and just keep working
> with them. Of course, this would rely completely on Sebastian's custom
> dtype mechanism, which has already proven its worth in StringDType, but
> which would likely not be recognized by other array classes. For that,
> a custom array class would be best (though given marray that may
> actually not be much work at all -- just need to have the mask always
> inferred instead of kept as a separate array).
>
> All the best,
>
> Marten
>
> p.s. I liked the little summary of what other languages do in
> https://github.com/data-apis/array-api/issues/621#issuecomment-1569485778
> Julia's seemed a nice functional approach -- it seems a very interesting
> language in general, from which it is probably worth getting more ideas...

Agreed, Julia has some nice ideas.

Cheers,
Ralf
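
To make the `where=` and `np.mean` points above concrete, here is a minimal sketch. It is only an illustration under stated assumptions: `nan_omitting` is a hypothetical name, not an existing NumPy function and not the design from the linked array-api comment, and it only works for reductions that already accept a `where=` mask (e.g. `np.sum`, `np.mean`).

```python
import numpy as np


def nan_omitting(reduction):
    """Hypothetical factory (an assumption for illustration, not NumPy API).

    Wraps a reduction that accepts `where=` so that NaN entries are skipped.
    """
    def wrapper(a, *args, where=True, **kwargs):
        a = np.asarray(a, dtype=float)
        # Combine the caller's mask (if any) with "is not NaN".
        mask = ~np.isnan(a) & where
        return reduction(a, *args, where=mask, **kwargs)
    return wrapper


data = np.array([1.0, np.nan, 3.0, 5.0])
valid = ~np.isnan(data)

# What the existing `where=` keyword already gives us.
total = np.sum(data, where=valid)
mean = np.mean(data, where=valid)

# Replacing NaN by the identity of the sum (0.0) leaves np.sum unchanged ...
filled = np.where(valid, data, 0.0)
filled_sum = np.sum(filled)
# ... but biases np.mean, since the divisor still counts the replaced slot.
filled_mean = np.mean(filled)

# The same results via the hypothetical factory.
nan_sum = nan_omitting(np.sum)
nan_mean = nan_omitting(np.mean)
```

With this data the masked sum and mean agree with `np.nansum` and `np.nanmean`, while the zero-filled mean does not, which is the difficulty with choosing a single replacement constant for `np.mean`.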
