[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> > That looks nice, I don't have a clear feeling on the order of items, if
> > we think of it in terms of `(start, stop)` there was also the idea
> > voiced to simply add another name in which case you would allow start
> > and stop to be separate arrays.
>
> Yes, one could add another method. Or perhaps even add a new argument
> to `.reduce` instead (say `slices`). But this seemed the simplest
> route...
>
> > Of course if we go with your `slice(start, stop)` idea that also works,
> > although passing as separate parameters seems nice too.
> >
> > Adding another name (if we can think of one at least) seems pretty good
> > to me, since I suspect we would add docs to suggest not using
> > `reduceat`.
>
> If we'd want to, even with the present PR it would be possible to (very
> slowly) deprecate the use of a list of single integers. But I'm trying
> to go with just making the existing method more useful.
>
> > One small thing about the PR: I would like to distinguish between
> > `default` and `initial`. I.e. the default value is used only for empty
> > reductions, while the initial value should be always used (unless you
> > would pass both, which we don't for normal reductions though).
> > I suppose the machinery isn't quite set up to do both side-by-side.
>
> I just followed what is done for reduce, where a default could also have
> made sense given that `where` can exclude all inputs along a given row.
> I'm not convinced it would be necessary to have both, though it would
> not be hard to add.

Was looking at the PR, which still seems worthwhile, although not urgent
right now.

But, this makes me think (loudly ;)) that the `get_reduction_initial`
should maybe distinguish this more fully... Because there are 3 cases,
even if we only use the first two currently:

1. True identity: default and initial are the same.
2. Default but no initial: Object sum has no initial, but does use `0`
   as default.
3. Initial is not a valid default: This would be useful to simplify
   min/max reductions: `-inf` or `MIN_INT` are valid initial values (for
   max) but are not valid default values.

- Sebastian

> All the best,
>
> Marten
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
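The three cases listed above can be observed with the existing `ufunc.reduce`, without the PR. What follows is only a minimal sketch using public NumPy calls; it does not touch `get_reduction_initial` itself, and the variable names are made up for illustration.

```python
import numpy as np

# Case 1: true identity. For numeric addition, 0 acts both as the
# default (result for empty input) and as the initial value.
empty_sum = np.add.reduce(np.array([], dtype=float))

# Case 2: default but no initial. An empty object sum gives 0, but 0 is
# not used to start a non-empty reduction (0 + "a" would raise TypeError).
empty_obj_sum = np.add.reduce(np.array([], dtype=object))
joined = np.add.reduce(np.array(["a", "b"], dtype=object))

# Case 3: a valid initial that is not a valid default. minimum has no
# identity, so an empty reduction raises unless `initial` is passed.
try:
    np.minimum.reduce(np.array([], dtype=float))
    empty_min_raised = False
except ValueError:
    empty_min_raised = True

# An explicit initial takes part in every reduction, empty or not.
min_with_initial = np.minimum.reduce(np.array([3.0, 1.0, 2.0]), initial=np.inf)
empty_min_with_initial = np.minimum.reduce(np.array([], dtype=float), initial=np.inf)
```

The last line shows why `inf` makes a poor default for min: the result `inf` for an empty input is an artifact of the starting value rather than a meaningful minimum.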
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> > That looks nice, I don't have a clear feeling on the order of items, if
> > we think of it in terms of `(start, stop)` there was also the idea
> > voiced to simply add another name in which case you would allow start
> > and stop to be separate arrays.
>
> Yes, one could add another method. Or perhaps even add a new argument
> to `.reduce` instead (say `slices`). But this seemed the simplest
> route...

Yeah, I don't mind this, and it doesn't stop us from a better idea
either. Adding to `.reduce` could be fine, but overall I actually think a
new name or using `reduceat` is nicer than overloading it more, even
`reduce_slices()`.

> > I suppose the machinery isn't quite set up to do both side-by-side.
>
> I just followed what is done for reduce, where a default could also have
> made sense given that `where` can exclude all inputs along a given row.
> I'm not convinced it would be necessary to have both, though it would
> not be hard to add.

Sorry, I misread the code: You do use initial the same way as in
reductions, I thought it wasn't used when there were multiple elements.
I.e. it is used for non-empty slices also.

There is still a little annoyance when `initial=` isn't passed, since
default/initial can be different (this is the case for object add for
example: the default is `0`, but it is not used as initial for non-empty
reductions).
Anyway, it's a small detail to some degree even if it may be finicky to
get right. At the moment it seems passing `dtype=object` somehow changes
the result also.

- Sebastian

> All the best,
>
> Marten
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
Hi Sebastian,

> That looks nice, I don't have a clear feeling on the order of items, if
> we think of it in terms of `(start, stop)` there was also the idea
> voiced to simply add another name in which case you would allow start
> and stop to be separate arrays.

Yes, one could add another method. Or perhaps even add a new argument
to `.reduce` instead (say `slices`). But this seemed the simplest
route...

> Of course if we go with your `slice(start, stop)` idea that also works,
> although passing as separate parameters seems nice too.
>
> Adding another name (if we can think of one at least) seems pretty good
> to me, since I suspect we would add docs to suggest not using
> `reduceat`.

If we'd want to, even with the present PR it would be possible to (very
slowly) deprecate the use of a list of single integers. But I'm trying
to go with just making the existing method more useful.

> One small thing about the PR: I would like to distinguish between
> `default` and `initial`. I.e. the default value is used only for empty
> reductions, while the initial value should be always used (unless you
> would pass both, which we don't for normal reductions though).
> I suppose the machinery isn't quite set up to do both side-by-side.

I just followed what is done for reduce, where a default could also have
made sense given that `where` can exclude all inputs along a given row.
I'm not convinced it would be necessary to have both, though it would
not be hard to add.

All the best,

Marten
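The remark that `where` can exclude all inputs along a given row can be illustrated with the existing `reduce`. This is a small sketch with made-up data: for a ufunc with an identity the fully masked row falls back to that identity, while for `minimum` an explicit `initial` has to be supplied and then doubles as the result for the empty row.

```python
import numpy as np

a = np.array([[3.0, 1.0, 2.0],
              [6.0, 5.0, 4.0]])

# The second row is masked out entirely, so its reduction is "empty".
mask = np.array([[True, True, False],
                 [False, False, False]])

# add has an identity (0), which is returned for the fully masked row.
row_sum = np.add.reduce(a, axis=1, where=mask)

# minimum has no identity; with `where`, NumPy requires `initial`, and
# that value is what comes back for the fully masked row.
row_min = np.minimum.reduce(a, axis=1, where=mask, initial=np.inf)
```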
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Fri, 2023-12-22 at 18:01 -0500, Marten van Kerkwijk wrote:
> Hi Martin,
>
> I agree it is a long-standing issue, and I was reminded of it by your
> comment. I have a draft PR at
> https://github.com/numpy/numpy/pull/25476
> that does not change the old behaviour, but allows you to pass in a
> start-stop array which behaves more sensibly (exact API TBD).
>
> Please have a look!

That looks nice, I don't have a clear feeling on the order of items, if
we think of it in terms of `(start, stop)` there was also the idea
voiced to simply add another name in which case you would allow start
and stop to be separate arrays.

Of course if we go with your `slice(start, stop)` idea that also works,
although passing as separate parameters seems nice too.

Adding another name (if we can think of one at least) seems pretty good
to me, since I suspect we would add docs to suggest not using
`reduceat`.

One small thing about the PR: I would like to distinguish between
`default` and `initial`. I.e. the default value is used only for empty
reductions, while the initial value should be always used (unless you
would pass both, which we don't for normal reductions though).
I suppose the machinery isn't quite set up to do both side-by-side.

- Sebastian

> Marten
>
> Martin Ling writes:
>
> > Hi folks,
> >
> > I don't follow numpy development in much detail these days but I see
> > that there is a 2.0 release planned soon.
> >
> > Would this be an opportunity to change the behaviour of 'reduceat'?
> >
> > This issue has been open in some form since 2006!
> > https://github.com/numpy/numpy/issues/834
> >
> > The current behaviour was originally inherited from Numeric, and makes
> > reduceat often unusable in practice, even where it should be the
> > perfect, concise, efficient solution. But it has been impossible to
> > change it without breaking compatibility with existing code.
> >
> > As a result, horrible hacks are needed instead, e.g. my answer here:
> > https://stackoverflow.com/questions/57694003
> >
> > Is this something that could finally be fixed in 2.0?
> >
> >
> > Martin
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
Hi Martin,

I agree it is a long-standing issue, and I was reminded of it by your
comment. I have a draft PR at
https://github.com/numpy/numpy/pull/25476
that does not change the old behaviour, but allows you to pass in a
start-stop array which behaves more sensibly (exact API TBD).

Please have a look!

Marten

Martin Ling writes:

> Hi folks,
>
> I don't follow numpy development in much detail these days but I see
> that there is a 2.0 release planned soon.
>
> Would this be an opportunity to change the behaviour of 'reduceat'?
>
> This issue has been open in some form since 2006!
> https://github.com/numpy/numpy/issues/834
>
> The current behaviour was originally inherited from Numeric, and makes
> reduceat often unusable in practice, even where it should be the
> perfect, concise, efficient solution. But it has been impossible to
> change it without breaking compatibility with existing code.
>
> As a result, horrible hacks are needed instead, e.g. my answer here:
> https://stackoverflow.com/questions/57694003
>
> Is this something that could finally be fixed in 2.0?
>
>
> Martin
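For readers who have not run into the behaviour being complained about, here is a small sketch of what the current `reduceat` does with an empty segment. It uses only the existing, documented method, and the data is made up. When `indices[i] >= indices[i+1]`, `reduceat` returns `a[indices[i]]` instead of the result of reducing an empty slice.

```python
import numpy as np

a = np.arange(5)          # [0, 1, 2, 3, 4]
indices = [0, 2, 2, 4]    # the second segment, a[2:2], is empty

# Current behaviour: the empty segment yields a[2] rather than 0.
current = np.add.reduceat(a, indices).tolist()

# What reducing the corresponding slices one by one would give.
bounds = list(zip(indices, indices[1:] + [len(a)]))
per_slice = [int(a[start:stop].sum()) for start, stop in bounds]
```

Here `current` is `[1, 2, 5, 4]` while `per_slice` is `[1, 0, 5, 4]`; the two differ only at the empty segment.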
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Fri, Dec 22, 2023 at 12:34 PM Martin Ling wrote:
> Hi folks,
>
> I don't follow numpy development in much detail these days but I see
> that there is a 2.0 release planned soon.
>
> Would this be an opportunity to change the behaviour of 'reduceat'?
>
> This issue has been open in some form since 2006!
> https://github.com/numpy/numpy/issues/834
>
> The current behaviour was originally inherited from Numeric, and makes
> reduceat often unusable in practice, even where it should be the
> perfect, concise, efficient solution. But it has been impossible to
> change it without breaking compatibility with existing code.
>
> As a result, horrible hacks are needed instead, e.g. my answer here:
> https://stackoverflow.com/questions/57694003
>
> Is this something that could finally be fixed in 2.0?

The reduceat API is certainly problematic, but I don't think fixing it
is really a NumPy 2.0 thing. As discussed in that issue, the right way
to fix that is to add a new API with the correct behavior, and then we
can think about deprecating (and maybe eventually removing) the current
reduceat method.

If the new reducebins() method were available, I would say removing
reduceat() would be appropriate to consider for NumPy 2, but we don't
have the new method with fixed behavior yet, which is the bigger
blocker.

> Martin
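To make the "correct behavior" concrete, below is a slow pure-Python sketch of reducing over separate start and stop arrays. The name `reduce_slices` and its signature are hypothetical, chosen only for this illustration. It is not the API of the draft PR or of the proposed `reducebins()`, and a real implementation would avoid the Python-level loop.

```python
import numpy as np

def reduce_slices(ufunc, a, starts, stops, **kwargs):
    """Hypothetical helper: reduce a[start:stop] for each (start, stop) pair.

    Extra keyword arguments (for example `initial`) are forwarded to
    `ufunc.reduce`, so empty slices follow the usual reduce rules.
    """
    a = np.asarray(a)
    return np.array([ufunc.reduce(a[start:stop], **kwargs)
                     for start, stop in zip(starts, stops)])

a = np.arange(5)
starts = [0, 2, 2]
stops = [2, 2, 4]

# The empty slice a[2:2] gives the identity of add, i.e. 0.
sums = reduce_slices(np.add, a, starts, stops)

# minimum has no identity, so an `initial` is needed for the empty slice.
largest = np.iinfo(a.dtype).max
mins = reduce_slices(np.minimum, a, starts, stops, initial=largest)
```

Because the helper simply defers to `reduce`, the `default` versus `initial` question raised elsewhere in the thread shows up here too: the empty slice in `mins` returns the initial value, which is not a meaningful minimum.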