Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
Hi,

I would like to revitalize the discussion on including PR#7804 (atleast_nd
function) at Stephan Hoyer's request. atleast_nd has come up as a
convenient workaround for #8206 (adding padding options to diff) to be
able to do broadcasting with the required dimensions reversed.

Regards,

-Joe

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I would like to follow up on my original PR (7804). While there
appears to be some debate as to whether the PR is numpy material to
begin with, there do not appear to be any technical issues with it. To
make the decision more straightforward, I factored out the
non-controversial bug fixes to masked arrays into PR #7823, along with
their regression tests. This way, the original enhancement can be
closed or left hanging indefinitely (even though I hope neither
happens). PR 7804 still has the bug fixes duplicated in it.

Regards,

-Joe
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg wrote:
> On Wed, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:
>> I don't see how one could define a spec that would take an arbitrary
>> array of indices at which to place new dimensions. By definition, you
>> don't know how many dimensions are going to be added.
>
> You just give a reordered range, so that (1, 0, 2) would be the current
> 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D,
> add everything of course).

I was originally thinking (-1, 0) for the 2D case. Just go along the
list and fill as many dims as necessary. Your way is much better since
it does not require a different operation for positive and negative
indices.

> However, I have my doubts that it is actually easier to understand than
> to write yourself ;).

A dictionary or ragged list would be better for that: either
{1: (1, 0), 2: (2,)} or [(1, 0), (2,)]. The first is clearer, since in
the list form the index is the starting ndim - 1.
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:
> I don't see how one could define a spec that would take an arbitrary
> array of indices at which to place new dimensions. By definition, you
> don't know how many dimensions are going to be added. If you knew,
> then you wouldn't be calling this function. I can only imagine simple
> rules such as 'left' or 'right' or maybe something akin to what
> at_least3d() implements.

You just give a reordered range, so that (1, 0, 2) would be the current
3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D,
add everything of course).

However, I have my doubts that it is actually easier to understand than
to write yourself ;).

- Sebastian
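One possible reading of the "reordered range" spec can be sketched as follows. This is only an illustration under stated assumptions: the function name `atleast_nd_spec` is hypothetical, and the rule that the trailing entries `spec[k:]` name the output positions of the new size-1 axes for a k-dimensional input is an interpretation of the message above, not something taken from PR #7804.

```python
import numpy as np

def atleast_nd_spec(ary, spec):
    """Hypothetical helper: pad `ary` with size-1 axes up to len(spec) dims.

    `spec` is a permutation of range(n). For an input with k < n
    dimensions, the entries spec[k:] are taken to be the output positions
    that receive new size-1 axes; existing axes keep their relative order.
    """
    ary = np.asanyarray(ary)
    n = len(spec)
    if sorted(spec) != list(range(n)):
        raise ValueError("spec must be a permutation of range(len(spec))")
    if ary.ndim >= n:
        return ary  # already has enough dimensions
    new_axes = set(spec[ary.ndim:])
    old_sizes = iter(ary.shape)
    # Walk the output positions, inserting 1 where a new axis is requested.
    shape = tuple(1 if i in new_axes else next(old_sizes) for i in range(n))
    return ary.reshape(shape)

# Under this reading, (1, 0, 2) gives the same shapes as np.atleast_3d.
for a in (np.array(3.0), np.arange(4), np.ones((2, 3))):
    print(a.shape, "->", atleast_nd_spec(a, (1, 0, 2)).shape)
```

With the same helper, `(2, 1, 0)` would prepend and `(0, 1, 2)` would append, which suggests the spec is expressive enough, even if it is not obviously easier to read than explicit indexing.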
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 1:56 PM, Ralf Gommers wrote:
> On Wed, Jul 6, 2016 at 6:26 PM, Nathaniel Smith wrote:
>> I'm surprised if there are places where you
>> really want 0d arrays converted into 1x1,
>
> Scalar to shape (1,1) is less common, but 1-D to 2-D or scalar to shape (1,)
> is very common.

That's ravel, though, not atleast_*, right?

> Example is at the top of scipy/stats/stats.py: the
> _chk_asarray functions (used in many other functions)

I feel like this actually argues for my point :-). scipy.stats needs
some uniform prepping of input, so there's a helper function to do
that, and the helper function's semantics are not at all the semantics
of atleast_*. And they don't even use atleast_* in any necessary way --
the only thing they do is

    if arr.ndim == 0:
        arr = np.atleast_1d(arr)

but this could be written just as well as

    if arr.ndim == 0:
        arr = arr[np.newaxis]

(In any case, atleast_1d definitely makes more sense to me than any of
the others, since it so obviously corresponds to exactly that 2-line
incantation as the only reasonable implementation.)

> take care to never
> return scalar arrays because those are plain annoying to deal with. If that
> sounds weird to you, you're probably one of those people who was never
> surprised by this:
>
> In [3]: x0 = np.array(1)
>
> In [4]: x1 = np.array([1])
>
> In [5]: x0[0]
> ---
> IndexError Traceback (most recent call last)
> in ()
> ----> 1 x0[0]
>
> IndexError: too many indices for array
>
> In [6]: x1[0]
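The equivalence claimed above can be checked directly. The following is a small sketch, not code from scipy; the helper name `ensure_1d` is hypothetical and simply wraps the two-line incantation from the message.

```python
import numpy as np

def ensure_1d(arr):
    """Hypothetical helper: promote a 0-d array to shape (1,), leave others alone."""
    arr = np.asanyarray(arr)
    if arr.ndim == 0:
        arr = arr[np.newaxis]
    return arr

x0 = np.array(1)       # 0-d array, shape ()
x2 = np.ones((2, 3))   # 2-d array

# On 0-d input, atleast_1d, the newaxis spelling and ravel all give shape (1,).
print(np.atleast_1d(x0).shape, ensure_1d(x0).shape, np.ravel(x0).shape)

# On higher-dimensional input they differ: ravel flattens, the others do not.
print(np.atleast_1d(x2).shape, ensure_1d(x2).shape, np.ravel(x2).shape)

# Integer-indexing the 0-d array fails, which is the surprise quoted above.
try:
    x0[0]
except IndexError as exc:
    print("IndexError:", exc)
```

So `ravel` matches `atleast_1d` only for 0-d and 1-d inputs; whether flattening higher-dimensional input is acceptable depends on the caller.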
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
Joseph Fox-Rabinovitz writes:
> I have tentatively changed it to "pos". The reason that I don't like
> "side" is that it implies only a subset of the possible ways that
> the position of the new dimensions can be specified. The current
> implementation only puts things on one side or the other, but I have
> considered also allowing an array of indices at which to place new
> dimensions, and/or a dictionary keyed by the starting ndims. I do not
> think "side" would be appropriate for these extended cases, even if
> they are very unlikely to ever materialize.

What about `order='C'` or `order='F'` for the argument name?
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 4:56 PM, Ralf Gommers wrote:
> On Wed, Jul 6, 2016 at 6:26 PM, Nathaniel Smith wrote:
>> I'm surprised if there are places where you
>> really want 0d arrays converted into 1x1,
>
> Scalar to shape (1,1) is less common, but 1-D to 2-D or scalar to shape (1,)
> is very common. Example is at the top of scipy/stats/stats.py: the
> _chk_asarray functions (used in many other functions) take care to never
> return scalar arrays because those are plain annoying to deal with. If that
> sounds weird to you, you're probably one of those people who was never
> surprised by this:
>
> In [3]: x0 = np.array(1)
>
> In [4]: x1 = np.array([1])
>
> In [5]: x0[0]
> ---
> IndexError Traceback (most recent call last)
> in ()
> ----> 1 x0[0]
>
> IndexError: too many indices for array
>
> In [6]: x1[0]
> Out[6]: 1
>
>> or want to allow high dimensional arrays to pass through - and if you do
>> want to allow high dimensional arrays to pass through, then transposing
>> might help with 2d cases but will silently mangle high-d cases, right?
>
> Handling of >2d input is usually irrelevant. The vast majority of cases are
> "function that accepts scalar and 1-D array" or "function that accepts 1-D
> and 2-D arrays". Often such a function would want to convert inputs internally.
>
> Ralf
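The two positions in this exchange can be put side by side. This is a sketch only: `as_column_matrix` is a hypothetical name for the strict semantics described earlier in the thread ("coerce 1d inputs into a column matrix; 0d or 3d inputs are an error"), and it is not a function from NumPy, SciPy or statsmodels.

```python
import numpy as np

def as_column_matrix(x):
    """Hypothetical strict helper: 1-D -> (N, 1), 2-D unchanged, anything else is an error."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]
    if x.ndim == 2:
        return x
    raise ValueError("expected 1-D or 2-D input, got %d-D" % x.ndim)

v = np.arange(3)
cube = np.zeros((2, 3, 4))

# The atleast_2d-plus-transpose idiom turns a 1-D input into a column...
print(np.atleast_2d(v).T.shape)       # (3, 1)
# ...accepts a 0-d input as a 1x1 matrix...
print(np.atleast_2d(np.array(5.0)).T.shape)  # (1, 1)
# ...and silently reverses the axes of a 3-D input.
print(np.atleast_2d(cube).T.shape)    # (4, 3, 2)

# The strict helper gives the same column for 1-D input but rejects the rest.
print(as_column_matrix(v).shape)      # (3, 1)
```

Which behaviour is preferable depends on whether the calling function ever legitimately receives 0-d or higher-dimensional input.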
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 6:26 PM, Nathaniel Smith wrote:
> On Jul 5, 2016 11:21 PM, "Ralf Gommers" wrote:
>> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote:
>>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz"
>>> <jfoxrabinov...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
>>>> function np.atleast_nd in PR#7804
>>>> (https://github.com/numpy/numpy/pull/7804).
>>>>
>>>> As a result of this PR, I have a couple of questions about
>>>> `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
>>>> the dimensions: If the input is 1D, it prepends and appends a size-1
>>>> dimension. If the input is 2D, it appends a size-1 dimension. This is
>>>> inconsistent with `np.atleast_2d`, which always prepends (as does
>>>> `np.atleast_nd`).
>>>>
>>>> - Is there any reason for this behavior?
>>>> - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
>>>>   terms of `np.atleast_nd`, which is actually much simpler)? This would
>>>>   be a slight API change since the output would not be exactly the same.
>>>
>>> Changing atleast_3d seems likely to break a bunch of stuff...
>>>
>>> Beyond that, I find it hard to have an opinion about the best design
>>> for these functions, because I don't think I've ever encountered a
>>> situation where they were actually what I wanted. I'm not a big fan of
>>> coercing dimensions in the first place, for the usual "refuse to guess"
>>> reasons. And then generally if I do want to coerce an array to another
>>> dimension, then I have some opinion about where the new dimensions should
>>> go, and/or I have some opinion about the minimum acceptable starting
>>> dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d
>>> inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d
>>> is zero-for-three on that requirements list.)
>>>
>>> I don't know how typical I am in this. But it does make me wonder if
>>> the atleast_* functions act as an attractive nuisance, where new users
>>> take their presence as an implicit recommendation that they are actually
>>> a useful thing to reach for, even though they... aren't that. And maybe
>>> we should be recommending folk move away from them rather than trying to
>>> extend them further?
>>>
>>> Or maybe they're totally useful and I'm just missing it. What's your
>>> use case that motivates atleast_nd?
>>
>> I think you're just missing it:) atleast_1d/2d are used quite a bit in
>> Scipy and Statsmodels (those are the only ones I checked), and in the
>> large majority of cases it's the best thing to use there. There's a bunch
>> of atleast_2d calls with a transpose appended because the input needs to
>> be treated as columns instead of rows, but that's still efficient and
>> readable enough.
>
> I know people *use* it :-). What I'm confused about is in what situations
> you would invent it if it didn't exist. Can you point me to an example or
> two where it's "the best thing"? I actually had statsmodels in mind with my
> example of wanting the semantics "coerce 1d inputs into a column matrix; 0d
> or 3d inputs are an error". I'm surprised if there are places where you
> really want 0d arrays converted into 1x1,

Scalar to shape (1,1) is less common, but 1-D to 2-D or scalar to shape
(1,) is very common. Example is at the top of scipy/stats/stats.py: the
_chk_asarray functions (used in many other functions) take care to never
return scalar arrays because those are plain annoying to deal with. If
that sounds weird to you, you're probably one of those people who was
never surprised by this:

    In [3]: x0 = np.array(1)

    In [4]: x1 = np.array([1])

    In [5]: x0[0]
    ---
    IndexError Traceback (most recent call last)
    in ()
    ----> 1 x0[0]

    IndexError: too many indices for array

    In [6]: x1[0]
    Out[6]: 1

> or want to allow high dimensional arrays to pass through - and if you do
> want to allow high dimensional arrays to pass through, then transposing
> might help with 2d cases but will silently mangle high-d cases, right?

Handling of >2d input is usually irrelevant. The vast majority of cases
are "function that accepts scalar and 1-D array" or "function that
accepts 1-D and 2-D arrays".

Ralf
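The inconsistency described in the original post quoted above is easy to see by tabulating output shapes. A minimal sketch using only existing NumPy functions; the variable names are illustrative.

```python
import numpy as np

inputs = {
    "0-d": np.array(1.0),
    "1-d": np.arange(3),
    "2-d": np.ones((2, 3)),
}

# Collect the output shape of each atleast_* function for each input.
shapes = {
    name: (np.atleast_1d(a).shape, np.atleast_2d(a).shape, np.atleast_3d(a).shape)
    for name, a in inputs.items()
}

for name, (s1, s2, s3) in shapes.items():
    print(name, s1, s2, s3)
# 0-d (1,) (1, 1) (1, 1, 1)
# 1-d (3,) (1, 3) (1, 3, 1)
# 2-d (2, 3) (2, 3) (2, 3, 1)
```

`atleast_2d` puts the new axis in front of a 1-D input, while `atleast_3d` puts one on each side of a 1-D input and a trailing one after a 2-D input.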
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I don't see how one could define a spec that would take an arbitrary
array of indices at which to place new dimensions. By definition, you
don't know how many dimensions are going to be added. If you knew, then
you wouldn't be calling this function. I can only imagine simple rules
such as 'left' or 'right' or maybe something akin to what at_least3d()
implements.
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing wrote:
> On 2016/07/06 8:25 AM, Benjamin Root wrote:
>> I wouldn't have the keyword be "where", as that collides with the notion
>> of "where" elsewhere in numpy.
>
> Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like
"side" is that it implies only a subset of the possible ways that
the position of the new dimensions can be specified. The current
implementation only puts things on one side or the other, but I have
considered also allowing an array of indices at which to place new
dimensions, and/or a dictionary keyed by the starting ndims. I do not
think "side" would be appropriate for these extended cases, even if
they are very unlikely to ever materialize.

-Joe
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 3:01 PM, Juan Nunez-Iglesias wrote:
> atleast_nd would be useful for nd image processing in a very analogous way
> to how atleast_2d is used by scikit-image, assuming it prepends. The
> atleast_3d choice is baffling, seems analogous to the 0.5-based indexing
> presented at PyCon, and should be "fun" to deprecate. =P

atleast_nd prepends by default, has an option to append instead, and certainly does not 0.5-pend under any circumstances. `np.swapaxes` and `np.rollaxis` are there for a reason. If atleast_3d is deprecated because of its funky behavior, atleast_nd may be a useful replacement.

-Joe
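For anyone following along, the axis placement under discussion can be checked directly. This is a small sketch using only the existing `np.atleast_2d` and `np.atleast_3d`; the array sizes are arbitrary choices for illustration.

```python
import numpy as np

x1 = np.arange(4)                 # 1-D input, shape (4,)
x2 = np.arange(12).reshape(3, 4)  # 2-D input, shape (3, 4)

# atleast_2d prepends a size-1 axis to 1-D input.
shape_2d_from_1d = np.atleast_2d(x1).shape

# atleast_3d puts 1-D input in the middle and appends to 2-D input.
shape_3d_from_1d = np.atleast_3d(x1).shape
shape_3d_from_2d = np.atleast_3d(x2).shape

print(shape_2d_from_1d)  # (1, 4)
print(shape_3d_from_1d)  # (1, 4, 1)
print(shape_3d_from_2d)  # (3, 4, 1)
```

A purely prepending rule would give (1, 1, 4) and (1, 3, 4) for the last two cases, which is the inconsistency raised in this thread.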
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
atleast_nd would be useful for nd image processing in a very analogous way to how atleast_2d is used by scikit-image, assuming it prepends. The atleast_3d choice is baffling, seems analogous to the 0.5-based indexing presented at PyCon, and should be "fun" to deprecate. =P

On 6 July 2016 at 2:57:57 PM, Eric Firing (efir...@hawaii.edu) wrote:
> On 2016/07/06 8:25 AM, Benjamin Root wrote:
>> I wouldn't have the keyword be "where", as that collides with the notion
>> of "where" elsewhere in numpy.
>
> Agreed. Maybe "side"?
>
> (I find atleast_1d and atleast_2d to be very helpful for handling inputs, as
> Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)
>
> Eric
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On 2016/07/06 8:25 AM, Benjamin Root wrote:
> I wouldn't have the keyword be "where", as that collides with the notion
> of "where" elsewhere in numpy.

Agreed. Maybe "side"?

(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
Agreed. I was originally going with "side", but I want something that can be changed to accept arbitrary specs without changing the word. Perhaps "pos"? I am open to suggestions.

-Joe

On Wed, Jul 6, 2016 at 2:25 PM, Benjamin Root wrote:
> I wouldn't have the keyword be "where", as that collides with the notion of
> "where" elsewhere in numpy.
>
> On Wed, Jul 6, 2016 at 2:21 PM, Joseph Fox-Rabinovitz wrote:
>> I still think this function is useful. I have made a change so that it
>> only accepts one array, as Marten suggested, making the API much
>> cleaner than that of its siblings. The side on which the new
>> dimensions will be added is configurable via the `where` parameter,
>> which currently accepts 'before' and 'after', but can be changed to
>> accept sequences or even dicts. The change also resulted in finding a
>> bug in the masked array versions of the atleast functions, which the
>> PR now fixes and adds regression tests for. If the devs do decide to
>> discard this PR, I will of course submit the bug fix separately.
>>
>> -Joe
>>
>> On Wed, Jul 6, 2016 at 1:43 PM, Stephan Hoyer wrote:
>> > On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith wrote:
>> >> I don't know how typical I am in this. But it does make me wonder if the
>> >> atleast_* functions act as an attractive nuisance, where new users take
>> >> their presence as an implicit recommendation that they are actually a useful
>> >> thing to reach for, even though they... aren't that. And maybe we should be
>> >> recommending folk move away from them rather than trying to extend them
>> >> further?
>> >
>> > Agreed. I would avoid adding atleast_nd. We could discourage using
>> > atleast_3d (certainly the behavior is indeed surprising), but I'm not sure
>> > it's worth the trouble.
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

On Wed, Jul 6, 2016 at 2:21 PM, Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com> wrote:
> I still think this function is useful. I have made a change so that it
> only accepts one array, as Marten suggested, making the API much
> cleaner than that of its siblings. The side on which the new
> dimensions will be added is configurable via the `where` parameter,
> which currently accepts 'before' and 'after', but can be changed to
> accept sequences or even dicts. The change also resulted in finding a
> bug in the masked array versions of the atleast functions, which the
> PR now fixes and adds regression tests for. If the devs do decide to
> discard this PR, I will of course submit the bug fix separately.
>
> -Joe
>
> On Wed, Jul 6, 2016 at 1:43 PM, Stephan Hoyer wrote:
> > On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith wrote:
> >> I don't know how typical I am in this. But it does make me wonder if the
> >> atleast_* functions act as an attractive nuisance, where new users take
> >> their presence as an implicit recommendation that they are actually a useful
> >> thing to reach for, even though they... aren't that. And maybe we should be
> >> recommending folk move away from them rather than trying to extend them
> >> further?
> >
> > Agreed. I would avoid adding atleast_nd. We could discourage using
> > atleast_3d (certainly the behavior is indeed surprising), but I'm not sure
> > it's worth the trouble.
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I still think this function is useful. I have made a change so that it only accepts one array, as Marten suggested, making the API much cleaner than that of its siblings. The side on which the new dimensions will be added is configurable via the `where` parameter, which currently accepts 'before' and 'after', but can be changed to accept sequences or even dicts. The change also resulted in finding a bug in the masked array versions of the atleast functions, which the PR now fixes and adds regression tests for. If the devs do decide to discard this PR, I will of course submit the bug fix separately.

-Joe

On Wed, Jul 6, 2016 at 1:43 PM, Stephan Hoyer wrote:
> On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith wrote:
>> I don't know how typical I am in this. But it does make me wonder if the
>> atleast_* functions act as an attractive nuisance, where new users take
>> their presence as an implicit recommendation that they are actually a useful
>> thing to reach for, even though they... aren't that. And maybe we should be
>> recommending folk move away from them rather than trying to extend them
>> further?
>
> Agreed. I would avoid adding atleast_nd. We could discourage using
> atleast_3d (certainly the behavior is indeed surprising), but I'm not sure
> it's worth the trouble.
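To make the single-array, before/after behaviour described in this message concrete, here is a minimal sketch. It is not the code from PR #7804: the name `atleast_nd_sketch` is hypothetical, the keyword is spelled `pos` only because that was the tentative name in the thread, and masked arrays and other subclasses are ignored because `np.asarray` is used.

```python
import numpy as np

def atleast_nd_sketch(a, ndim, pos="before"):
    """Pad the shape of `a` with size-1 axes up to at least `ndim` dimensions.

    Hypothetical helper for illustration only; not the implementation
    proposed in the pull request.
    """
    a = np.asarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a  # already has enough dimensions
    ones = (1,) * missing
    if pos == "before":
        return a.reshape(ones + a.shape)  # prepend, like broadcasting
    if pos == "after":
        return a.reshape(a.shape + ones)  # append
    raise ValueError("pos must be 'before' or 'after'")

x = np.ones((2, 3))
print(atleast_nd_sketch(x, 4).shape)               # (1, 1, 2, 3)
print(atleast_nd_sketch(x, 4, pos="after").shape)  # (2, 3, 1, 1)
```

Using a string with two allowed values keeps the call readable while leaving room for the sequence or dict specs mentioned above.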
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith wrote:
> I don't know how typical I am in this. But it does make me wonder if the
> atleast_* functions act as an attractive nuisance, where new users take
> their presence as an implicit recommendation that they are actually a
> useful thing to reach for, even though they... aren't that. And maybe we
> should be recommending folk move away from them rather than trying to
> extend them further?

Agreed. I would avoid adding atleast_nd. We could discourage using atleast_3d (certainly the behavior is indeed surprising), but I'm not sure it's worth the trouble.
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I was using "reduce" in an abstract sense. I put a 4D array in and get a 1-3D array out, depending on some other parameters (not strictly just by reduction, although that is the net effect). The placement of the dimensions is irrelevant; I just need to make the output 4D again for further calculations. Since I can have cases where the output has a different number of dims, I wrote this function as a handy tool to avoid conditionals. I realize that this is not a common use case, but it seemed like a thing someone else might find useful one day.

-Joe

On Wed, Jul 6, 2016 at 12:35 PM, Nathaniel Smith wrote:
> On Jul 6, 2016 6:12 AM, "Joseph Fox-Rabinovitz" wrote:
>> I can add a keyword-only argument that lets you put the new dims
>> before or after the existing ones. I am not sure how to specify
>> arbitrary patterns for the new dimensions, but that should take care
>> of most use cases.
>>
>> The use case that motivated this function in the first place is that I
>> am doing some processing on 4D arrays and I need to reduce them but
>> return a result with the original dimensionality (but not shape).
>> atleast_nd seemed like a better solution than atleast_4d.
>
> This is a tangent that might not apply given the details of your code, but
> isn't this what keepdims is for? (And keepdims has the huge advantage that
> it knows which axes are being reduced and thus where to put the new axes.)
>
> I guess even if I couldn't use keepdims for some reason, my inclination
> would be to try to emulate it by fixing up the axes as I went, because I'd
> find it easier to verify that I hadn't accidentally misaligned things if the
> reductions and fix-ups were local to each other, and explicit axis
> insertions are much easier than trying to remember whether atleast_nd
> prepends or appends. This of course is all based on some vague guess at what
> your code actually looks like though...
>
> -n
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Jul 6, 2016 6:12 AM, "Joseph Fox-Rabinovitz" wrote:
> I can add a keyword-only argument that lets you put the new dims
> before or after the existing ones. I am not sure how to specify
> arbitrary patterns for the new dimensions, but that should take care
> of most use cases.
>
> The use case that motivated this function in the first place is that I
> am doing some processing on 4D arrays and I need to reduce them but
> return a result with the original dimensionality (but not shape).
> atleast_nd seemed like a better solution than atleast_4d.

This is a tangent that might not apply given the details of your code, but isn't this what keepdims is for? (And keepdims has the huge advantage that it knows which axes are being reduced and thus where to put the new axes.)

I guess even if I couldn't use keepdims for some reason, my inclination would be to try to emulate it by fixing up the axes as I went, because I'd find it easier to verify that I hadn't accidentally misaligned things if the reductions and fix-ups were local to each other, and explicit axis insertions are much easier than trying to remember whether atleast_nd prepends or appends. This of course is all based on some vague guess at what your code actually looks like though...

-n
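A short sketch of the two options mentioned here, `keepdims` and explicit axis insertion next to the reduction. The 4D shape and the choice of reduced axes are made up for illustration; the actual processing in question is not shown in the thread.

```python
import numpy as np

a = np.ones((2, 3, 4, 5))

# Plain reduction over axes 1 and 3 drops those axes.
reduced = a.mean(axis=(1, 3))               # shape (2, 4)

# keepdims leaves size-1 axes exactly where the reduced axes were.
kept = a.mean(axis=(1, 3), keepdims=True)   # shape (2, 1, 4, 1)

# Emulating keepdims by hand, local to the reduction:
# (2, 4) -> (2, 1, 4) -> (2, 1, 4, 1)
manual = np.expand_dims(np.expand_dims(reduced, 1), 3)

# Either result broadcasts against the original 4D array.
centered = a - kept
print(reduced.shape, kept.shape, manual.shape, centered.shape)
```

Simply prepending axes to `reduced` would give shape (1, 1, 2, 4), which has the right number of dimensions but does not line up with the axes of `a`.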
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Jul 5, 2016 11:21 PM, "Ralf Gommers" wrote:
>
> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote:
>>
>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote:
>> >
>> > Hi,
>> >
>> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
>> > function np.atleast_nd in PR#7804
>> > (https://github.com/numpy/numpy/pull/7804).
>> >
>> > As a result of this PR, I have a couple of questions about
>> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
>> > the dimensions: If the input is 1D, it prepends and appends a size-1
>> > dimension. If the input is 2D, it appends a size-1 dimension. This is
>> > inconsistent with `np.atleast_2d`, which always prepends (as does
>> > `np.atleast_nd`).
>> >
>> > - Is there any reason for this behavior?
>> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
>> > terms of `np.atleast_nd`, which is actually much simpler)? This would
>> > be a slight API change since the output would not be exactly the same.
>>
>> Changing atleast_3d seems likely to break a bunch of stuff...
>>
>> Beyond that, I find it hard to have an opinion about the best design for these functions, because I don't think I've ever encountered a situation where they were actually what I wanted. I'm not a big fan of coercing dimensions in the first place, for the usual "refuse to guess" reasons. And then generally if I do want to coerce an array to another dimension, then I have some opinion about where the new dimensions should go, and/or I have some opinion about the minimum acceptable starting dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that requirements list.)
>>
>> I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further?
>>
>> Or maybe they're totally useful and I'm just missing it. What's your use case that motivates atleast_nd?
>
> I think you're just missing it:) atleast_1d/2d are used quite a bit in Scipy and Statsmodels (those are the only ones I checked), and in the large majority of cases it's the best thing to use there. There's a bunch of atleast_2d calls with a transpose appended because the input needs to be treated as columns instead of rows, but that's still efficient and readable enough.

I know people *use* it :-). What I'm confused about is in what situations you would invent it if it didn't exist. Can you point me to an example or two where it's "the best thing"? I actually had statsmodels in mind with my example of wanting the semantics "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error". I'm surprised if there are places where you really want 0d arrays converted into 1x1, or want to allow high dimensional arrays to pass through - and if you do want to allow high dimensional arrays to pass through, then transposing might help with 2d cases but will silently mangle high-d cases, right?

-n
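The stricter semantics quoted above ("coerce 1d inputs into a column matrix; 0d or 3d inputs are an error") could look roughly like the sketch below. The function name `as_column_matrix` is hypothetical and does not exist in NumPy, SciPy, or statsmodels.

```python
import numpy as np

def as_column_matrix(x):
    """Return `x` as a 2-D array, turning 1-D input into a single column.

    Hypothetical helper: anything that is not 1-D or 2-D is rejected
    instead of being silently reshaped.
    """
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]  # (n,) -> (n, 1)
    if x.ndim == 2:
        return x                 # already a matrix, left untouched
    raise ValueError("expected 1-D or 2-D input, got %d-D" % x.ndim)

print(as_column_matrix([1, 2, 3]).shape)        # (3, 1)
print(as_column_matrix(np.ones((2, 5))).shape)  # (2, 5)
```

By contrast, `np.atleast_2d` turns the same 1-D input into a row of shape (1, 3) and accepts 0-D and 3-D input without complaint.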
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, 2016-07-06 at 10:22 -0400, Marten van Kerkwijk wrote:
> Hi All,
>
> I'm with Nathaniel here, in that I don't really see the point of
> these routines in the first place: broadcasting takes care of many of
> the initial use cases one might think of, and others are generally
> not all that well served by them: the examples from scipy to me do
> not really support `atleast_?d`, but rather suggest that little
> thought has been put into higher-dimensional objects which should be
> treated as stacks of row or column vectors. My sense is that we're
> better off developing the direction started with `matmul`, perhaps
> adding `matvecmul` etc.
>
> More to the point of the initial inquiry: what is the advantage of
> having a general `np.atleast_nd` routine over doing

There is another wonky reason for using the atleast_?d functions, in that they use reshape to be fully duck typed ;) (in newer versions at least, probably mostly for sparse arrays, not sure).

Tend to agree though, especially considering the confusing order of 3d, which I suppose is likely due to some linalg considerations. Of course you could supply something like an insertion order of (1, 0, 2) to denote the current 3D case in the nd one, but frankly it seems to me likely harder to understand how it works than to write your own functions to just do it.

Scipy uses the 3D case exactly never (once in a test). I have my doubts many would notice if we deprecate the 3D case, but then it is likely more trouble than gain.

- Sebastian

> ```
> np.array(a, copy=False, ndmin=n)
> ```
> or, for a list of inputs,
> ```
> [np.array(a, copy=False, ndmin=n) for a in input_list]
> ```
>
> All the best,
>
> Marten
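One possible reading of the "insertion order" idea, sketched under an assumption: for an input with k dimensions, the entries `order[k:]` are taken as the output positions that receive new size-1 axes, and the existing axes keep their relative order. The helper name `atleast_nd_ordered` is hypothetical and nothing like it exists in NumPy.

```python
import numpy as np

def atleast_nd_ordered(a, order):
    """Pad `a` to len(order) dimensions according to an insertion order.

    Hypothetical helper, not a NumPy function. For a k-dimensional
    input, the positions listed in order[k:] get new size-1 axes; the
    existing axes fill the remaining positions in their original order.
    """
    a = np.asarray(a)
    n = len(order)
    if a.ndim >= n:
        return a
    new_positions = set(order[a.ndim:])
    old_sizes = iter(a.shape)
    shape = tuple(1 if i in new_positions else next(old_sizes)
                  for i in range(n))
    return a.reshape(shape)

# Under this reading, (1, 0, 2) reproduces np.atleast_3d.
for x in (np.float64(7.0), np.arange(4), np.ones((3, 4)), np.ones((2, 3, 4))):
    print(atleast_nd_ordered(x, (1, 0, 2)).shape, np.atleast_3d(x).shape)
```

With the same rule, `(2, 1, 0)` prepends every new axis and `(0, 1, 2)` appends them, which suggests how much a reader has to work out from the tuple alone.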
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
We use np.atleast_2d extensively in scikit-image, and I also use it in a *lot* of my own code now that scikit-learn stopped accepting 1D arrays as feature vectors.

> what is the advantage of `np.atleast_nd` over `np.array(a, copy=False, ndmin=n)`

Readability, clearly.

My only concern is the described behavior of np.atleast_3d, which came as a surprise. I certainly would expect the “atleast” family to all work in the same way as broadcasting, i.e. prepending singleton dimensions. Prepend/append behavior can be controlled either by keyword or simply by using .T; I don’t mind either way.

Juan.

On 6 July 2016 at 10:22:15 AM, Marten van Kerkwijk (m.h.vankerkw...@gmail.com) wrote:
> Hi All,
>
> I'm with Nathaniel here, in that I don't really see the point of these routines in the first place: broadcasting takes care of many of the initial use cases one might think of, and others are generally not all that well served by them: the examples from scipy to me do not really support `atleast_?d`, but rather suggest that little thought has been put into higher-dimensional objects which should be treated as stacks of row or column vectors. My sense is that we're better off developing the direction started with `matmul`, perhaps adding `matvecmul` etc.
>
> More to the point of the initial inquiry: what is the advantage of having a general `np.atleast_nd` routine over doing
> ```
> np.array(a, copy=False, ndmin=n)
> ```
> or, for a list of inputs,
> ```
> [np.array(a, copy=False, ndmin=n) for a in input_list]
> ```
>
> All the best,
>
> Marten
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
Hi All,

I'm with Nathaniel here, in that I don't really see the point of these routines in the first place: broadcasting takes care of many of the initial use cases one might think of, and others are generally not all that well served by them: the examples from scipy to me do not really support `atleast_?d`, but rather suggest that little thought has been put into higher-dimensional objects which should be treated as stacks of row or column vectors. My sense is that we're better off developing the direction started with `matmul`, perhaps adding `matvecmul` etc.

More to the point of the initial inquiry: what is the advantage of having a general `np.atleast_nd` routine over doing
```
np.array(a, copy=False, ndmin=n)
```
or, for a list of inputs,
```
[np.array(a, copy=False, ndmin=n) for a in input_list]
```

All the best,

Marten
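A quick check of what the `np.array` alternative does, assuming the keyword meant here is `ndmin` (the spelling `np.array` actually accepts). The inputs are arbitrary, and `copy=False` is left out of the sketch because its meaning differs between NumPy versions.

```python
import numpy as np

# ndmin prepends size-1 axes until the minimum dimensionality is reached.
padded = np.array([1, 2, 3], ndmin=3)
print(padded.shape)     # (1, 1, 3)

# Input that already has enough dimensions keeps its shape.
unchanged = np.array(np.ones((2, 3, 4, 5)), ndmin=3)
print(unchanged.shape)  # (2, 3, 4, 5)

# The list form from the message, applied to mixed inputs.
input_list = [5.0, [1, 2], np.ones((2, 3))]
shapes = [np.array(a, ndmin=2).shape for a in input_list]
print(shapes)           # [(1, 1), (1, 2), (2, 3)]
```

So `ndmin` covers the prepend-only case; it offers no way to append axes or place them elsewhere.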
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
While atleast_1d/2d/3d predates my involvement in numpy, I am probably partly to blame for popularizing them as I helped to fix them up a fair amount. I wouldn't call its use "guessing". Rather, I would treat them as useful input sanitizers. If your function is going to be doing 2d indexing on an input, then it is very convenient to have atleast_2d() at the top of your function, not only to sanitize the input, but to make it clear that your code expects at least two dimensions. One place where it is used is in np.loadtxt(..., ndmin=N) to protect against the situation of a single row of data becoming a 1-D array rather than a 2-D array (or an empty text file returning something completely useless). I have previously pointed out the oddity with atleast_3d(). I can't remember the explanation I got though. Maybe someone can find the old thread that has the explanation, if any? I think the keyword argument approach for controlling the behavior might be a good approach, provided that a suitable design could be devised. 1 & 2 dimensions is fairly trivial to control, but 3+ dimensions has too many degrees of freedom for me to consider. Cheers! Ben Root On Wed, Jul 6, 2016 at 9:12 AM, Joseph Fox-Rabinovitz < jfoxrabinov...@gmail.com> wrote: > I can add a keyword-only argument that lets you put the new dims > before or after the existing ones. I am not sure how to specify > arbitrary patterns for the new dimensions, but that should take care > of most use cases. > > The use case that motivated this function in the first place is that I > am doing some processing on 4D arrays and I need to reduce them but > return a result with the original dimensionality (but not shape). > atleast_nd seemed like a better solution than atleast_4d. 
> > -Joe > > > On Wed, Jul 6, 2016 at 3:41 AM,wrote: > > > > > > On Wed, Jul 6, 2016 at 3:29 AM, wrote: > >> > >> > >> > >> On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers > >> wrote: > >>> > >>> > >>> > >>> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote: > >>> > On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" > wrote: > > > > Hi, > > > > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with > a > > function np.atleast_nd in PR#7804 > > (https://github.com/numpy/numpy/pull/7804). > > > > As a result of this PR, I have a couple of questions about > > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with > > the dimensions: If the input is 1D, it prepends and appends a size-1 > > dimension. If the input is 2D, it appends a size-1 dimension. This > is > > inconsistent with `np.atleast_2d`, which always prepends (as does > > `np.atleast_nd`). > > > > - Is there any reason for this behavior? > > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in > > terms of `np.atleast_nd`, which is actually much simpler)? This > would > > be a slight API change since the output would not be exactly the > same. > > Changing atleast_3d seems likely to break a bunch of stuff... > > Beyond that, I find it hard to have an opinion about the best design > for > these functions, because I don't think I've ever encountered a > situation > where they were actually what I wanted. I'm not a big fan of coercing > dimensions in the first place, for the usual "refuse to guess" > reasons. And > then generally if I do want to coerce an array to another dimension, > then I > have some opinion about where the new dimensions should go, and/or I > have > some opinion about the minimum acceptable starting dimension, and/or > I have > a maximum dimension in mind. (E.g. "coerce 1d inputs into a column > matrix; > 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that > requirements list.) > > I don't know how typical I am in this. 
But it does make me wonder if > the > atleast_* functions act as an attractive nuisance, where new users > take > their presence as an implicit recommendation that they are actually a > useful > thing to reach for, even though they... aren't that. And maybe we > should be > recommending folk move away from them rather than trying to extend > them > further? > > Or maybe they're totally useful and I'm just missing it. What's your > use > case that motivates atleast_nd? > >>> > >>> I think you're just missing it:) atleast_1d/2d are used quite a bit in > >>> Scipy and Statsmodels (those are the only ones I checked), and in the > large > >>> majority of cases it's the best thing to use there. There's a bunch of > >>> atleast_2d calls with a transpose appended because the input needs to > be > >>> treated as columns instead of rows, but that's still efficient and > readable > >>> enough. > >> > >> > >> > >>
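The input-sanitizer pattern and the `np.loadtxt(..., ndmin=N)` case that Ben describes at the top of this message can be sketched as follows. The function `column_sums` and the one-line data are made up for illustration.

```python
import io
import numpy as np

# A text "file" with a single row of data.
one_row = "1 2 3\n"

default = np.loadtxt(io.StringIO(one_row))           # collapses to 1-D
as_rows = np.loadtxt(io.StringIO(one_row), ndmin=2)  # stays 2-D
print(default.shape, as_rows.shape)                  # (3,) (1, 3)

def column_sums(table):
    """Sum each column of a table given as rows of values."""
    # Sanitize first: the 2-D indexing below expects at least two dimensions.
    table = np.atleast_2d(table)
    return table[:, :].sum(axis=0)

print(column_sums([1, 2, 3]))        # a single row is treated as a 1 x 3 table
print(column_sums(np.ones((4, 3))))
```

The call at the top of the function doubles as documentation of what the body expects, which is the readability argument made in this message.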
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
I can add a keyword-only argument that lets you put the new dims before or after the existing ones. I am not sure how to specify arbitrary patterns for the new dimensions, but that should take care of most use cases. The use case that motivated this function in the first place is that I am doing some processing on 4D arrays and I need to reduce them but return a result with the original dimensionality (but not shape). atleast_nd seemed like a better solution than atleast_4d. -Joe On Wed, Jul 6, 2016 at 3:41 AM,wrote: > > > On Wed, Jul 6, 2016 at 3:29 AM, wrote: >> >> >> >> On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers >> wrote: >>> >>> >>> >>> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote: >>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote: > > Hi, > > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a > function np.atleast_nd in PR#7804 > (https://github.com/numpy/numpy/pull/7804). > > As a result of this PR, I have a couple of questions about > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with > the dimensions: If the input is 1D, it prepends and appends a size-1 > dimension. If the input is 2D, it appends a size-1 dimension. This is > inconsistent with `np.atleast_2d`, which always prepends (as does > `np.atleast_nd`). > > - Is there any reason for this behavior? > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in > terms of `np.atleast_nd`, which is actually much simpler)? This would > be a slight API change since the output would not be exactly the same. Changing atleast_3d seems likely to break a bunch of stuff... Beyond that, I find it hard to have an opinion about the best design for these functions, because I don't think I've ever encountered a situation where they were actually what I wanted. I'm not a big fan of coercing dimensions in the first place, for the usual "refuse to guess" reasons. 
And then generally if I do want to coerce an array to another dimension, then I have some opinion about where the new dimensions should go, and/or I have some opinion about the minimum acceptable starting dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that requirements list.) I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further? Or maybe they're totally useful and I'm just missing it. What's your use case that motivates atleast_nd? >>> >>> I think you're just missing it:) atleast_1d/2d are used quite a bit in >>> Scipy and Statsmodels (those are the only ones I checked), and in the large >>> majority of cases it's the best thing to use there. There's a bunch of >>> atleast_2d calls with a transpose appended because the input needs to be >>> treated as columns instead of rows, but that's still efficient and readable >>> enough. >> >> >> >> As Ralph pointed out its usage in statsmodels. I do find them useful as >> replacement for several lines of ifs and reshapes >> >> We stilll need in many cases the atleast_2d_cols, that appends the newaxis >> if necessary. >> >> roughly the equivalent of >> >> if x.ndim == 1: >> x = x[:, None] >> else: >> x = np.atleast_2d(x) >> >> Josef >> >>> >>> >>> For 3D/nD I can see that you'd need more control over where the >>> dimensions go, but 1D/2D are fine. 
> > > > statsmodels has currently very little code with ndim >2, so I have no > overview of possible use cases, but it would be necessary to have full > control over the added axis since axis have a strict meaning and stats still > prefer Fortran order to default numpy/C ordering. > > Josef > > >>> >>> >>> >>> Ralf >>> >>> >>> ___ >>> NumPy-Discussion mailing list >>> NumPy-Discussion@scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
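Josef's quoted snippet for `atleast_2d_cols`, laid out as a runnable function. This is only a sketch built from the lines in the message; the wrapper around them (the `np.asarray` call and the function signature) is an assumption and may differ from what statsmodels actually ships.

```python
import numpy as np

def atleast_2d_cols(x):
    """Like np.atleast_2d, but 1-D input becomes a column, not a row.

    Sketch based on the snippet quoted in the thread; the surrounding
    function body is assumed for illustration.
    """
    x = np.asarray(x)
    if x.ndim == 1:
        x = x[:, None]         # append the new axis: (n,) -> (n, 1)
    else:
        x = np.atleast_2d(x)   # 0-D -> (1, 1); 2-D and higher unchanged
    return x

print(atleast_2d_cols([1, 2, 3]).shape)        # (3, 1)
print(atleast_2d_cols(4.0).shape)              # (1, 1)
print(atleast_2d_cols(np.ones((5, 2))).shape)  # (5, 2)
```

This is the "append for 1-D only" rule that neither `np.atleast_2d` nor a transpose after it expresses directly.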
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 3:29 AM, wrote:

> On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers wrote:
>
>> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote:
>>
>>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote:
>>> >
>>> > Hi,
>>> >
>>> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
>>> > function np.atleast_nd in PR#7804
>>> > (https://github.com/numpy/numpy/pull/7804).
>>> >
>>> > As a result of this PR, I have a couple of questions about
>>> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
>>> > the dimensions: If the input is 1D, it prepends and appends a size-1
>>> > dimension. If the input is 2D, it appends a size-1 dimension. This is
>>> > inconsistent with `np.atleast_2d`, which always prepends (as does
>>> > `np.atleast_nd`).
>>> >
>>> > - Is there any reason for this behavior?
>>> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
>>> > terms of `np.atleast_nd`, which is actually much simpler)? This would
>>> > be a slight API change since the output would not be exactly the same.
>>>
>>> Changing atleast_3d seems likely to break a bunch of stuff...
>>>
>>> Beyond that, I find it hard to have an opinion about the best design for
>>> these functions, because I don't think I've ever encountered a situation
>>> where they were actually what I wanted. I'm not a big fan of coercing
>>> dimensions in the first place, for the usual "refuse to guess" reasons. And
>>> then generally if I do want to coerce an array to another dimension, then I
>>> have some opinion about where the new dimensions should go, and/or I have
>>> some opinion about the minimum acceptable starting dimension, and/or I have
>>> a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix;
>>> 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that
>>> requirements list.)
>>>
>>> I don't know how typical I am in this. But it does make me wonder if the
>>> atleast_* functions act as an attractive nuisance, where new users take
>>> their presence as an implicit recommendation that they are actually a
>>> useful thing to reach for, even though they... aren't that. And maybe we
>>> should be recommending folk move away from them rather than trying to
>>> extend them further?
>>>
>>> Or maybe they're totally useful and I'm just missing it. What's your use
>>> case that motivates atleast_nd?
>>
>> I think you're just missing it :) atleast_1d/2d are used quite a bit in
>> Scipy and Statsmodels (those are the only ones I checked), and in the large
>> majority of cases it's the best thing to use there. There's a bunch of
>> atleast_2d calls with a transpose appended because the input needs to be
>> treated as columns instead of rows, but that's still efficient and readable
>> enough.
>
> As Ralf pointed out, they are used in statsmodels. I do find them useful as
> a replacement for several lines of ifs and reshapes.
>
> We still need, in many cases, atleast_2d_cols, which appends the newaxis
> if necessary. It is roughly the equivalent of
>
>     if x.ndim == 1:
>         x = x[:, None]
>     else:
>         x = np.atleast_2d(x)
>
> Josef
>
>> For 3D/nD I can see that you'd need more control over where the
>> dimensions go, but 1D/2D are fine.

statsmodels currently has very little code with ndim > 2, so I have no
overview of possible use cases, but it would be necessary to have full
control over the added axes, since axes have a strict meaning and stats
still prefers Fortran order to the default numpy/C ordering.

Josef

>> Ralf

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
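For the explicit control over added axes asked for above, a minimal sketch using the existing `np.expand_dims` and `np.newaxis` might look like the following. The array `x` and the chosen axis positions are illustrative assumptions, not code taken from statsmodels.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)  # illustrative 2D input, shape (2, 3)

# Place the new size-1 axis explicitly instead of letting atleast_3d choose.
front = np.expand_dims(x, axis=0)    # shape (1, 2, 3)
middle = np.expand_dims(x, axis=1)   # shape (2, 1, 3)
back = x[:, :, np.newaxis]           # shape (2, 3, 1)

print(front.shape, middle.shape, back.shape)
```

Only the last variant matches what `np.atleast_3d` does for 2D input; the other two placements have to be requested explicitly.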
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers wrote:

> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote:
>
>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote:
>> >
>> > Hi,
>> >
>> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
>> > function np.atleast_nd in PR#7804
>> > (https://github.com/numpy/numpy/pull/7804).
>> >
>> > As a result of this PR, I have a couple of questions about
>> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
>> > the dimensions: If the input is 1D, it prepends and appends a size-1
>> > dimension. If the input is 2D, it appends a size-1 dimension. This is
>> > inconsistent with `np.atleast_2d`, which always prepends (as does
>> > `np.atleast_nd`).
>> >
>> > - Is there any reason for this behavior?
>> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
>> > terms of `np.atleast_nd`, which is actually much simpler)? This would
>> > be a slight API change since the output would not be exactly the same.
>>
>> Changing atleast_3d seems likely to break a bunch of stuff...
>>
>> Beyond that, I find it hard to have an opinion about the best design for
>> these functions, because I don't think I've ever encountered a situation
>> where they were actually what I wanted. I'm not a big fan of coercing
>> dimensions in the first place, for the usual "refuse to guess" reasons. And
>> then generally if I do want to coerce an array to another dimension, then I
>> have some opinion about where the new dimensions should go, and/or I have
>> some opinion about the minimum acceptable starting dimension, and/or I have
>> a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix;
>> 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that
>> requirements list.)
>>
>> I don't know how typical I am in this. But it does make me wonder if the
>> atleast_* functions act as an attractive nuisance, where new users take
>> their presence as an implicit recommendation that they are actually a
>> useful thing to reach for, even though they... aren't that. And maybe we
>> should be recommending folk move away from them rather than trying to
>> extend them further?
>>
>> Or maybe they're totally useful and I'm just missing it. What's your use
>> case that motivates atleast_nd?
>
> I think you're just missing it :) atleast_1d/2d are used quite a bit in
> Scipy and Statsmodels (those are the only ones I checked), and in the large
> majority of cases it's the best thing to use there. There's a bunch of
> atleast_2d calls with a transpose appended because the input needs to be
> treated as columns instead of rows, but that's still efficient and readable
> enough.

As Ralf pointed out, they are used in statsmodels. I do find them useful as
a replacement for several lines of ifs and reshapes.

We still need, in many cases, atleast_2d_cols, which appends the newaxis
if necessary. It is roughly the equivalent of

    if x.ndim == 1:
        x = x[:, None]
    else:
        x = np.atleast_2d(x)

Josef

> For 3D/nD I can see that you'd need more control over where the dimensions
> go, but 1D/2D are fine.
>
> Ralf
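The snippet in this message can be wrapped as a function. This is a sketch only: the name `atleast_2d_cols` comes from the message, but the body below, including the `np.asarray` call added so that plain lists are accepted, is an assumption and not the statsmodels implementation.

```python
import numpy as np

def atleast_2d_cols(x):
    """Return x with at least 2 dimensions, treating 1D input as a column."""
    x = np.asarray(x)
    if x.ndim == 1:
        # Append the new axis: shape (n,) becomes (n, 1).
        return x[:, None]
    # 0D input becomes (1, 1); 2D and higher inputs are returned unchanged.
    return np.atleast_2d(x)

print(atleast_2d_cols([1, 2, 3]).shape)        # (3, 1)
print(atleast_2d_cols(np.ones((4, 2))).shape)  # (4, 2)
print(np.atleast_2d([1, 2, 3]).shape)          # (1, 3)
```

Unlike `np.atleast_2d(x).T`, this leaves input that is already 2D untouched, which is presumably why a separate helper is wanted.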
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith wrote:

> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote:
> >
> > Hi,
> >
> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
> > function np.atleast_nd in PR#7804
> > (https://github.com/numpy/numpy/pull/7804).
> >
> > As a result of this PR, I have a couple of questions about
> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
> > the dimensions: If the input is 1D, it prepends and appends a size-1
> > dimension. If the input is 2D, it appends a size-1 dimension. This is
> > inconsistent with `np.atleast_2d`, which always prepends (as does
> > `np.atleast_nd`).
> >
> > - Is there any reason for this behavior?
> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
> > terms of `np.atleast_nd`, which is actually much simpler)? This would
> > be a slight API change since the output would not be exactly the same.
>
> Changing atleast_3d seems likely to break a bunch of stuff...
>
> Beyond that, I find it hard to have an opinion about the best design for
> these functions, because I don't think I've ever encountered a situation
> where they were actually what I wanted. I'm not a big fan of coercing
> dimensions in the first place, for the usual "refuse to guess" reasons. And
> then generally if I do want to coerce an array to another dimension, then I
> have some opinion about where the new dimensions should go, and/or I have
> some opinion about the minimum acceptable starting dimension, and/or I have
> a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix;
> 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that
> requirements list.)
>
> I don't know how typical I am in this. But it does make me wonder if the
> atleast_* functions act as an attractive nuisance, where new users take
> their presence as an implicit recommendation that they are actually a
> useful thing to reach for, even though they... aren't that. And maybe we
> should be recommending folk move away from them rather than trying to
> extend them further?
>
> Or maybe they're totally useful and I'm just missing it. What's your use
> case that motivates atleast_nd?

I think you're just missing it :) atleast_1d/2d are used quite a bit in
Scipy and Statsmodels (those are the only ones I checked), and in the large
majority of cases it's the best thing to use there. There's a bunch of
atleast_2d calls with a transpose appended because the input needs to be
treated as columns instead of rows, but that's still efficient and readable
enough.

For 3D/nD I can see that you'd need more control over where the dimensions
go, but 1D/2D are fine.

Ralf
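The "atleast_2d with a transpose appended" pattern mentioned here can be sketched as follows. The arrays are made-up examples, not code from Scipy or Statsmodels.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # 1D input, shape (3,)

row = np.atleast_2d(x)          # prepends an axis: shape (1, 3)
col = np.atleast_2d(x).T        # transposed: shape (3, 1)

# Caveat: for input that is already 2D, the transpose swaps its axes.
m = np.ones((4, 2))
swapped = np.atleast_2d(m).T    # shape (2, 4)

print(row.shape, col.shape, swapped.shape)
```

The transpose returns a view rather than a copy, which is consistent with the remark that the pattern is still efficient.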
Re: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" wrote:
>
> Hi,
>
> I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
> function np.atleast_nd in PR#7804
> (https://github.com/numpy/numpy/pull/7804).
>
> As a result of this PR, I have a couple of questions about
> `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
> the dimensions: If the input is 1D, it prepends and appends a size-1
> dimension. If the input is 2D, it appends a size-1 dimension. This is
> inconsistent with `np.atleast_2d`, which always prepends (as does
> `np.atleast_nd`).
>
> - Is there any reason for this behavior?
> - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
> terms of `np.atleast_nd`, which is actually much simpler)? This would
> be a slight API change since the output would not be exactly the same.

Changing atleast_3d seems likely to break a bunch of stuff...

Beyond that, I find it hard to have an opinion about the best design for
these functions, because I don't think I've ever encountered a situation
where they were actually what I wanted. I'm not a big fan of coercing
dimensions in the first place, for the usual "refuse to guess" reasons. And
then generally if I do want to coerce an array to another dimension, then I
have some opinion about where the new dimensions should go, and/or I have
some opinion about the minimum acceptable starting dimension, and/or I have
a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix;
0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that
requirements list.)

I don't know how typical I am in this. But it does make me wonder if the
atleast_* functions act as an attractive nuisance, where new users take
their presence as an implicit recommendation that they are actually a
useful thing to reach for, even though they... aren't that. And maybe we
should be recommending folk move away from them rather than trying to
extend them further?

Or maybe they're totally useful and I'm just missing it. What's your use
case that motivates atleast_nd?

-n
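The shape behavior under discussion, and the stricter coercion described in the "zero-for-three" example, can be sketched as below. The helper `as_column` is hypothetical and exists only for illustration; it is not a NumPy function.

```python
import numpy as np

v = np.arange(3)        # 1D, shape (3,)
m = np.ones((2, 3))     # 2D, shape (2, 3)

print(np.atleast_2d(v).shape)  # (1, 3): axis prepended
print(np.atleast_3d(v).shape)  # (1, 3, 1): one axis prepended, one appended
print(np.atleast_3d(m).shape)  # (2, 3, 1): axis appended

def as_column(x):
    """Hypothetical strict helper: 1D -> column, 2D unchanged, else error."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]
    if x.ndim == 2:
        return x
    raise ValueError(f"expected 1D or 2D input, got {x.ndim}D")

print(as_column(v).shape)      # (3, 1)
```

By contrast, `np.atleast_2d` turns the 1D input into a row and accepts 0D and 3D input without complaint, which is the mismatch the message points at.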