Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-18 Thread Oscar Benjamin
On Thu, 18 Feb 2021 at 10:11, Ralf Gommers  wrote:
>
>
>
> On Wed, Feb 17, 2021 at 9:26 PM Oscar Benjamin  
> wrote:
>>
>> On Wed, 17 Feb 2021 at 10:36, Ralf Gommers  wrote:
>> >
>> > On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt 
>> >  wrote:
>> >>
>> >> Ralf has been working towards this idea, but having a well-organised 
>> >> namespace of utility functions outside of the core NumPy API would be 
>> >> helpful in allowing expansion and experimentation, without making the 
>> >> current situation worse (where we effectively have to support things 
>> >> forever).  As an example, take Cartesian product [0] and array 
>> >> combinations [1], which have been requested several times on 
>> >> StackOverflow, but there's nowhere to put them.
>> >
>> > This is a good point. If we could put it in `numpy.lib` without it 
>> > bleeding into the main namespace, saying yes here would be easier. Maybe 
>> > we can give it a conditional yes based on that namespace reorganization?
>>
>> As an aside is this numpy.lib idea explained anywhere?
>
>
> It isn't, but it's relatively straightforward and can be done without 
> thinking about the issues around our other namespaces. Basically:
> - today `numpy.lib` is a public but fairly useless namespace, because its 
> contents get star-imported to the main namespace; only subsubmodules like 
> `numpy.lib.stride_tricks` are separate

Okay, that's a bit different from what I was thinking of for sympy.
The problem for sympy is that everything is either in the top-level
sympy namespace or is just directly imported from the module where it
is defined. That means that there is no proper separation between
public and private apart from being in the top-level namespace which
is already bloated on the one hand and incomplete on the other since
we obviously can't put *everything* there.

Even something as simple as deleting a no longer needed internal
function or renaming an "internal" module is potentially problematic
in sympy. I was thinking about having a sympy.public module (and
submodules) and documenting that as the expected public interface for
importing *anything* from sympy. Potentially that could be called
sympy.lib which would seem consistent with numpy although having the
same name could be problematic if the intent is not necessarily the
same.

> - we want to stop this star-importing, which required some tedious work of
> fixing up how we handle __all__ lists in addition to making exports explicit
> - then, we would like to use `numpy.lib` as a namespace for utilities and
> assorted functionality that people seem to want, but does not meet the bar
> for the main namespace and doesn't fit in our other decent namespaces (fft,
> linalg, random, polynomial, f2py, distutils).
> - TBD if there should be subsubmodules under `numpy.lib` or not
> - it should be explicitly documented that this is a "lower bar namespace" and 
> that we discourage other array/tensor libraries from copying its API
>
> We had a good discussion about this in the community meeting yesterday. 
> Sebastian volunteered to sort out the star-import issue.

I already removed all the star-imports from sympy which was somewhat tedious.

Sebastian you might be interested in the script I wrote below. It
extracts all of the star-imported names from a module and formats the
__all__ and import lines for the __init__.py file. I used it to create
e.g. this:
https://github.com/sympy/sympy/blob/master/sympy/__init__.py#L51-L491
I think that flake8 spots if the import list and the __all__ get out
of sync so it's not so hard to maintain later on.

You just tell the script what package the __init__.py is and what
submodules to import like:

$ my/fmt_imports.py numpy.lib type_check index_tricks
__all__ = [
    'iscomplexobj', 'isrealobj', 'imag', 'iscomplex', 'isreal', 'nan_to_num',
    'real', 'real_if_close', 'typename', 'asfarray', 'mintypecode',
    'asscalar', 'common_type',

    'ravel_multi_index', 'unravel_index', 'mgrid', 'ogrid', 'r_', 'c_', 's_',
    'index_exp', 'ix_', 'ndenumerate', 'ndindex', 'fill_diagonal',
    'diag_indices', 'diag_indices_from',
]
from .type_check import (iscomplexobj, isrealobj, imag, iscomplex, isreal,
        nan_to_num, real, real_if_close, typename, asfarray, mintypecode,
        asscalar, common_type)

from .index_tricks import (ravel_multi_index, unravel_index, mgrid, ogrid, r_,
        c_, s_, index_exp, ix_, ndenumerate, ndindex, fill_diagonal,
        diag_indices, diag_indices_from)


The script is:

#!/usr/bin/env python

from __future__ import print_function
from importlib import import_module

import __future__
future_imports = dir(__future__)

def main(pkgname, *submodules):
    imports = find_imports(pkgname, submodules)
    pretty_all(imports, submodules)
    pretty_imports(imports, submodules)

def find_imports(pkgname, submodules):
    imports = {}
    for modname in submodules:
        modpath = pkgname + '.' + modname
        mod = import_module(modpath)
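
The script above breaks off at this point in the archive. As a rough, independent illustration of the same idea (this is not Oscar's code; the helper names `public_names`, `format_all` and `format_import` are made up here, and the standard-library `string` module merely stands in for a real submodule), one might collect the names a star-import would bind and print them as an explicit `__all__` plus import line:

```python
from importlib import import_module
import textwrap


def public_names(modpath):
    """Return the names that `from modpath import *` would bind."""
    mod = import_module(modpath)
    names = getattr(mod, "__all__", None)
    if names is None:
        # Without __all__, a star-import takes every name that does
        # not start with an underscore.
        names = [n for n in vars(mod) if not n.startswith("_")]
    return list(names)


def format_all(names, width=79):
    """Format names as a wrapped `__all__ = [...]` block."""
    body = ", ".join(repr(n) for n in names) + ","
    wrapped = textwrap.fill(body, width=width, initial_indent="    ",
                            subsequent_indent="    ")
    return "__all__ = [\n" + wrapped + "\n]"


def format_import(modname, names, width=79):
    """Format an explicit relative import replacing the star-import."""
    text = "from .{} import (".format(modname) + ", ".join(names) + ")"
    return textwrap.fill(text, width=width, subsequent_indent="        ")


# `string` is only a convenient stand-in module that defines __all__.
names = public_names("string")
print(format_all(names))
print(format_import("string", names))
```

The printed text is meant to be pasted into an `__init__.py`; a linter can then flag the two lists drifting apart, as mentioned above.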
   

Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-18 Thread Ralf Gommers
On Wed, Feb 17, 2021 at 9:26 PM Oscar Benjamin 
wrote:

> On Wed, 17 Feb 2021 at 10:36, Ralf Gommers  wrote:
> >
> > On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt <
> stef...@berkeley.edu> wrote:
> >>
> >> Ralf has been working towards this idea, but having a well-organised
> namespace of utility functions outside of the core NumPy API would be
> helpful in allowing expansion and experimentation, without making the
> current situation worse (where we effectively have to support things
> forever).  As an example, take Cartesian product [0] and array combinations
> [1], which have been requested several times on StackOverflow, but there's
> nowhere to put them.
> >
> > This is a good point. If we could put it in `numpy.lib` without it
> bleeding into the main namespace, saying yes here would be easier. Maybe we
> can give it a conditional yes based on that namespace reorganization?
>
> As an aside is this numpy.lib idea explained anywhere?
>

It isn't, but it's relatively straightforward and can be done without
thinking about the issues around our other namespaces. Basically:
- today `numpy.lib` is a public but fairly useless namespace, because its
contents get star-imported to the main namespace; only subsubmodules like
`numpy.lib.stride_tricks` are separate
- we want to stop this star-importing, which required some tedious work of
fixing up how we handle __all__ lists in addition to making exports explicit
- then, we would like to use `numpy.lib` as a namespace for utilities and
assorted functionality that people seem to want, but does not meet the bar
for the main namespace and doesn't fit in our other decent namespaces (fft,
linalg, random, polynomial, f2py, distutils).
- TBD if there should be subsubmodules under `numpy.lib` or not
- it should be explicitly documented that this is a "lower bar namespace"
and that we discourage other array/tensor libraries from copying its API

We had a good discussion about this in the community meeting yesterday.
Sebastian volunteered to sort out the star-import issue.
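
To make the star-import point concrete, here is a toy model. The module name `toy_lib` and its functions are invented for illustration, and this is not how numpy's own `__init__.py` is literally written. It shows how a star-import copies everything in a submodule's `__all__` into the importing namespace, whereas an explicit import re-exports only what is listed:

```python
import sys
import types

# Build a throwaway submodule in memory (all names here are hypothetical).
toy_lib = types.ModuleType("toy_lib")
exec(
    "__all__ = ['tile_twice']\n"
    "def tile_twice(seq): return list(seq) * 2\n"
    "def scratch(seq): return seq\n",
    toy_lib.__dict__,
)
sys.modules["toy_lib"] = toy_lib

# Star-import: every name in toy_lib.__all__ lands in the importer.
star_ns = {}
exec("from toy_lib import *", star_ns)

# Explicit import: the importer spells out exactly what it re-exports.
explicit_ns = {}
exec("from toy_lib import tile_twice\n__all__ = ['tile_twice']", explicit_ns)

print(sorted(n for n in star_ns if not n.startswith("__")))
print(explicit_ns["__all__"])
```

With the star-import, anything later added to `toy_lib.__all__` would silently appear in the importing namespace as well, which is the "bleeding" being discussed.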


> I've been thinking about something possibly similar for sympy which
> also has a bloated top-level namespace (and has no other place for
> public API to go).
>

A larger plan for cleaning up main namespace bloat, as well as dealing with
our unmaintained namespaces (numpy.dual, numpy.emath, etc.) is still needed.

Cheers,
Ralf
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-18 Thread Stephan Hoyer
On Wed, Feb 17, 2021 at 2:37 AM Ralf Gommers  wrote:

>
>
> On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt 
> wrote:
>
>> On Tue, Feb 16, 2021, at 07:49, Joseph Fox-Rabinovitz wrote:
>>
>> I'm getting a generally lukewarm not negative response. Should we put it
>> to a vote?
>>
>>
>> Things here don't typically get decided by vote—I think you'll have to
>> build towards consensus.  It may be overkill to write a NEP, but outlining
>> a proposed solution along with pros and cons and getting everyone on board
>> is necessary.
>>
>> The API surface is a touchy issue, and so it is difficult to get new
>> features like these added.
>>
>
> This function is less bad than most similar utility functions, because it
> starts with atleast_ so from a "function browsing" end user perspective
> it's not much additional clutter. But it does still force other libraries
> to do work because they aim to be compatible to numpy's main namespace
> (e.g. see jax.numpy).
>
> And there's 6-7 maintainers all not strongly opposed but also not
> enthusiastic.
>

I agree with Ralf's assessment.

This function feels like a natural generalization of existing NumPy
functionality, but we don't expand NumPy's API without use-cases. That's
just a waste of time for everyone involved.

I am most moved by Juan's report that he has the "very distinct impression
of needing it repeatedly," but I would still love to see concrete examples
of where users have found this to be helpful.

It is not a hard function to write, so if it was useful I would expect to
see some version of it in an existing open source project or at least on
StackOverflow.
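
For reference, a minimal sketch of what such a function could look like. The name `atleast_nd_sketch` and the exact meaning of `pos` are assumptions made for illustration (new unit axes go in as one block before index `pos` of the original shape); the semantics in the actual proposal may differ:

```python
import numpy as np


def atleast_nd_sketch(arr, ndim, pos=0):
    """Hypothetical helper: pad `arr` with unit axes up to `ndim` dimensions.

    Assumed convention: the new axes are inserted as one contiguous block
    before index `pos` of the original shape (0 <= pos <= arr.ndim).
    """
    arr = np.asanyarray(arr)
    missing = ndim - arr.ndim
    if missing <= 0:
        return arr
    if not 0 <= pos <= arr.ndim:
        raise ValueError("pos must be between 0 and arr.ndim")
    new_shape = arr.shape[:pos] + (1,) * missing + arr.shape[pos:]
    return arr.reshape(new_shape)


gray = np.zeros((4, 5))
vec = np.arange(5)

print(atleast_nd_sketch(gray, 3, pos=2).shape)   # (4, 5, 1)
print(atleast_nd_sketch(vec, 3, pos=0).shape)    # (1, 1, 5)

# Reproducing atleast_3d's 1D behaviour takes two calls under this convention.
two_step = atleast_nd_sketch(atleast_nd_sketch(vec, 2, pos=0), 3, pos=2)
print(two_step.shape, np.atleast_3d(vec).shape)  # (1, 5, 1) (1, 5, 1)
```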


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-17 Thread Oscar Benjamin
On Wed, 17 Feb 2021 at 10:36, Ralf Gommers  wrote:
>
> On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt  
> wrote:
>>
>> Ralf has been working towards this idea, but having a well-organised 
>> namespace of utility functions outside of the core NumPy API would be 
>> helpful in allowing expansion and experimentation, without making the 
>> current situation worse (where we effectively have to support things 
>> forever).  As an example, take Cartesian product [0] and array combinations 
>> [1], which have been requested several times on StackOverflow, but there's 
>> nowhere to put them.
>
> This is a good point. If we could put it in `numpy.lib` without it bleeding 
> into the main namespace, saying yes here would be easier. Maybe we can give 
> it a conditional yes based on that namespace reorganization?

As an aside is this numpy.lib idea explained anywhere?

I've been thinking about something possibly similar for sympy which
also has a bloated top-level namespace (and has no other place for
public API to go).


Oscar


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-17 Thread Ralf Gommers
On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt 
wrote:

> On Tue, Feb 16, 2021, at 07:49, Joseph Fox-Rabinovitz wrote:
>
> I'm getting a generally lukewarm not negative response. Should we put it
> to a vote?
>
>
> Things here don't typically get decided by vote—I think you'll have to
> build towards consensus.  It may be overkill to write a NEP, but outlining
> a proposed solution along with pros and cons and getting everyone on board
> is necessary.
>
> The API surface is a touchy issue, and so it is difficult to get new
> features like these added.
>

This function is less bad than most similar utility functions, because it
starts with atleast_ so from a "function browsing" end user perspective
it's not much additional clutter. But it does still force other libraries
to do work because they aim to be compatible to numpy's main namespace
(e.g. see jax.numpy).

And there's 6-7 maintainers all not strongly opposed but also not
enthusiastic.


> Ralf has been working towards this idea, but having a well-organised
> namespace of utility functions outside of the core NumPy API would be
> helpful in allowing expansion and experimentation, without making the
> current situation worse (where we effectively have to support things
> forever).  As an example, take Cartesian product [0] and array combinations
> [1], which have been requested several times on StackOverflow, but there's
> nowhere to put them.
>

This is a good point. If we could put it in `numpy.lib` without it bleeding
into the main namespace, saying yes here would be easier. Maybe we can give
it a conditional yes based on that namespace reorganization?

Cheers,
Ralf


> Stéfan
>
> [0]
> https://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays#comment22769580_1235363
>
> [1]
> https://stackoverflow.com/questions/16003217/n-d-version-of-itertools-combinations-in-numpy/16008578#16008578


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-16 Thread Joseph Fox-Rabinovitz
I'm getting a generally lukewarm not negative response. Should we put it to
a vote?

- Joe

On Fri, Feb 12, 2021, 16:06 Robert Kern  wrote:

> On Fri, Feb 12, 2021 at 3:42 PM Ralf Gommers 
> wrote:
>
>>
>> On Fri, Feb 12, 2021 at 9:21 PM Robert Kern 
>> wrote:
>>
>>> On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers 
>>> wrote:
>>>

>>>> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg <
>>>> sebast...@sipsolutions.net> wrote:
>>>>
>>>>>
>>>>> Right, my initial feeling is that without such context `atleast_3d` is
>>>>> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
>>>>> that it is explicit about this context.
>>>>>
>>>>
>>>> Agreed. I think such a use case is probably too specific to design a
>>>> single function for, at least in such a hardcoded way.

>>>
>>> That might be an argument for not designing a new one (or at least not
>>> giving it such a name). Not sure it's a good argument for removing a
>>> long-standing one.
>>>
>>
>> I agree. I'm not sure deprecating is best. But introducing new
>> functionality where `nd(pos=3) != 3d` is also not great.
>>
>> At the very least, atleast_3d should be better documented. It also is
>> telling that Juan (a long-time scikit-image dev) doesn't like atleast_3d
>> and there's very little usage of it in scikit-image.
>>
>
> I'm fairly neutral on atleast_nd(). I think that for n=1 and n=2, you can
> derive The One Way to Do It from broadcasting semantics, but for n>=3, I'm
> not sure there's much value in trying to systematize it to a single
> convention. I think that once you get up to those dimensions, you start to
> want to have domain-specific semantics. I do agree that, in retrospect,
> atleast_3d() probably should have been named more specifically. It was of a
> piece with other conveniences like dstack() that did special things to
> support channel-last images (and implicitly treat 3D arrays as such). For
> example, in DL frameworks that assemble channeled images into minibatches
> (with different conventions like BHWC and BCHW), you'd want the n=4
> behavior to do different things. I _think_ you'd just want to do those with
> different functions than a complicated set of arguments to one function.
>
> --
> Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Robert Kern
On Fri, Feb 12, 2021 at 3:42 PM Ralf Gommers  wrote:

>
> On Fri, Feb 12, 2021 at 9:21 PM Robert Kern  wrote:
>
>> On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers 
>> wrote:
>>
>>>
>>> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg <
>>> sebast...@sipsolutions.net> wrote:
>>>

>>>> Right, my initial feeling is that without such context `atleast_3d` is
>>>> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
>>>> that it is explicit about this context.

>>>
>>> Agreed. I think such a use case is probably too specific to design a
>>> single function for, at least in such a hardcoded way.
>>>
>>
>> That might be an argument for not designing a new one (or at least not
>> giving it such a name). Not sure it's a good argument for removing a
>> long-standing one.
>>
>
> I agree. I'm not sure deprecating is best. But introducing new
> functionality where `nd(pos=3) != 3d` is also not great.
>
> At the very least, atleast_3d should be better documented. It also is
> telling that Juan (a long-time scikit-image dev) doesn't like atleast_3d
> and there's very little usage of it in scikit-image.
>

I'm fairly neutral on atleast_nd(). I think that for n=1 and n=2, you can
derive The One Way to Do It from broadcasting semantics, but for n>=3, I'm
not sure there's much value in trying to systematize it to a single
convention. I think that once you get up to those dimensions, you start to
want to have domain-specific semantics. I do agree that, in retrospect,
atleast_3d() probably should have been named more specifically. It was of a
piece with other conveniences like dstack() that did special things to
support channel-last images (and implicitly treat 3D arrays as such). For
example, in DL frameworks that assemble channeled images into minibatches
(with different conventions like BHWC and BCHW), you'd want the n=4
behavior to do different things. I _think_ you'd just want to do those with
different functions than a complicated set of arguments to one function.
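
A small sketch of the "different functions" approach for the n=4 case. The helper names `to_bhwc` and `to_bchw` are hypothetical, and both assume a single channels-last (or grayscale) image as input:

```python
import numpy as np


def to_bhwc(img):
    """Hypothetical helper: view one channels-last image as a BHWC batch."""
    img = np.atleast_3d(img)        # (H, W) -> (H, W, 1); (H, W, C) unchanged
    return img[np.newaxis, ...]     # (1, H, W, C)


def to_bchw(img):
    """Hypothetical helper: same input, channels-first BCHW output."""
    return np.moveaxis(to_bhwc(img), -1, 1)   # (1, C, H, W)


rgb = np.zeros((4, 5, 3))
gray = np.zeros((4, 5))

print(to_bhwc(rgb).shape, to_bchw(rgb).shape)    # (1, 4, 5, 3) (1, 3, 4, 5)
print(to_bhwc(gray).shape, to_bchw(gray).shape)  # (1, 4, 5, 1) (1, 1, 4, 5)
```

Each function encodes one layout convention in its name, so there is no `pos`-style argument to work out at the call site.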

-- 
Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Ralf Gommers
On Fri, Feb 12, 2021 at 9:21 PM Robert Kern  wrote:

> On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers 
> wrote:
>
>>
>> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg <
>> sebast...@sipsolutions.net> wrote:
>>
>>> On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:
>>> > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
>>> > jfoxrabinov...@gmail.com> wrote:
>>> >
>>> > >
>>> > >
>>> > > On Fri, Feb 12, 2021, 09:32 Robert Kern 
>>> > > wrote:
>>> > >
>>> > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <
>>> > > > wieser.eric+nu...@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > > > > There might be some linear algebraic reason why those axis
>>> > > > > > positions
>>> > > > > make sense, but I’m not aware of it...
>>> > > > >
>>> > > > > My guess is that the historical motivation was to allow
>>> > > > > grayscale `(H,
>>> > > > > W)` images to be converted into `(H, W, 1)` images so that they
>>> > > > > can be
>>> > > > > broadcast against `(H, W, 3)` RGB images.
>>> > > > >
>>> > > >
>>> > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd
>>> > > > deprecate and remove the one existing function that *isn't* made
>>> > > > redundant
>>> > > > thereby.
>>> > > >
>>> > >
>>> > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
>>> > > argument lets you tell it where to put the new axes. What's
>>> > > unintuitive to
>>> > > me is that the 1D case gets promoted from shape `(x,)` to shape
>>> > > `(1, x,
>>> > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
>>> > >
>>> >
>>> > When thinking about channeled images, the channel axis is not of the
>>> > same
>>> > kind as the H and W axes. Really, you tend to want to think about an
>>> > RGB
>>> > image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
>>> > intensity values. As much as possible, you want to treat RGB images
>>> > similar
>>> > to (H, W)-shaped grayscale images. Let's say I want to make a
>>> > separable
>>> > filter to convolve with my image, that is, we have a 1D filter for
>>> > each of
>>> > the H and W axes, and they are repeated for each channel, if RGB.
>>> > Setting
>>> > up a separable filter for (H, W) grayscale is straightforward with
>>> > broadcasting semantics. I can use (ntaps,)-shaped vector for the W
>>> > axis and
>>> > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
>>> > case, I
>>> > want the same thing. atleast_3d() adapts those correctly for the (H,
>>> > W,
>>> > nchannels) case.
>>>
>>> Right, my initial feeling is that without such context `atleast_3d` is
>>> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
>>> that it is explicit about this context.
>>>
>>
>> Agreed. I think such a use case is probably too specific to design a
>> single function for, at least in such a hardcoded way.
>>
>
> That might be an argument for not designing a new one (or at least not
> giving it such a name). Not sure it's a good argument for removing a
> long-standing one.
>

I agree. I'm not sure deprecating is best. But introducing new
functionality where `nd(pos=3) != 3d` is also not great.

At the very least, atleast_3d should be better documented. It also is
telling that Juan (a long-time scikit-image dev) doesn't like atleast_3d
and there's very little usage of it in scikit-image.

Cheers,
Ralf


> Broadcasting is a very powerful convention that makes coding with arrays
> tolerable. It makes some choices (namely, prepending 1s to the shape) to
> make some common operations with mixed-dimension arrays work "by default".
> But it doesn't cover all of the desired operations conveniently.
> atleast_3d() bridges the gap to an important convention for a major
> use-case of arrays.
>
>> There's also "channels first" and "channels last" versions of RGB images
>> as 3-D arrays, and "channels first" is the default in most deep learning
>> frameworks - so the choice atleast_3d makes is a little outdated by now.
>>
>
> DL frameworks do not constitute the majority of image processing code,
> which has a very strong channels-last contingent. But nonetheless, the very
> popular Tensorflow defaults to channels-last.
>
> --
> Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Robert Kern
On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers  wrote:

>
> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg 
> wrote:
>
>> On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:
>> > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
>> > jfoxrabinov...@gmail.com> wrote:
>> >
>> > >
>> > >
>> > > On Fri, Feb 12, 2021, 09:32 Robert Kern 
>> > > wrote:
>> > >
>> > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <
>> > > > wieser.eric+nu...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > > There might be some linear algebraic reason why those axis
>> > > > > > positions
>> > > > > make sense, but I’m not aware of it...
>> > > > >
>> > > > > My guess is that the historical motivation was to allow
>> > > > > grayscale `(H,
>> > > > > W)` images to be converted into `(H, W, 1)` images so that they
>> > > > > can be
>> > > > > broadcast against `(H, W, 3)` RGB images.
>> > > > >
>> > > >
>> > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd
>> > > > deprecate and remove the one existing function that *isn't* made
>> > > > redundant
>> > > > thereby.
>> > > >
>> > >
>> > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
>> > > argument lets you tell it where to put the new axes. What's
>> > > unintuitive to
>> > > me is that the 1D case gets promoted from shape `(x,)` to shape
>> > > `(1, x,
>> > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
>> > >
>> >
>> > When thinking about channeled images, the channel axis is not of the
>> > same
>> > kind as the H and W axes. Really, you tend to want to think about an
>> > RGB
>> > image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
>> > intensity values. As much as possible, you want to treat RGB images
>> > similar
>> > to (H, W)-shaped grayscale images. Let's say I want to make a
>> > separable
>> > filter to convolve with my image, that is, we have a 1D filter for
>> > each of
>> > the H and W axes, and they are repeated for each channel, if RGB.
>> > Setting
>> > up a separable filter for (H, W) grayscale is straightforward with
>> > broadcasting semantics. I can use (ntaps,)-shaped vector for the W
>> > axis and
>> > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
>> > case, I
>> > want the same thing. atleast_3d() adapts those correctly for the (H,
>> > W,
>> > nchannels) case.
>>
>> Right, my initial feeling is that without such context `atleast_3d` is
>> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
>> that it is explicit about this context.
>>
>
> Agreed. I think such a use case is probably too specific to design a
> single function for, at least in such a hardcoded way.
>

That might be an argument for not designing a new one (or at least not
giving it such a name). Not sure it's a good argument for removing a
long-standing one.

Broadcasting is a very powerful convention that makes coding with arrays
tolerable. It makes some choices (namely, prepending 1s to the shape) to
make some common operations with mixed-dimension arrays work "by default".
But it doesn't cover all of the desired operations conveniently.
atleast_3d() bridges the gap to an important convention for a major
use-case of arrays.
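
A short illustration of that gap, using only existing NumPy functions (the array sizes are arbitrary):

```python
import numpy as np

# Shapes produced by the existing function.
print(np.atleast_3d(np.zeros(7)).shape)        # (1, 7, 1)
print(np.atleast_3d(np.zeros((4, 5))).shape)   # (4, 5, 1)

rgb = np.ones((4, 5, 3))
gray = np.full((4, 5), 0.5)

# Plain broadcasting prepends axes, so (4, 5) does not line up with (4, 5, 3).
try:
    rgb * gray
except ValueError as exc:
    print("broadcast failed:", exc)

# atleast_3d appends the channel axis instead, so the product works.
print((rgb * np.atleast_3d(gray)).shape)       # (4, 5, 3)

# Separable filter taps: a W-axis row and an H-axis column.
w_taps = np.ones(3)          # (ntaps,)
h_taps = np.ones((3, 1))     # (ntaps, 1)
print(np.atleast_3d(w_taps).shape, np.atleast_3d(h_taps).shape)  # (1, 3, 1) (3, 1, 1)
```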

> There's also "channels first" and "channels last" versions of RGB images as
> 3-D arrays, and "channels first" is the default in most deep learning
> frameworks - so the choice atleast_3d makes is a little outdated by now.
>

DL frameworks do not constitute the majority of image processing code,
which has a very strong channels-last contingent. But nonetheless, the very
popular Tensorflow defaults to channels-last.

-- 
Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Ralf Gommers
On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg 
wrote:

> On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:
> > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
> > jfoxrabinov...@gmail.com> wrote:
> >
> > >
> > >
> > > On Fri, Feb 12, 2021, 09:32 Robert Kern 
> > > wrote:
> > >
> > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <
> > > > wieser.eric+nu...@gmail.com>
> > > > wrote:
> > > >
> > > > > > There might be some linear algebraic reason why those axis
> > > > > > positions
> > > > > make sense, but I’m not aware of it...
> > > > >
> > > > > My guess is that the historical motivation was to allow
> > > > > grayscale `(H,
> > > > > W)` images to be converted into `(H, W, 1)` images so that they
> > > > > can be
> > > > > broadcast against `(H, W, 3)` RGB images.
> > > > >
> > > >
> > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd
> > > > deprecate and remove the one existing function that *isn't* made
> > > > redundant
> > > > thereby.
> > > >
> > >
> > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
> > > argument lets you tell it where to put the new axes. What's
> > > unintuitive to
> > > me is that the 1D case gets promoted from shape `(x,)` to shape
> > > `(1, x,
> > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
> > >
> >
> > When thinking about channeled images, the channel axis is not of the
> > same
> > kind as the H and W axes. Really, you tend to want to think about an
> > RGB
> > image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
> > intensity values. As much as possible, you want to treat RGB images
> > similar
> > to (H, W)-shaped grayscale images. Let's say I want to make a
> > separable
> > filter to convolve with my image, that is, we have a 1D filter for
> > each of
> > the H and W axes, and they are repeated for each channel, if RGB.
> > Setting
> > up a separable filter for (H, W) grayscale is straightforward with
> > broadcasting semantics. I can use (ntaps,)-shaped vector for the W
> > axis and
> > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
> > case, I
> > want the same thing. atleast_3d() adapts those correctly for the (H,
> > W,
> > nchannels) case.
>
> Right, my initial feeling is that without such context `atleast_3d` is
> pretty surprising.  So I wonder if we can design `atleast_nd` in a way
> that it is explicit about this context.
>

Agreed. I think such a use case is probably too specific to design a single
function for, at least in such a hardcoded way. There's also "channels
first" and "channels last" versions of RGB images as 3-D arrays, and
"channels first" is the default in most deep learning frameworks - so the
choice atleast_3d makes is a little outdated by now.

Cheers,
Ralf


> The `pos` argument is the current solution to this, but maybe there is a
> better way [2]?  Meshgrid for example defaults to `indexing='xy'` and
> has `indexing='ij'` for a similar purpose [1].
>
> Of course, if `atleast_3d` is common enough, I guess that argument
> could also swing to adding a keyword-only argument to `atleast_3d`
> (that way we can/will never change the default).
>
> - Sebastian
>
>
> [1] Not sure the purposes are comparable, but in both cases, they
> provide information about the "context" in which meshgrid/atleast_3d
> are used.
>
> [2] It feels a bit like you may have to think about what `pos=3` will
> actually do (in the sense, that we will all just end up doing trial and
> error :)). At which point I am not sure there is too much gained over
> the surprise of `atleast_3d`.
>


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Sebastian Berg
On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:
> On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
> jfoxrabinov...@gmail.com> wrote:
> 
> > 
> > 
> > On Fri, Feb 12, 2021, 09:32 Robert Kern 
> > wrote:
> > 
> > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <
> > > wieser.eric+nu...@gmail.com>
> > > wrote:
> > > 
> > > > > There might be some linear algebraic reason why those axis
> > > > > positions
> > > > make sense, but I’m not aware of it...
> > > > 
> > > > My guess is that the historical motivation was to allow
> > > > grayscale `(H,
> > > > W)` images to be converted into `(H, W, 1)` images so that they
> > > > can be
> > > > broadcast against `(H, W, 3)` RGB images.
> > > > 
> > > 
> > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd
> > > deprecate and remove the one existing function that *isn't* made
> > > redundant
> > > thereby.
> > > 
> > 
> > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
> > argument lets you tell it where to put the new axes. What's
> > unintuitive to
> > me is that the 1D case gets promoted from shape `(x,)` to shape
> > `(1, x,
> > 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
> > 
> 
> When thinking about channeled images, the channel axis is not of the
> same
> kind as the H and W axes. Really, you tend to want to think about an
> RGB
> image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
> intensity values. As much as possible, you want to treat RGB images
> similar
> to (H, W)-shaped grayscale images. Let's say I want to make a
> separable
> filter to convolve with my image, that is, we have a 1D filter for
> each of
> the H and W axes, and they are repeated for each channel, if RGB.
> Setting
> up a separable filter for (H, W) grayscale is straightforward with
> broadcasting semantics. I can use (ntaps,)-shaped vector for the W
> axis and
> (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
> case, I
> want the same thing. atleast_3d() adapts those correctly for the (H,
> W,
> nchannels) case.

Right, my initial feeling is that without such context `atleast_3d` is
pretty surprising.  So I wonder if we can design `atleast_nd` in a way
that it is explicit about this context.

The `pos` argument is the current solution to this, but maybe there is a
better way [2]?  Meshgrid for example defaults to `indexing='xy'` and
has `indexing='ij'` for a similar purpose [1].
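For reference, a small sketch of what the `indexing` keyword of `np.meshgrid` changes. The array sizes are arbitrary and chosen only so the two conventions give visibly different shapes:

```python
import numpy as np

x = np.arange(3)  # 3 points along "x"
y = np.arange(2)  # 2 points along "y"

# Default Cartesian convention: outputs have shape (len(y), len(x)).
xx_xy, yy_xy = np.meshgrid(x, y)

# Matrix convention: outputs have shape (len(x), len(y)).
xx_ij, yy_ij = np.meshgrid(x, y, indexing='ij')

print(xx_xy.shape)  # (2, 3)
print(xx_ij.shape)  # (3, 2)
```

The keyword names the context ("plotting" vs. "matrix indexing") rather than an axis number, which is the kind of explicitness being asked about for `atleast_nd`.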

Of course, if `atleast_3d` is common enough, I guess that argument
could also swing to adding a keyword-only argument to `atleast_3d`
(that way we can/will never change the default).

- Sebastian


[1] Not sure the purposes are comparable, but in both cases, they
provide information about the "context" in which meshgrid/atleast_3d
are used.

[2] It feels a bit like you may have to think about what `pos=3` will
actually do (in the sense, that we will all just end up doing trial and
error :)). At which point I am not sure there is too much gained over
the surprise of `atleast_3d`. 



Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Robert Kern
On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
jfoxrabinov...@gmail.com> wrote:

>
>
> On Fri, Feb 12, 2021, 09:32 Robert Kern  wrote:
>
>> On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser 
>> wrote:
>>
>>> > There might be some linear algebraic reason why those axis positions
>>> make sense, but I’m not aware of it...
>>>
>>> My guess is that the historical motivation was to allow grayscale `(H,
>>> W)` images to be converted into `(H, W, 1)` images so that they can be
>>> broadcast against `(H, W, 3)` RGB images.
>>>
>>
>> Correct. If you do introduce atleast_nd(), I'm not sure why you'd
>> deprecate and remove the one existing function that *isn't* made redundant
>> thereby.
>>
>
> `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
> argument lets you tell it where to put the new axes. What's unintuitive to
> me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x,
> 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
>

When thinking about channeled images, the channel axis is not of the same
kind as the H and W axes. Really, you tend to want to think about an RGB
image as a (H, W) array of colors rather than an (H, W, 3) ndarray of
intensity values. As much as possible, you want to treat RGB images similar
to (H, W)-shaped grayscale images. Let's say I want to make a separable
filter to convolve with my image, that is, we have a 1D filter for each of
the H and W axes, and they are repeated for each channel, if RGB. Setting
up a separable filter for (H, W) grayscale is straightforward with
broadcasting semantics. I can use (ntaps,)-shaped vector for the W axis and
(ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB case, I
want the same thing. atleast_3d() adapts those correctly for the (H, W,
nchannels) case.
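A minimal sketch of the shapes involved in the separable-filter case, assuming arbitrary image sizes and a made-up 3-tap filter (no actual convolution is performed, only the shape adaptation is shown):

```python
import numpy as np

H, W, ntaps = 5, 6, 3  # illustrative sizes only

w_filter = np.array([1.0, 2.0, 1.0])  # (ntaps,): acts along the W axis
h_filter = w_filter[:, np.newaxis]    # (ntaps, 1): acts along the H axis

# Grayscale case: plain broadcasting gives the 2D kernel.
kernel_2d = h_filter * w_filter       # (ntaps, ntaps)

# RGB case: atleast_3d appends a channel axis of length 1 to the 2D filter,
# and wraps the 1D filter as (1, ntaps, 1), so both line up with (H, W, C).
w3 = np.atleast_3d(w_filter)          # (1, ntaps, 1)
h3 = np.atleast_3d(h_filter)          # (ntaps, 1, 1)
kernel_3d = h3 * w3                   # (ntaps, ntaps, 1)

print(w3.shape, h3.shape, kernel_3d.shape)
```

The trailing unit axis is what lets the same kernel be repeated across channels by broadcasting.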

-- 
Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Joseph Fox-Rabinovitz
On Fri, Feb 12, 2021, 09:32 Robert Kern  wrote:

> On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser 
> wrote:
>
>> > There might be some linear algebraic reason why those axis positions
>> make sense, but I’m not aware of it...
>>
>> My guess is that the historical motivation was to allow grayscale `(H,
>> W)` images to be converted into `(H, W, 1)` images so that they can be
>> broadcast against `(H, W, 3)` RGB images.
>>
>
> Correct. If you do introduce atleast_nd(), I'm not sure why you'd
> deprecate and remove the one existing function that *isn't* made redundant
> thereby.
>

`atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
argument lets you tell it where to put the new axes. What's unintuitive to
me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x,
1)`. It takes two calls to `atleast_nd` to replicate that behavior.

One modification to `atleast_nd` I've thought about is making `pos` refer
to the position of the existing axes in the new array rather than the
position of the new axes, but that's likely not a useful way to go about it.
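To make the "two calls" point concrete, here is a rough stand-in for `atleast_nd`. This is a hypothetical sketch written for this example from the documented `pos` semantics, not the code in the PR, and NumPy itself has no such function:

```python
import numpy as np

def atleast_nd(ary, ndim, pos=0):
    """Hypothetical sketch: pad `ary` with unit axes at `pos` up to `ndim`."""
    ary = np.asanyarray(ary)
    missing = ndim - ary.ndim
    if missing <= 0:
        return ary  # already has enough dimensions
    if not -ary.ndim - 1 <= pos <= ary.ndim:
        raise ValueError("pos out of range for input dimensions")
    if pos < 0:
        pos = ary.ndim + 1 + pos  # pos=-1 means "after the last axis"
    new_shape = ary.shape[:pos] + (1,) * missing + ary.shape[pos:]
    return ary.reshape(new_shape)

x = np.ones(4)

# One call can only put all new axes in one place.
print(atleast_nd(x, 3, pos=0).shape)   # (1, 1, 4)
print(atleast_nd(x, 3, pos=-1).shape)  # (4, 1, 1)

# Matching atleast_3d for 1D input takes two calls.
two_step = atleast_nd(atleast_nd(x, 2, pos=0), 3, pos=-1)
print(two_step.shape)                  # (1, 4, 1)
print(np.atleast_3d(x).shape)          # (1, 4, 1)
```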

- Joe




Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Robert Kern
On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser 
wrote:

> > There might be some linear algebraic reason why those axis positions
> make sense, but I’m not aware of it...
>
> My guess is that the historical motivation was to allow grayscale `(H, W)`
> images to be converted into `(H, W, 1)` images so that they can be
> broadcast against `(H, W, 3)` RGB images.
>

Correct. If you do introduce atleast_nd(), I'm not sure why you'd deprecate
and remove the one existing function that *isn't* made redundant thereby.

-- 
Robert Kern


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Eric Wieser
> There might be some linear algebraic reason why those axis positions make
sense, but I’m not aware of it...

My guess is that the historical motivation was to allow grayscale `(H, W)`
images to be converted into `(H, W, 1)` images so that they can be
broadcast against `(H, W, 3)` RGB images.
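A quick illustration of that guess, with made-up image sizes and values:

```python
import numpy as np

gray = np.full((4, 5), 0.5)  # (H, W) grayscale image
rgb = np.ones((4, 5, 3))     # (H, W, 3) RGB image

# Directly combining them fails: trailing axes 5 and 3 do not match.
try:
    gray * rgb
    direct_ok = True
except ValueError:
    direct_ok = False

# atleast_3d appends the channel axis, so broadcasting works.
gray3 = np.atleast_3d(gray)  # (4, 5, 1)
blended = gray3 * rgb        # (4, 5, 3)

print(direct_ok, gray3.shape, blended.shape)
```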

Eric


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-12 Thread Ralf Gommers
On Fri, Feb 12, 2021 at 3:32 AM Juan Nunez-Iglesias 
wrote:

> both napari and scikit-image use atleast_ a few times. I don’t have many
> examples of where I used nd because it didn’t exist. But I have the very
> distinct impression of needing it repeatedly. In some places, I’ve used
> `np.broadcast_to` to signal the same intention, where `atleast_nd` would
> have been the more readable solution.
>
> I don’t buy the argument that it’s just a way to mask errors. NumPy
> broadcasting also has that same potential but I hope no one would seriously
> consider deprecating it. Indeed, even if we accept that we (library
> authors) should force users to provide an array of the right
> dimensionality, that still argues for making it convenient for users to do
> that!
>
> I don’t feel super strongly about this. But I think atleast_nd is a move
> in a positive direction and I’d prefer  it to what’s there now:
>
> In [1]: import numpy as np
> In [2]: np.atleast_3d(np.ones(4)).shape
> Out[2]: (1, 4, 1)
>
> There might be some linear algebraic reason why those axis positions make
> sense, but I’m not aware of it...
>

Yes that's pretty weird. I'm also not sure there's a reason.

If atleast_nd is not going to replicate this behavior, it would be good
to deprecate atleast_3d (perhaps a release or two after the introduction
of atleast_nd).

Not having `atleast_3d(x) == atleast_nd(x, pos=3)` is unnecessarily
confusing.

Ralf



Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Juan Nunez-Iglesias
both napari and scikit-image use atleast_ a few times. I don’t have many 
examples of where I used nd because it didn’t exist. But I have the very 
distinct impression of needing it repeatedly. In some places, I’ve used 
`np.broadcast_to` to signal the same intention, where `atleast_nd` would have 
been the more readable solution.

I don’t buy the argument that it’s just a way to mask errors. NumPy 
broadcasting also has that same potential but I hope no one would seriously 
consider deprecating it. Indeed, even if we accept that we (library authors) 
should force users to provide an array of the right dimensionality, that still 
argues for making it convenient for users to do that!

I don’t feel super strongly about this. But I think atleast_nd is a move in a 
positive direction and I’d prefer it to what’s there now:

In [1]: import numpy as np
In [2]: np.atleast_3d(np.ones(4)).shape
Out[2]: (1, 4, 1)

There might be some linear algebraic reason why those axis positions make 
sense, but I’m not aware of it...

Juan.


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Eric Wieser
I did a quick search of matplotlib, and found a few uses of all three
functions:

*
https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
  This one isn't really numpy at all, and is really just a shorthand for
normalizing an argument `x=n` to `x=[n, n]`
*
https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
   This one is the classic "either multivariate or single-variable data"
thing endemic to the SciPy ecosystem.
*
https://github.com/matplotlib/matplotlib/blob/1eef019109b64ee4085732544cb5e310e69451ab/lib/matplotlib/cbook/__init__.py#L1325-L1326
  Matplotlib has their own `_check_1d` function for input sanitization,
although github says it's only used to parse the arguments to `plot`, which
at this point are fairly established as being flexible.
*
https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
  This just looks like "defensive programming", and if the argument isn't
already 3d then something is probably wrong.

This isn't an exhaustive list, just a handful of different situations in
which the functions were used.

Eric




Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Benjamin Root
My original use case for these was dealing with output data from Matlab,
where users would call `squeeze()` quite liberally. In addition, there
was the problem of the implicit squeeze() in numpy's loadtxt(), for
which I added the ndmin kwarg in case an input CSV file had just one
row or no rows.

np.atleast_1d() is used in matplotlib in a bunch of places where inputs are
allowed to be scalar or lists.
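A small sketch of both points, using an in-memory one-row CSV made up for the example:

```python
import io
import numpy as np

one_row = "1,2,3\n"

# Without ndmin, a single-row file collapses to a 1D array.
squeezed = np.loadtxt(io.StringIO(one_row), delimiter=",")

# With ndmin=2 the row axis is kept.
kept = np.loadtxt(io.StringIO(one_row), delimiter=",", ndmin=2)

print(squeezed.shape)  # (3,)
print(kept.shape)      # (1, 3)

# atleast_1d lets one code path handle scalars and lists alike.
print(np.atleast_1d(5.0).shape)     # (1,)
print(np.atleast_1d([1, 2]).shape)  # (2,)
```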

On Thu, Feb 11, 2021 at 1:15 PM Stephan Hoyer  wrote:

> On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root 
> wrote:
>
>> for me, I find that the at_least{1,2,3}d functions are useful for
>> sanitizing inputs. Having an at_leastnd() function can be viewed as a step
>> towards cleaning up the API, not cluttering it (although, deprecations of
>> the existing functions probably should be long given how long they have
>> existed).
>>
>
> I would love to see examples of this -- perhaps in matplotlib?
>
> My thinking is that in most cases it's probably a better idea to keep the
> interface simpler, and raise an error for lower-dimensional arrays.
> Automatic conversion is convenient (and endemic within the SciPy
> ecosystem), but is also a common source of bugs.
>
> On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer  wrote:
>>
>>> On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias 
>>> wrote:
>>>
 I totally agree with the namespace clutter concern, but honestly, I
 would use `atleast_nd` with its `pos` argument (I might rename it to
 `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`,
 for which I had no idea where the new axes would end up.

 So, I’m in favour of including it, and optionally deprecating
 `atleast_{1,2,3}d`.


>>> I appreciate that `atleast_nd` feels more sensible than
>>> `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not
>>> recommend is a good enough reason for inclusion in NumPy. It needs to stand
>>> on its own.
>>>
>>> What would be the recommended use-cases for this new function?
>>> Have any libraries building on top of NumPy implemented a version of
>>> this?
>>>
>>>
 Juan.

 On 11 Feb 2021, at 9:48 am, Sebastian Berg 
 wrote:

 On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:

 I've created PR#18386 to add a function called atleast_nd to numpy and
 numpy.ma. This would generalize the existing atleast_1d, atleast_2d,
 and
 atleast_3d functions.

 I proposed a similar idea about four and a half years ago:
 https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html
 ,
 PR#7804. The reception was ambivalent, but a couple of folks have asked
 me
 about this, so I'm bringing it back.

 Some pros:

 - This closes issue #12336
 - There are a couple of Stack Overflow questions that would benefit
 - Been asked about this a couple of times
 - Implementation of three existing atleast_*d functions gets easier
 - Looks nicer than the equivalent broadcasting and reshaping

 Some cons:

 - Cluttering up the API
 - Maintenance burden (but not a big one)
 - This is just a utility function, which can be achieved through
 broadcasting and reshaping


 My main concern would be the namespace cluttering. I can't say I use
 even the `atleast_2d` etc. functions personally, so I would tend to be
 slightly against the addition. But if others land on the "useful" side here
 (and it seemed a bit at least on github), I am also not opposed.  It is a
 clean name that lines up with existing ones, so it doesn't seem like a big
 "mental load" with respect to namespace cluttering.

 Bike shedding the API is probably a good idea in any case.

 I have pasted the current PR documentation (as html) below for quick
 reference. I wonder a bit about the reasoning for having `pos` specify a
 value rather than just a side?



 numpy.atleast_nd(*ary*, *ndim*, *pos=0*)
 View input as array with at least ndim dimensions.
 New unit dimensions are inserted at the index given by *pos* if
 necessary.
 Parameters*ary  *array_like
 The input array. Non-array inputs are converted to arrays. Arrays that
 already have ndim or more dimensions are preserved.
 *ndim  *int
 The minimum number of dimensions required.
 *pos  *int, optional
 The index to insert the new dimensions. May range from -ary.ndim - 1 to
  +ary.ndim (inclusive). Non-negative indices indicate locations before
 the corresponding axis: pos=0 means to insert at the very beginning.
 Negative indices indicate locations after the corresponding axis:
 pos=-1 means to insert at the very end. 0 and -1 are always guaranteed
 to work. Any other number will depend on the dimensions of the existing
 array. Default is 0.
 Returns*res  *ndarray
 An array with res.ndim >= ndim. A view is returned for array 

Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Stephan Hoyer
On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root  wrote:

> for me, I find that the at_least{1,2,3}d functions are useful for
> sanitizing inputs. Having an at_leastnd() function can be viewed as a step
> towards cleaning up the API, not cluttering it (although, deprecations of
> the existing functions probably should be long given how long they have
> existed).
>

I would love to see examples of this -- perhaps in matplotlib?

My thinking is that in most cases it's probably a better idea to keep the
interface simpler, and raise an error for lower-dimensional arrays.
Automatic conversion is convenient (and endemic within the SciPy
ecosystem), but is also a common source of bugs.
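As a sketch of the stricter alternative, assuming a hypothetical helper name `require_ndim` (not a NumPy function) that a library might define for itself:

```python
import numpy as np

def require_ndim(arr, ndim, name="input"):
    """Hypothetical helper: convert to an array and insist on exactly `ndim` axes."""
    arr = np.asarray(arr)
    if arr.ndim != ndim:
        raise ValueError(
            f"{name} must be {ndim}-dimensional, got shape {arr.shape}"
        )
    return arr

image = require_ndim(np.ones((4, 5)), 2, name="image")  # passes through

try:
    require_ndim(np.ones(4), 2, name="image")  # 1D input is rejected
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```

The caller then has to add the missing axis explicitly (for example with `arr[np.newaxis, :]`), which makes the intended axis placement visible at the call site.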


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Eric Wieser
> I find that the atleast_{1,2,3}d functions are useful for sanitizing inputs

IMO, this type of "sanitization" goes against "In the face of ambiguity,
refuse the temptation to guess".
Instead of using `atleast_{n}d`, it could be argued that
`if np.ndim(x) != n: raise ValueError` is a safer bet: it forces the user
to think about what's actually going on, and saves them from silent
headaches.
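
A minimal sketch of that stricter pattern follows. The helper name `require_ndim` is hypothetical and not part of NumPy; it only wraps the one-line check quoted above.

```python
import numpy as np

def require_ndim(x, n):
    """Return x as an array, raising if it does not have exactly n dimensions."""
    x = np.asarray(x)
    if x.ndim != n:
        raise ValueError(
            f"expected {n} dimensions, got {x.ndim} (shape {x.shape})"
        )
    return x

# A proper 2-D input passes through unchanged.
ok = require_ndim([[1, 2], [3, 4]], 2)

# A 1-D input is rejected instead of being guessed into shape (1, 2).
try:
    require_ndim([1, 2], 2)
    rejected = False
except ValueError:
    rejected = True
```

The trade-off is that callers must reshape explicitly, which is exactly the decision the check is meant to force.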

Of course, this is just an argument for discouraging users from using these
functions, and for saying that we perhaps should not have had them in the
first place.
Given we already have some of them, adding `atleast_nd` probably isn't
going to make things any worse.
In principle, it could actually make things better, as we could put a
"Notes" section in the new function docs that describes the XY problem that
makes atleast_nd look like a better solution than it is and presents better
alternatives, and the other three function docs could link there.

Eric


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Joseph Fox-Rabinovitz
The original functions appear to have been written for things like the
*stack functions, which goes a long way towards explaining the inconsistent
argument list.
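
For readers who have not seen the connection, a short sketch using only existing NumPy calls. As far as I know, `vstack` and `dstack` promote their inputs with `atleast_2d` and `atleast_3d` respectively, which would explain why those helpers take several arrays at once.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# The existing helpers accept several arrays and return one result per input.
a2, b2 = np.atleast_2d(a, b)

# 1-D inputs are treated as rows when stacking vertically.
rows = np.vstack([a, b])

# 1-D inputs are promoted to shape (1, N, 1) before stacking along depth.
depth = np.dstack([a, b])

print(a2.shape, rows.shape, depth.shape)
```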

- Joe


On Thu, Feb 11, 2021, 12:41 Benjamin Root  wrote:

> For me, the atleast_{1,2,3}d functions are useful for sanitizing inputs.
> Having an atleast_nd() function can be viewed as a step towards cleaning
> up the API, not cluttering it (although any deprecation period for the
> existing functions should probably be long, given how long they have
> existed).

Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-11 Thread Benjamin Root
For me, the atleast_{1,2,3}d functions are useful for sanitizing inputs.
Having an atleast_nd() function can be viewed as a step towards cleaning
up the API, not cluttering it (although any deprecation period for the
existing functions should probably be long, given how long they have
existed).


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-10 Thread Stephan Hoyer
On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias 
wrote:

> I totally agree with the namespace clutter concern, but honestly, I would
> use `atleast_nd` with its `pos` argument (I might rename it to `position`,
> `axis`, or `axis_position`) any day over `atleast_{1,2,3}d`, for which I
> had no idea where the new axes would end up.
>
> So, I’m in favour of including it, and optionally deprecating
> `atleast_{1,2,3}d`.
>
>
I appreciate that `atleast_nd` feels more sensible than `atleast_{1,2,3}d`,
but I don't think being "better" than a pattern we would not recommend is a
good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function?
Have any libraries building on top of NumPy implemented a version of this?



Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-10 Thread Juan Nunez-Iglesias
I totally agree with the namespace clutter concern, but honestly, I would use
`atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`,
or `axis_position`) any day over `atleast_{1,2,3}d`, for which I had no idea
where the new axes would end up.
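
For reference, a quick sketch of where the existing functions put the new axes, using only documented NumPy behaviour. The variable names are illustrative.

```python
import numpy as np

v = np.ones(5)        # shape (5,)
m = np.ones((4, 5))   # shape (4, 5)

shapes = {
    # atleast_2d prepends: (N,) becomes (1, N).
    "atleast_2d(v)": np.atleast_2d(v).shape,
    # atleast_3d does both for 1-D input: (N,) becomes (1, N, 1).
    "atleast_3d(v)": np.atleast_3d(v).shape,
    # atleast_3d appends for 2-D input: (M, N) becomes (M, N, 1).
    "atleast_3d(m)": np.atleast_3d(m).shape,
}

for call, shape in shapes.items():
    print(call, shape)
```

The placement differs between the 2-D and 3-D helpers, which is the inconsistency an explicit `pos` argument would avoid.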

So, I’m in favour of including it, and optionally deprecating 
`atleast_{1,2,3}d`.

Juan.


Re: [Numpy-discussion] ENH: Proposal to add atleast_nd function

2021-02-10 Thread Sebastian Berg
On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
> I've created PR#18386 to add a function called atleast_nd to numpy and
> numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
> atleast_3d functions.
>
> I proposed a similar idea about four and a half years ago:
> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html,
> PR#7804. The reception was ambivalent, but a couple of folks have asked me
> about this, so I'm bringing it back.
>
> Some pros:
>
> - This closes issue #12336
> - There are a couple of Stack Overflow questions that would benefit
> - I have been asked about this a couple of times
> - Implementation of the three existing atleast_*d functions gets easier
> - Looks nicer than the equivalent broadcasting and reshaping
>
> Some cons:
>
> - Cluttering up the API
> - Maintenance burden (but not a big one)
> - This is just a utility function, which can be achieved through
>   broadcasting and reshaping
> 
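
For comparison, the "broadcasting and reshaping" alternatives mentioned in the last item look roughly like this. All calls are existing NumPy API; the target dimension `ndim = 4` is just an example value.

```python
import numpy as np

x = np.arange(3.0)   # shape (3,)
ndim = 4             # example target, chosen for illustration

# Indexing with np.newaxis prepends unit dimensions one at a time.
prepended = x[np.newaxis, np.newaxis, :]

# reshape can append unit dimensions by extending the shape tuple.
appended = x.reshape(x.shape + (1, 1))

# expand_dims accepts a tuple of axes in NumPy 1.18 and later.
expanded = np.expand_dims(x, axis=(0, 1))

# A generic "pad on the left up to ndim" written with reshape.
padded = x.reshape((1,) * max(ndim - x.ndim, 0) + x.shape)

print(prepended.shape, appended.shape, expanded.shape, padded.shape)
```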

My main concern would be the namespace cluttering. I can't say I use
even the `atleast_2d` etc. functions personally, so I would tend to be
slightly against the addition. But if others land on the "useful" side
here (and it seemed that way, at least a bit, on GitHub), I am also not
opposed. It is a clean name that lines up with existing ones, so it
doesn't seem like a big "mental load" with respect to namespace
cluttering.

Bike shedding the API is probably a good idea in any case.

I have pasted the current PR documentation (as html) below for quick
reference. I wonder a bit about the reasoning for having `pos` specify
a value rather than just a side?



numpy.atleast_nd(ary, ndim, pos=0)

View input as array with at least ndim dimensions.
New unit dimensions are inserted at the index given by pos if necessary.

Parameters
ary : array_like
    The input array. Non-array inputs are converted to arrays. Arrays that
    already have ndim or more dimensions are preserved.
ndim : int
    The minimum number of dimensions required.
pos : int, optional
    The index to insert the new dimensions. May range from -ary.ndim - 1
    to +ary.ndim (inclusive). Non-negative indices indicate locations
    before the corresponding axis: pos=0 means to insert at the very
    beginning. Negative indices indicate locations after the corresponding
    axis: pos=-1 means to insert at the very end. 0 and -1 are always
    guaranteed to work. Any other number will depend on the dimensions of
    the existing array. Default is 0.

Returns
res : ndarray
    An array with res.ndim >= ndim. A view is returned for array inputs.
    Dimensions are prepended if pos is 0, so for example, a 1-D array of
    shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions
    are appended if pos is -1, so for example a 2-D array of shape (M, N)
    becomes a view of shape (M, N, 1, 1) when ndim=4.
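
The behaviour described above can be sketched in a few lines. This is an illustration written from the documentation text, not the code from the PR; the name `atleast_nd_sketch` and the error message are assumptions.

```python
import numpy as np

def atleast_nd_sketch(ary, ndim, pos=0):
    """Illustrative stand-in for the proposed function, not the PR's code."""
    ary = np.asanyarray(ary)
    extra = ndim - ary.ndim
    if extra <= 0:
        # Already has enough dimensions: return unchanged.
        return ary
    if not -ary.ndim - 1 <= pos <= ary.ndim:
        raise ValueError(
            f"pos={pos} is out of range for an array with {ary.ndim} dimensions"
        )
    if pos < 0:
        # Negative positions count from the end: -1 maps to ary.ndim.
        pos += ary.ndim + 1
    # Splice the new unit dimensions into the existing shape at pos.
    shape = ary.shape[:pos] + (1,) * extra + ary.shape[pos:]
    return ary.reshape(shape)

x = np.arange(12.0).reshape(4, 3)
print(atleast_nd_sketch(x, 5).shape)
print(atleast_nd_sketch((1, 2), 5, pos=-1).shape)
```

Using `reshape` here typically yields a view of the input, which matches the "a view is returned" wording, though the sketch does not try to guarantee it for every memory layout.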



See also
atleast_1d, atleast_2d, atleast_3d


Notes
This function does not follow the convention of the
other atleast_*d functions in numpy in that it only accepts a single
array argument. To process multiple arrays, use a comprehension or loop
around the function call. See examples below.
Setting pos=0 is equivalent to how the array would be interpreted by
numpy’s broadcasting rules. There is no need to call this function for
simple broadcasting. This is also roughly (but not exactly) equivalent
to np.array(ary, copy=False, subok=True, ndmin=ndim).
It is easy to create functions for specific dimensions similar to the
other atleast_*d functions using Python’s functools.partial function.
An example is shown below.
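
As a rough illustration of the notes above using only functions that exist in NumPy today: `ndmin` prepends unit dimensions, and broadcasting does the same alignment implicitly. The `copy` argument is left at its default in this sketch because its exact meaning differs across NumPy versions, and the name `pad_to_4d` is made up.

```python
from functools import partial

import numpy as np

x = np.arange(6.0).reshape(2, 3)

# ndmin prepends unit dimensions, matching the documented pos=0 behaviour.
via_ndmin = np.array(x, ndmin=4)

# Broadcasting already aligns trailing axes, so no explicit padding is needed.
total = x + np.ones((5, 1, 2, 3))

# functools.partial fixes the dimension count, as the notes suggest.
pad_to_4d = partial(np.array, ndmin=4)
padded = pad_to_4d([1, 2, 3])

print(via_ndmin.shape, total.shape, padded.shape)
```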
Examples
>>> np.atleast_nd(3.0, 4)
array([[[[ 3.]]]])
>>> x = np.arange(3.0)
>>> np.atleast_nd(x, 2).shape
(1, 3)
>>> x = np.arange(12.0).reshape(4, 3)
>>> np.atleast_nd(x, 5).shape
(1, 1, 1, 4, 3)
>>> np.atleast_nd(x, 5).base is x.base
True
>>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
[array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>> np.atleast_nd((1, 2), 5, pos=0).shape
(1, 1, 1, 1, 2)
>>> np.atleast_nd((1, 2), 5, pos=-1).shape
(2, 1, 1, 1, 1)
>>> from functools import partial
>>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>> atleast_4d([1, 2, 3])
array([[[[1, 2, 3]]]])




___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion