On Fri, Nov 6, 2020 at 4:03 PM Zimmermann Klaus <klaus.zimmerm...@smhi.se> wrote:
> Hi, > > On 06/11/2020 15:58, Ralf Gommers wrote: > > On Fri, Nov 6, 2020 at 9:51 AM Zimmermann Klaus > > <klaus.zimmerm...@smhi.se <mailto:klaus.zimmerm...@smhi.se>> wrote: > > I have absolutely no problem keeping this out of the main namespace. > > > > In fact I'd like to point out that it was not my idea. Rather, it was > > proposed by Bas van Beek in the comments [1,2] and received a little > > more scrutiny from Eric Wieser in [3]. > > > > Thanks, between two PRs with that many comments, I couldn't figure that > > out - just saw the commit that make the change. > > Understandable, no worries. > > > On the subject matter, I am also curious about the potential for > > confusion. What other behavior could one expect from a sliding window > > view with this shape? > > > > As I said, I am completely fine with keeping this out of the main > > namespace, but I agree with Sebastian's comment, that > > `np.lib.stride_tricks` is perhaps not the best namespace. > > > > > > I agree that that's not a great namespace. There's multiple issues with > > namespaces, we basically have three good ones (fft, linalg, random) and > > a bunch of other ones that range from questionable to terrible. See > > > https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127 > > < > https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127 > > > > for details. > > > > This would be a good thing to work on - making the `numpy.lib` namespace > > not bleed into `numpy` via `import *` is one thing to do there, and > > there's many others. But given backwards compat constraints it's not > easy. > > I understand cleaning up all the namespaces is a giant task, so far, far > out of scope here. As said before, I also completely agree to keep it > out of the main namespace (though I will still argue below :P). > > I was just wondering if, of the top your head, an existing, better fit > comes to mind? > Not really. Outside of stride_tricks there's nothing that quite fits. This function is more in scope for something like scipy.signal. Cheers, Ralf > > The reason > > from my point of view is that stride tricks is really a technical > (and > > slightly ominous) name that might throw of more application oriented > > programmers from finding and using this function. Thinking of my > > scientist colleagues, I think those are exactly the kind of users > that > > could benefit from such a prototyping tool. > > > > > > That phrasing is one of a number of concerns. NumPy is normally not in > > the business of providing things that are okay as a prototyping tool, > > but are potentially extremely slow (as pointed out in the Notes section > > of the docstring). A function like that would basically not be the right > > tool for almost anything in, e.g., SciPy - it requires an iterative > > algorithm. In NumPy we don't prefer performance at all costs, but in > > general it's pretty decent rather than "Numba or Cython may gain you > > 100x here". > > I still think that the performance concern is a bit overblown. Yes, > application with large windows can need more FLOPs by an equally large > factor. But most such applications will use small to moderate windows. > Furthermore, this view focuses only on FLOPs. In my current field of > climate science (and many others), that is almost never the limiting > factor. Memory demands are far more problematic and incidentally, those > are more likely to increase in other methods that require the storage of > ancillary, temporary data. > > > Other issues include: > > 2) It is very specific to NumPy's memory model (as pointed out by you > > and Sebastian) - just like the rest of stride_tricks > Not wrong, but on the other hand, that memory model is not exotic. C, > Fortran, and any number of other languages play very nicely with this, > just as important downstream libraries like dask. > > > 3) It has "view" in the name, which doesn't quite make sense for the > > main namespace (also connected to point 2 above). > Ok. > > > 4) The cost of putting something in the main namespace for other > > array/tensor libraries is large. Maybe other libraries, e.g. CuPy, Dask, > > TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the > > main NumPy namespace as well as possible. This would trigger discussions > > and likely many person-weeks of work for others. > Agreed. Though I have to say that my whole motivation comes from > corresponding issues in dask that where specifically waiting for (the > older version of) this PR (see [1, 2,...]). But I understand that dask > is effectively much closer to the numpy memory model than, say, CuPy, so > don't take this to mean it should be in the main namespace. > > > 5) It's a useful function, but it's very much on the margins of NumPy's > > scope. It could easily have gone into, for example, scipy.signal. At > > this point the bar for functions going into the main namespace should > be> (and is) high. > I agree that the bar for the main namespace should be high! > > > All this taken together means it's not even a toss-up for me. If it were > > just one or two of these points, maybe. But given all the above, I'm > > pretty confident saying "it does not belong in the main namespace". > Again, I am happy with that. > > > Thanks for your thoughts and work! I really appreciate it! > > Cheers > Klaus > > [1] https://github.com/dask/dask/issues/4659 > [2] https://github.com/pydata/xarray/issues/3608 > [3] https://github.com/pandas-dev/pandas/issues/26959 > > > > > > > > Cheers > > Klaus > > > > > > > > [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618 > > <https://github.com/numpy/numpy/pull/17394#issuecomment-700998618> > > [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468 > > <https://github.com/numpy/numpy/pull/17394#discussion_r498215468> > > [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340 > > <https://github.com/numpy/numpy/pull/17394#discussion_r498724340> > > > > On 06/11/2020 01:39, Sebastian Berg wrote: > > > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: > > >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > > >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < > > >>> ralf.gomm...@gmail.com <mailto:ralf.gomm...@gmail.com>> > > >>> wrote: > > >>> > > >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > > >>>> sebast...@sipsolutions.net <mailto:sebast...@sipsolutions.net>> > > >>>> wrote: > > >>>> > > >>>>> Hi all, > > >>>>> > > >>>>> just a brief note that I merged this proposal: > > >>>>> > > >>>>> https://github.com/numpy/numpy/pull/17394 > > <https://github.com/numpy/numpy/pull/17394> > > >>>>> > > >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy. > > >>>>> > > >>>>> There was only one public API change, and that is that the > > >>>>> `shape` > > >>>>> argument is now called `window_shape`. > > >>>>> > > >>>>> This is still a good time for feedback in case you have a > > >>>>> better > > >>>>> idea > > >>>>> e.g. for the function or parameter names. > > >>>>> > > >>>> > > >>>> The old PR had this in the lib.stride_tricks namespace. Seeing > it > > >>>> in the > > >>>> main namespace is unexpected and likely will lead to > > >>>> issues/questions, > > >>>> given that such an overlapping view is going to do behave in > ways > > >>>> the > > >>>> average user will be surprised by. It may also lead to requests > > >>>> for > > >>>> other > > >>>> array/tensor libraries to implement this. I don't see any > > >>>> discussion on > > >>>> this in PR 17394, it looks like a decision by the PR author that > > >>>> no > > >>>> one > > >>>> commented on - reconsider that? > > >>>> > > >>>> Cheers, > > >>>> Ralf > > >>>> > > >>> > > >>> +1 let's keep this in the lib.stride_tricks namespace. > > >>> > > >> > > >> I have no reservations against having it in the main namespace > and am > > >> happy either way (it can still be exposed later in any case). It > is > > >> the > > >> conservative choice and maybe it is an uncommon enough function > that > > >> it > > >> deserves being a bit hidden... > > > > > > > > > In any case, its the safe bet for NumPy 1.20 at least so I opened > > a PR: > > > > > > https://github.com/numpy/numpy/pull/17720 > > <https://github.com/numpy/numpy/pull/17720> > > > > > > Name changes, etc. are also possible of course. > > > > > > I still think it might be nice to find a better place for this > type of > > > function that `np.lib.stride_tricks` though, but dunno... > > > > > > - Sebastian > > > > > > > > > > > >> > > >> But I am curious, it sounds like you have both very strong > > >> reservations, and I would like to understand them better. > > >> > > >> The behaviour can be surprising, but that is why the default is a > > >> read- > > >> only view. I do not think it is worse than `np.broadcast_to` in > this > > >> regard. (It is nowhere near as dangerous as `as_strided`.) > > >> > > >> It is true that it is specific to NumPy (memory model). So that is > > >> maybe a good enough reason right now. But I am not sure that > > >> stuffing > > >> things into a pretty hidden `np.lib.*` namespaces is a great long > > >> term > > >> solution either. There is very little useful functionality hidden > > >> away > > >> in `np.lib.*` currently. > > >> > > >> Cheers, > > >> > > >> Sebastian > > >> > > >>>> > > >>>> > > >>>>> Cheers, > > >>>>> > > >>>>> Sebastian > > >>>>> > > >>>>> > > >>>>> > > >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > > >>>>>> Hello, > > >>>>>> > > >>>>>> I would like to draw the attention of this list to PR #17394 > > >>>>>> [1] that > > >>>>>> adds the implementation of a sliding window view to numpy. > > >>>>>> > > >>>>>> Having a sliding window view in numpy is a longstanding open > > >>>>>> issue > > >>>>>> (cf > > >>>>>> #7753 [2] from 2016). A brief summary of the discussions > > >>>>>> surrounding > > >>>>>> it > > >>>>>> can be found in the description of the PR. > > >>>>>> > > >>>>>> This PR implements a sliding window view based on stride > > >>>>>> tricks. > > >>>>>> Following the discussion in issue #7753, a first > > >>>>>> implementation > > >>>>>> was > > >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion, > > >>>>>> that PR > > >>>>>> stalled and I picked up the issue in the present PR #17394. > > >>>>>> It > > >>>>>> is > > >>>>>> based > > >>>>>> on the first implementation, but follows the changed API as > > >>>>>> suggested > > >>>>>> by > > >>>>>> Eric Wieser. > > >>>>>> > > >>>>>> Code reviews have been provided by Bas van Beek, Stephen > > >>>>>> Hoyer, > > >>>>>> and > > >>>>>> Eric > > >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label. > > >>>>>> > > >>>>>> > > >>>>>> Do you think this is suitable for inclusion in numpy? > > >>>>>> > > >>>>>> Do you consider the PR ready? > > >>>>>> > > >>>>>> Do you have suggestions or requests? > > >>>>>> > > >>>>>> > > >>>>>> Thanks for your time and consideration! > > >>>>>> Klaus > > >>>>>> > > >>>>>> > > >>>>>> [1] https://github.com/numpy/numpy/pull/17394 > > <https://github.com/numpy/numpy/pull/17394> > > >>>>>> [2] https://github.com/numpy/numpy/issues/7753 > > <https://github.com/numpy/numpy/issues/7753> > > >>>>>> _______________________________________________ > > >>>>>> NumPy-Discussion mailing list > > >>>>>> NumPy-Discussion@python.org <mailto: > NumPy-Discussion@python.org> > > >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > >>>>>> > > >>>>> > > >>>>> _______________________________________________ > > >>>>> NumPy-Discussion mailing list > > >>>>> NumPy-Discussion@python.org <mailto: > NumPy-Discussion@python.org> > > >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > >>>>> > > >>>> _______________________________________________ > > >>>> NumPy-Discussion mailing list > > >>>> NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org > > > > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > >>>> > > >>> > > >>> _______________________________________________ > > >>> NumPy-Discussion mailing list > > >>> NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org> > > >>> https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > >> > > >> _______________________________________________ > > >> NumPy-Discussion mailing list > > >> NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org> > > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org> > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org> > > https://mail.python.org/mailman/listinfo/numpy-discussion > > <https://mail.python.org/mailman/listinfo/numpy-discussion> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion@python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion