[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-17 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > [Aside: despite what the Zen says, I think *protocols* are far more 
 > important to Python than *namespaces*.]

I think you misread the Zen. :-)

That-is-my-opinion-I-do-not-however-speak-for-its-author-ly y'rs,
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2YNQ72AENBVBTMQYDP4ZV5S4MG6VA32Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-16 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 21:35, Steven D'Aprano  wrote:
> 
> On Fri, May 15, 2020 at 05:44:59PM -0700, Andrew Barnert wrote:
> 
>> Once you go with a separate view-creating function (or type), do we even 
>> need the dunder?
> 
> Possibly not. But the beauty of a protocol is that it can work even if 
> the object doesn't define a `__view__` dunder.

Sure, but if there’s no good reason for any class to provide a __view__ dunder, 
it’s better not to call one.

Which is why I asked—in the message you’re replying to—a bunch of questions to 
try to determine whether there’s any reason for a class to want to provide an 
override. I’m not going to repeat the whole thing here; it’s all still in that 
same message you replied to.

> - If the object defines `__view__`, call it; this allows objects to 
> return an optimized view, if it makes sense to them; e.g. bytes 
> might simply return a memoryview.

Not if memoryview doesn’t have the right API, as we discussed earlier in this 
thread.

But more importantly, if it’s only builtins that will likely ever need an 
optimization, we can do that inside the functions. That’s exactly what we do in 
hundreds of places already. Even the one optimization that’s exposed as part of 
the public C API, PySequence_Fast, isn’t hookable, much less all the functions 
that fast-path directly on the array in list/tuple or on the split hash table 
in set/dict/dict_keys and so on. It seems to work well enough in practice, and 
it’s simpler, and faster for the builtins, and it means we don’t have hundreds 
of extra dunders (and type slots in CPython) that will almost never be used, 
and PyPy doesn’t need to write hooks that are actually pessimizations just 
because they’re optimizations in CPython, and so on.

Of course there might be a reason that doesn’t apply in this case (there 
obviously is a good reason for non-builtin types to optimize __contains__, for 
example), but “there might be” isn’t an answer to YAGNI. Especially if we can 
add the dunder later if someone later finds a need for it.

And honestly, I’m not sure even list and tuple are worth optimizing here. After 
all, you can’t do the index arithmetic and call to sq_item significantly faster 
than a generic C function; it only helps if you can avoid the call to sq_item, 
and I think we can’t do that in any of the most useful cases (at least not 
without patching up a whole lot more code than we want). But I’ll try it and 
see if I’m wrong.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JWBWCVKBBZMKGGMR6UQDP5ZII4NN6IWM/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-16 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 21:25, Steven D'Aprano  wrote:
> 
> On Fri, May 15, 2020 at 01:00:09PM -0700, Christopher Barker wrote:
> 
>> I know you winked there, but frankly, there isn't a clear most Pythonic API
>> here. Surely you don't think Python should have no methods?
> 
> That's not what I said. Of course Python should have methods -- it's an 
> OOP language after all, and it's pretty hard to have objects unless they 
> have behaviour (methods). Objects with no behaviour are just structs.
> 
> But seriously, and this time no winking, Python's design philosophy is 
> very different from that of Java and even Ruby and protocols are a 
> hugely important part of that. Python without protocols wouldn't be 
> Python, and it would be a much lesser language.
> 
> [Aside: despite what the Zen says, I think *protocols* are far more 
> important to Python than *namespaces*.]

I agree up to this point. But what you’re missing is that Python (even with 
stdlib stuff like pickle/copy and math.floor) has only a couple dozen 
protocols, and hundreds and hundreds of methods.

Some things should be protocols, but not everything should, or even close. Very 
few things should be protocols. More to the point, things should be protocols 
if and only if they have a specific reason to be a protocol. For example:

1. You need something more complicated than just a single straightforward call, 
like the fallback behavior for __contains__ and __iter__ with “old-style 
sequences”, or the whole pickle __getnewargs_ex__ and friends, or __add__ vs. 
__radd__.

2. Syntax, especially operator overloading, like __contains__ and __add__.

3. The function is so ubiquitously important that you don’t want anything else 
using the same name for different meanings, like __len__.

(There are probably other good reasons.)

When you have a reason like this, you should design a protocol. But when you 
don’t, dot syntax is the default. And it’s not just complexity, or “too many 
builtins” (after all, pickle.dump and math.ceil aren’t builtins). It’s that dot 
syntax gives you built-in disambiguation that function call syntax doesn’t. If 
I have a sequence, xs.index(x) has an obvious meaning. But index(xs, x) would 
not, because it means too many different things (in fact, we already have an 
__index__ protocol that does one of those different things), and it’s not like 
len where one of those meanings is so fundamental that we actually want to 
discourage all the others.
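
To make the disambiguation point concrete, here is a minimal sketch. The
Offset class is hypothetical and exists only for this illustration;
list.index and operator.index are the real names involved.

```python
import operator

xs = ["a", "b", "c"]

# Method call: "where does this value sit in the sequence?"
position = xs.index("b")


class Offset:
    """Hypothetical integer-like type, used only for this illustration."""

    def __init__(self, n):
        self.n = n

    def __index__(self):
        # The __index__ protocol: "losslessly convert me to an int".
        return self.n


# Function call backed by the __index__ protocol: an unrelated meaning of "index".
as_int = operator.index(Offset(2))

# Because Offset implements __index__, it also works as a subscript.
item = xs[Offset(2)]
```

The method form carries its receiver with it, so a reader knows which
"index" is meant; a bare index(xs, x) would have to pick one of the two.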

As I said elsewhere, I think we probably can’t have dot syntax in this case for 
other reasons. But that _still_ doesn’t necessarily mean we need a protocol. If 
we need to be able to override behavior but we can’t have dot syntax, *that* 
might be a good reason for a protocol, but either of those on its own is not a 
good reason, only the combination.

It’s worth comparing C++, where “free functions are part of a class’s 
interface”. They don’t spell their protocols with underscores, or call them 
protocols, but the idea is all over the place. x+y tries x.operator+(y) plus 
various fallbacks. The way you get an iterator is begin(xs) which by default 
calls xs.begin() so that’s the standard place to customize it but there are 
fallbacks. Converting a C to a D tries (among other things) both C::operator 
D() and D::D(C). And so on. But, unlike Python, they don’t try to distinguish 
what is and isn’t a protocol; the dogma is basically that everything should be 
a protocol if it possibly can be. Which doesn’t work. They keep trying to solve 
the compiler-ambiguity problem by adding features like argument-dependent 
lookup, and almost adding D’s uniform call syntax every 3 years, but none of 
that will ever solve the human-ambiguity problem. Things like + and begin and 
swap belong at the top level because they should always mean the same thing 
even if they have to be implemented differently, but things like draw should be 
methods because they mean totally different things on different types, and even 
if the compiler can tell which one is meant, even if an IDE can help you, 
deck.draw(5) vs. shape.draw(ctx) is still more readable than 
draw(deck, 5) vs. draw(shape, ctx). Ultimately, it’s just as bad as Java; it 
just goes too far in the opposite direction, which is still too far, and that’s 
what always happens when you’re looking for a perfect and simple dogma that 
applies to both iter and index so you never have to think about design.

> Python tends to use protocol-based top-level functions:
> 
>   len, int, str, repr, bool, iter, list
> 
> etc are all based on *protocols*, not inheritance.

> The most notable 
> counter-example to that was `iterator.next` which turned out to be a 
> mistake and was changed in Python 3 to become a protocol based on a 
> dunder.

No, the most notable counter examples are things like insert, extend, index, 
count, etc. on sequences; keys, items, update, setdefault, etc. on mappings; 
add, isdisjoint, etc. on sets; real, imag, etc. on numbers; 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Steven D'Aprano
On Fri, May 15, 2020 at 05:44:59PM -0700, Andrew Barnert wrote:

> Once you go with a separate view-creating function (or type), do we even need 
> the dunder?

Possibly not. But the beauty of a protocol is that it can work even if 
the object doesn't define a `__view__` dunder.

- If the object defines `__view__`, call it; this allows objects to 
  return an optimized view, if it makes sense to them; e.g. bytes 
  might simply return a memoryview.

- If not, fall back on a generic view object that just does index 
  arithmetic and delegation.
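
The two-step lookup above could be sketched roughly as follows. This is
only an illustration under stated assumptions: `__view__` is the dunder
being proposed in this thread, not an existing protocol, and the names
view, GenericView and Fancy are hypothetical.

```python
class GenericView:
    """Fallback view: index arithmetic plus delegation to the underlying sequence."""

    def __init__(self, seq, start, stop):
        # Assumes 0 <= start <= stop <= len(seq); steps and negative
        # indices are left out to keep the sketch short.
        self.seq, self.start, self.stop = seq, start, stop

    def __len__(self):
        return self.stop - self.start

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.seq[self.start + i]


def view(obj, start, stop):
    """Hypothetical top-level function for the proposed protocol."""
    # Look the dunder up on the type, as the interpreter does for real dunders.
    hook = getattr(type(obj), "__view__", None)
    if hook is not None:
        return hook(obj, start, stop)
    # No dunder defined: fall back on the generic view.
    return GenericView(obj, start, stop)


class Fancy:
    """Hypothetical class that opts in and returns its own kind of view."""

    def __view__(self, start, stop):
        return ("optimized view", start, stop)


generic = view(list(range(10)), 2, 5)   # a list defines no __view__
custom = view(Fancy(), 1, 2)            # Fancy supplies its own
```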



-- 
Steven
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4R5CRI2MJP3R4PZPDRTDZFYOCDPSP3I2/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Steven D'Aprano
On Fri, May 15, 2020 at 01:00:09PM -0700, Christopher Barker wrote:

> I know you winked there, but frankly, there isn't a clear most Pythonic API
> here. Surely you don't think Python should have no methods?

That's not what I said. Of course Python should have methods -- it's an 
OOP language after all, and it's pretty hard to have objects unless they 
have behaviour (methods). Objects with no behaviour are just structs.

But seriously, and this time no winking, Python's design philosophy is 
very different from that of Java and even Ruby and protocols are a 
hugely important part of that. Python without protocols wouldn't be 
Python, and it would be a much lesser language.

[Aside: despite what the Zen says, I think *protocols* are far more 
important to Python than *namespaces*.]

Python tends to have shallow inheritance hierarchies; Java has deep 
ones. Likewise Ruby tends to have related classes inherit from 
generic superclasses that provide default implementations.

If we were like Ruby, there would be no problem: we'd just add a view 
method to something like object.Collections.Sequence and instantly all 
lists, tuples, range objects, strings, bytes, bytearrays etc would have 
that method. But we're not. In practice, each type would have to 
implement its own view method.

Python tends to use protocol-based top-level functions:

len, int, str, repr, bool, iter, list

etc are all based on *protocols*, not inheritance. The most notable 
counter-example to that was `iterator.next` which turned out to be a 
mistake and was changed in Python 3 to become a protocol based on a 
dunder.

That's not to say that methods aren't sometimes appropriate, or that 
there may not be grey areas where we could go either way. But in 
general, the use of protocols is such a notable part of Python, and 
so unusual in other OOP languages, that it trips up newcomers often 
enough that there is a FAQ about it:

https://docs.python.org/3/faq/design.html#why-does-python-use-methods-for-some-functionality-e-g-list-index-but-functions-for-other-e-g-len-list

although the answer is woefully incomplete. See here for a longer 
version:

http://effbot.org/pyfaq/why-does-python-use-methods-for-some-functionality-e-g-list-index-but-functions-for-other-e-g-len-list.htm

There is a *lot* of hate for Python's use of protocols, especially among 
people who have drunk the "not real object oriented" Koolaid, e.g. see 
comments here:

https://stackoverflow.com/questions/237128/why-does-python-code-use-len-function-instead-of-a-length-method

where this is described as "moronic". Let me be absolutely clear here: 
the use of protocols, as Python does, is a *brilliant* design, not a 
flaw, and in my opinion the haters are falling into the Blub trap:

http://paulgraham.com/avg.html

Using protocols looks moronic to them because they haven't seen how they 
add more power to the language and the coder. All they see are the ugly 
underscores. Why write a `__len__` method instead of a `len` method? 
There's no difference except four extra characters.

That's some real Blub thinking right there.

Unfortunately, len() hardly takes advantage of the possibilities of 
protocols, so it's an *obvious* example but not a *good* example. Here's 
a better example:

py> class NoContains:
...     def __getitem__(self, idx):
...         if idx < 10:
...             return 1000+idx
...         raise IndexError
...
py> 1005 in NoContains()
True

I wrote a class that doesn't define or inherit a `__contains__` method, 
but I got support for the `in` operator for free just by supporting 
subscripting. If you don't understand protocols, this is just weird. But 
that's your loss, not a design flaw.
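
The same fallback reaches further than `in`. As a small sketch reusing
the class above: iteration, and anything built on iteration, also works
from `__getitem__` alone, because iter() tries indexes 0, 1, 2, ... until
IndexError is raised.

```python
class NoContains:
    def __getitem__(self, idx):
        if idx < 10:
            return 1000 + idx
        raise IndexError


nc = NoContains()

items = list(nc)       # iter() falls back to __getitem__ with 0, 1, 2, ...
found = 1005 in nc     # `in` falls back to that same iteration
missing = 5 in nc      # iterates to exhaustion, then gives False
total = sum(nc)        # any consumer of iterables works too
```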

Another good example is `next()`. When I write an iterator class, I can 
supply a `__next__` dunder. All it needs to do is provide the next 
value.

I've never needed to add support for default values in a `__next__` 
method, because the builtin `next()` handles it for me:

_SENTINEL = object()
try:
    ...
except StopIteration:
    if default is not _SENTINEL:
        return default
    raise

I get support for default values for free, thanks to the use of a 
protocol. If this were Python 2, with a `next` method, I'd have needed 
to write those six lines a couple of hundred times so far in my life, 
plus tests, plus documentation. Multiply that by tens of thousands of 
Python coders.
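
Seen from the iterator author's side, a minimal sketch looks like this.
Countdown is a hypothetical class used only for illustration; it knows
nothing about defaults, and the builtin supplies them.

```python
class Countdown:
    """Hypothetical iterator: yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # No default handling here: just produce a value or signal exhaustion.
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


it = Countdown(2)
first = next(it)
second = next(it)
fallback = next(it, "done")   # exhausted, so the builtin returns the default
```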

Some day, if the next() builtin grows new functionality to change the 
exception raised:

next(iterator, raise_instead=ValueError)

not one single iterator class out of a million in the world will need to 
change a single line of code in order to get the new functionality.

This is amazingly powerful stuff when handled properly, and len() is 
perhaps the most boring and trivial example of it.

I'm going to be provocative: if (generic) you are not blown away by 
the possibilities of protocols, you don't understand them.




[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 18:21, Christopher Barker  wrote:
> 
> Hmm, more thought needed.

Speaking of “more thought needed”, I took a first pass over cleaning up my 
quick slice view class and adding the slicer class, and found some 
bikesheddable options. I think in most cases the best answer is obvious, but 
I’ve been wrong before. :)

Assume s and t are Sequences of the same type, u is a Sequence of a different 
type, and vs, vt, and vu are view slices on those sequences. Also assume that 
we called the view slicer type vslice, and the view slice type SliceView, 
although obviously those are up for bikeshedding.

When s==t is allowed, is vs==vt? What about vs==t? Same for <, etc.? I think 
yes, yes, yes.

When s is hashable, is vs hashable? If so, is it the same hash an equivalent 
copy-slice would have? The answer to == constrains the answer here, of course. 
I think they can just not be hashable, but it’s a bit weird to have an 
immutable builtin sequence that isn’t. (Maybe hash could be left out but then 
added in a future version if there’s a need?)

When s+t is allowed, is vs+t? vs+vt? (Similarly when s+u is allowed, but that 
usually isn’t.) vs*3? I think all yes, but I’m not sure. (Imagine you create a 
million view slices but filter them down to just 2, and then concatenate those 
two. That makes sense, I think.)

Should there be a way to ask vs for the corresponding regular copy slice? Like 
vslice(s)[10:].strictify() == s[10:]? I’m not sure what it’s good for, but 
either __hash__ or __add__ seems to imply a private method for this, and then I 
can’t see any reason to prevent people from calling it. (Except that I can’t 
think of a good name.)

Should the underlying sequence be a public attribute? It seems easy and 
harmless and potentially useful, and memoryview has .obj (although dict views 
don’t have a public reference to the dict).

What about the original slice object? This seems less useful, since you don’t 
pass around slice objects that often. And we may not actually be storing it. 
(The simplest solution is to store slice.indices(len(seq)) instead of slice.) 
So I think no.
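
For reference, this is roughly what storing slice.indices(len(seq))
buys: the slice is resolved once against a concrete length, and the
resulting triple can be fed to range() to do the index arithmetic. A
small sketch using only the builtin slice and range types:

```python
seq = list(range(15))

# slice.indices(length) resolves None and out-of-range bounds against a length.
start, stop, step = slice(10, None).indices(len(seq))
clamped = slice(0, 100).indices(len(seq))
reverse = slice(None, None, -2).indices(len(seq))

# The triple drops straight into range(), which then maps view positions
# to positions in the underlying sequence.
view_indices = range(start, stop, step)
viewed = [seq[i] for i in view_indices]
```

One consequence worth keeping in mind: the stored triple is tied to the
length the sequence had when the view was created.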

If s isn’t a Sequence, should vslice(s) be a TypeError? I think we want the C 
API sequence check, but not the full ABC check.

What does vslice(s)[1] do? I think TypeError('not a slice').

Does the vslice type need any other methods besides __new__ and __getitem__? I 
don’t think so. The only use for vslice(s) besides slicing it is stashing it to 
be sliced later, just like the only use for a method besides calling it is 
stashing it to be called later. But it should have the sequence as a public 
attribute for debugging/introspection, just like methods make their self and 
function attributes public. 

Is the SliceView type public? (Only in types?) Or is “what the vslice slicer 
factory creates” an implementation detail, like list_iterator? I think the latter.

What’s the repr for a SliceView? Something like vslice([1, 2, 10, 20])[::2] 
seems most useful, since that’s the way you construct it, even if it is a bit 
unusual. Although a tiny slice of a giant sequence would then have a giant repr.

What’s the str? I think same as the repr, but will people expect a view of a 
list/tuple/etc. to look “nice” like list/tuple/etc. do?

Does vs[:] return self? (And, presumably, vs[0:len(s)+100] and so on.) I think 
so, but that doesn’t need to be guaranteed (just like tuple, range, etc.).

If vs is an instance of a subclass of SliceView, is vs[10:20] a SliceView, or 
an instance of the subclass? I think the base class, just like tuple, etc.
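
One way the answers above could fit together, as a minimal sketch and
not the class posted earlier in the thread. The names vslice and
SliceView follow this message; everything else (storing a range,
element-wise ==, leaving the view unhashable) is an assumption chosen to
keep the example short.

```python
class SliceView:
    """Sketch of the view-slice type: index arithmetic plus delegation."""

    def __init__(self, seq, indices):
        self.seq = seq           # public, in the spirit of memoryview.obj
        self._range = indices    # a range built from slice.indices(len(seq))

    def __len__(self):
        return len(self._range)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing a range composes the index arithmetic for us.
            # Always returns the base class, even for subclasses.
            return SliceView(self.seq, self._range[key])
        return self.seq[self._range[key]]

    def __iter__(self):
        return (self.seq[i] for i in self._range)

    def __eq__(self, other):
        # One possible answer to "is vs == t?": compare element-wise.
        try:
            return len(self) == len(other) and all(
                a == b for a, b in zip(self, other)
            )
        except TypeError:
            return NotImplemented

    __hash__ = None  # the "just not hashable" option


class vslice:
    """Sketch of the slicer: its only job is to be sliced."""

    def __init__(self, seq):
        self.seq = seq           # public, for debugging and introspection

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("not a slice")
        return SliceView(self.seq, range(*key.indices(len(self.seq))))


s = list(range(100))
vs = vslice(s)[10:20]    # no copy is made
sub = vs[2:5]            # a view of a view, still backed by s
```

Note that the element-wise == here would also make a view of a list
equal to a tuple with the same items, which a real design would
probably want to rule out.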

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/X45QVVPMB5JOQDKI7OEV4JAQ7WMA4XHO/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 18:21, Christopher Barker  wrote:
> 
>> On Fri, May 15, 2020 at 5:45 PM Andrew Barnert  wrote:
> 
>> On May 15, 2020, at 13:03, Christopher Barker  wrote:
>> > 
>> > Taking all that into account, if we want to add "something" to Sequence 
>> > behavior (in this case a sequence_view object), then adding a dunder is 
>> > really the only option -- you'd need a really compelling reason to add a 
>> > Sequence method, and since there are quite a few folks that think that's 
>> > the wrong approach anyway, we don't have a compelling reason.
>> > 
>> > So IF a sequence_view is to be added, then a dunder is really the only 
>> > option.
>> 
>> Once you go with a separate view-creating function (or type), do we even 
>> need the dunder?
> 
> Indeed -- maybe not. We'd need a dunder if we wanted to make it an "official" 
> part of the Sequence protocol/ABC, but as you point out there may be no need 
> to do that at all.

That’s actually what triggered this thought. We need collections.abc.Sequence 
to support the dunder with a default implementation so code using it as a mixin 
works. What would that default implementation be? Basically just a class whose 
__getitem__ constructs the thing I posted earlier and that does nothing else. 
And why would anyone want to override that default?

Being able to override dunders like __contains__ and regular methods like count is 
useful for multiple reasons: a string-like class needs to extend their behavior 
for substring searching, a range-like class can implement them without 
searching at all, etc. But none of those seemed to apply to overriding 
__viewslice__ (or whatever we’d call it).
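
As an illustration of the range-like case, here is a minimal sketch.
Evens is a hypothetical class invented for this example; the point is
only that __contains__ and count can be answered by arithmetic, which
is the kind of payoff that makes an override worth having.

```python
class Evens:
    """Hypothetical range-like sequence of the even numbers in [0, stop)."""

    def __init__(self, stop):
        self.stop = stop

    def __len__(self):
        return (self.stop + 1) // 2

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return 2 * i

    def __contains__(self, x):
        # Arithmetic instead of the default linear search.
        return isinstance(x, int) and 0 <= x < self.stop and x % 2 == 0

    def count(self, x):
        # Every member occurs exactly once, so no scan is needed.
        return 1 if x in self else 0


big = Evens(10**12)
hit = 10**11 in big     # answered without iterating
miss = 7 in big
```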

> Hmm, more thought needed.

Yeah, certainly just because I couldn’t think of a use doesn’t mean there isn’t 
one.

But if I’m right that the dunder could be retrofitted in later (I want to try 
building an implementation without the dunder and then retrofitting one in 
along with a class that overrides it, if I get the time this weekend, to verify 
that it really isn’t a problem), that seems like a much better case for leaving 
it out.

Another point: now that we’re thinking generic function (albeit maybe a C 
builtin with fast-path code for list/tuple), maybe it’s worth putting an 
implementation on PyPI as soon as possible, so we can get some experience using 
it and make sure the design doesn’t have any unexpected holes and, if we’re 
lucky, get some uptake from people outside this thread.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VSGQLYF6B25BB6KLZALMYST7IQWMVI3I/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Christopher Barker
On Fri, May 15, 2020 at 5:45 PM Andrew Barnert  wrote:

> On May 15, 2020, at 13:03, Christopher Barker  wrote:
> >
> > Taking all that into account, if we want to add "something" to Sequence
> behavior (in this case a sequence_view object), then adding a dunder is
> really the only option -- you'd need a really compelling reason to add a
> Sequence method, and since there are quite a few folks that think that's
> the wrong approach anyway, we don't have a compelling reason.
> >
> > So IF a sequence_view is to be added, then a dunder is really the only
> option.
>
> Once you go with a separate view-creating function (or type), do we even
> need the dunder?
>

Indeed -- maybe not. We'd need a dunder if we wanted to make it an
"official" part of the Sequence protocol/ABC, but as you point out there
may be no need to do that at all.

Hmm, more thought needed.

-CHB


>
> I’m pretty sure a generic slice-view-wrapper (that just does index
> arithmetic and delegates) will work correctly on every sequence type. I
> won’t promise that the one I posted early in this thread does, of course,
> and obviously we need a bit more proof than “I’m pretty sure…”, but can
> anyone think of a way a Sequence could legally work that would break this?
>
> And I can’t think of any custom features a Sequence might want to add to its
> view slices (or its view-slice-making wrapper).
>
> I can definitely see how a custom wrapper for list and tuple could be
> faster, and imagine how real life code could use it often enough that this
> matters. But if it’s just list and tuple, CPython’s already full of
> builtins that fast-path on list and tuple, and there’s no reason this one
> can’t do the same thing.
>
> So, it seems like it only needs a dunder if there are likely to be
> third-party classes that can do view-slicing significantly faster than a
> generic view-slicer, and are used in code where it’s likely to matter. Can
> anyone think of such a case? (At first numpy seems like an obvious answer.
> Arrays aren’t Sequences, but I think as long as the wrapper doesn’t
> actually type-check that at __new__ time they’d work anyway. But why would
> anyone, especially when they care about speed, use a generic viewslice
> function on a numpy array instead of just using numpy’s own view slicing?)
>
> It seems like a dunder is something that could be added as a refinement in
> the next Python version, if it turns out to be needed. If so, then, unless
> we have an example in advance to disprove the YAGNI presumption, why not
> just do it without the dunder?
>
>
>

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6OZYSHPJOAJWDHQAR5VITAP5KEPSVUKF/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 13:03, Christopher Barker  wrote:
> 
> Taking all that into account, if we want to add "something" to Sequence 
> behavior (in this case a sequence_view object), then adding a dunder is 
> really the only option -- you'd need a really compelling reason to add a 
> Sequence method, and since there are quite a few folks that think that's the 
> wrong approach anyway, we don't have a compelling reason.
> 
> So IF a sequence_view is to be added, then a dunder is really the only option.

Once you go with a separate view-creating function (or type), do we even need 
the dunder?

I’m pretty sure a generic slice-view-wrapper (that just does index arithmetic 
and delegates) will work correctly on every sequence type. I won’t promise that 
the one I posted early in this thread does, of course, and obviously we need a 
bit more proof than “I’m pretty sure…”, but can anyone think of a way a 
Sequence could legally work that would break this?

And I can’t think of any custom features a Sequence might want to add to its view 
slices (or its view-slice-making wrapper).

I can definitely see how a custom wrapper for list and tuple could be faster, 
and imagine how real life code could use it often enough that this matters. But 
if it’s just list and tuple, CPython’s already full of builtins that fast-path 
on list and tuple, and there’s no reason this one can’t do the same thing.

So, it seems like it only needs a dunder if there are likely to be third-party 
classes that can do view-slicing significantly faster than a generic 
view-slicer, and are used in code where it’s likely to matter. Can anyone think 
of such a case? (At first numpy seems like an obvious answer. Arrays aren’t 
Sequences, but I think as long as the wrapper doesn’t actually type-check that 
at __new__ time they’d work anyway. But why would anyone, especially when they 
care about speed, use a generic viewslice function on a numpy array instead of 
just using numpy’s own view slicing?)

It seems like a dunder is something that could be added as a refinement in the 
next Python version, if it turns out to be needed. If so, then, unless we have 
an example in advance to disprove the YAGNI presumption, why not just do it 
without the dunder?

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G3L6NP4PWPR2O2VSVXGGJNALYECKDG5G/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Andrew Barnert via Python-ideas
On May 15, 2020, at 03:50, Steven D'Aprano  wrote:
> 
> On Thu, May 14, 2020 at 09:47:36AM -0700, Andrew Barnert wrote:
>>> On May 14, 2020, at 03:01, Steven D'Aprano  wrote:
>>> 
>> Which is exactly why Christopher said from the start of this thread, 
>> and everyone else has agreed at every step of the way, that we can’t 
>> change the default behavior of slicing, we have to instead add some 
>> new way to specifically ask for something different.
> 
> Which is why I was so surprised that you suddenly started talking about 
> not being able to insert into a slice of a list rather than a view.

We’re talking about slice views. The sentence you quoted and responded to was 
about the difference between a slice view from a list and a slice view from a 
string. A slice view from a list may or may not be the same type as a slice 
view from a tuple (I don’t think there’s a reason to care whether they are or 
not), but either way, it being immutable will, I think, not surprise anyone. By 
contrast, a slice view from a string being not stringy _might_ surprise someone.

>> Not only that, but whatever gives 
>> you view-slicing must look sufficiently different that you notice the 
>> difference—and ideally that gives you something you can look up if you 
>> don’t know what it means. I think lst.view[10:20] fits that bill.
> 
> Have we forgotten how to look at prior art all of a sudden? Suddenly 
> been possessed by the spirits of deceased Java and Ruby programmers 
> intent on changing the look and feel of Python to make it "real object 
> oriented"? *wink*

No, we have remembered that language design is not made up of trivial rules 
like “functions good, methods bad”, but of understanding the tradeoffs and how 
they apply in each case. 

> We have prior art here:
> 
>    b'abcd'.memoryview  # No, not this.
>    memoryview(b'abcd')  # That's the one.
> 
>    'abcd'.iter  # No, not that either.
>    iter('abcd')  # That's it
> 
> In fairness, I do have to point out that dict views do use a method 
> interface,

This is a secondary issue that I’ll come back to, but first: the whole thing 
that this started off with is being able to use slicing syntax even when you 
don’t want a copy.

The parallel to the prior art is obvious:

itertools.islice(seq, 10, 20) # if you don’t care about iterator or view
sliceviews.slice(seq, 10, 20) # if you do

The first one already exists. The second one takes 15 lines of code, which I 
slapped together and posted near the start of the thread.
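
The practical difference between the two can be sketched as follows.
itertools.islice is the real function; View is a hypothetical stand-in
written for this example, not the class posted earlier in the thread
and not an existing sliceviews module.

```python
from itertools import islice

seq = list(range(100))

# islice gives a one-shot iterator: no len(), no indexing, consumed once.
it = islice(seq, 10, 20)
first_pass = list(it)
second_pass = list(it)


class View:
    """Hypothetical function-style slice view: index arithmetic and delegation."""

    def __init__(self, seq, start, stop):
        self.seq = seq
        self.indices = range(*slice(start, stop).indices(len(seq)))

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        # range raises IndexError past the end, which also ends iteration.
        return self.seq[self.indices[i]]


v = View(seq, 10, 20)
once = list(v)
twice = list(v)     # a view can be iterated again
```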

The only problem is that they don’t solve the problem of “use slicing syntax”. 
But if that’s the entire point of the proposal (at least for Chris), that’s a 
pretty big problem.

Now, as we’d already been discussing (and as you quoted), you _could_ have a 
callable like this:

viewslice(seq)[10:20]

I can write that in only a few more lines than what I posted before, and it 
works. But it’s no longer parallel to the prior art. It’s not a function that 
returns a view, it’s a wrapper object that can be sliced to provide a view. 
There are pros and cons of this wrapper object vs. the property, but a false 
parallel with other functions is not one of them.

> 1. Dict views came with a lot of backwards-compatibility baggage; 
> they were initially methods that returned lists; then methods 
> that returned iterators were added, then methods that returned 
> views were added, and finally in 3.x the view methods were renamed and 
> the other six methods were removed.

This is, if anything, a reason they _shouldn’t_ have been methods. Changing the 
methods from 2.6 to 2.7 to 3.x, and in a way that tools like six couldn’t even 
help without making all of your code a bit uglier, was bad, and wouldn’t have 
been nearly as much of a problem if we’d just made them all functions in 2.6.

And yet, the reasons for them being methods were compelling enough that they 
remain methods in 3.x, despite that problem. That’s how tradeoffs work.

> 2. There is only a single builtin mapping object, dict, not like 
> sequences where there are lists, tuples, range objects, strings, byte 
> strings and bytearrays.

Well, there’s also mappingproxy, which is a builtin even if its name is only 
visible in types. And there are other mappings in the stdlib, as well as 
popular third-party libraries like SortedContainers. And they all support these 
methods. There are some legacy third-party libraries never fully updated for 
3.x still out there, but they don’t meet the Mapping protocol or its ABC.

So, how does this distinction matter?

Note that there is a nearly opposite argument for the wrapper object, one that 
someone already made and that seems a lot more compelling to me: third-party types. 
We can’t change them overnight. And some of them might already have an 
attribute named view, or anything else we might come up with. Those are real 
negatives with the property design, in a way that “more of the code we _can_ 
easily change is in the Objects rather than Lib 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Christopher Barker
TL;DR: no need to come to consensus about the most "Pythonic" API for a
sequence view -- due to potential name clashes, adding a dunder is pretty
much the only option.

Details below:

On Fri, May 15, 2020 at 3:50 AM Steven D'Aprano  wrote:

>  I think lst.view[10:20] fits that bill.
>
> Have we forgotten how to look at prior art all of a sudden? Suddenly
> been possessed by the spirits of deceased Java and Ruby programmers
> intent on changing the look and feel of Python to make it "real object
> oriented"? *wink*
>

I know you winked there, but frankly, there isn't a clear most Pythonic API
here. Surely you don't think Python should have no methods?


> We have prior art here:
> b'abcd'.memoryview  # No, not this.
> memoryview(b'abcd')  # That's the one.
>

That's not the ideal example, memoryviews are a real oddball -- they are
designed very much to be supported by third party applications. And a
memoryview object is a type, unlike, say, len(). Though I guess we'd be
talking about a "view" type here as well.

> 'abcd'.iter  # No, not that either.
> iter('abcd')  # That's it.
>

That's closer -- it certainly could have been added to the Iterable ABC.

> In fairness, I do have to point out that dict views do use a method
> interface, but:
>
> 1. Dict views came with a lot of backwards-compatibility baggage;
>

I think this is really the key point here. Not so much the "baggage", but
the fact that dicts (and therefore Mappings) have always had .keys, .values,
and .items. So adding the views didn't add or remove any attributes.

> 2. There is only a single builtin mapping object, dict, not like
> sequences where there are lists, tuples, range objects, strings, byte
> strings and bytearrays.
>

True, but there are multiple Mapping objects in the Standard Library, and it
is the intent that the Mapping ABCs can be used by third party classes --
so I'm not sure that is such a big distinction.

> 3. Dicts need three kinds of view, keys/items/values, not just one;
> adding three new builtin functions just for dicts is perhaps a bit
> excessive.
>

Well, one could have created a MappingView object with three attributes.
And maybe a full Mapping view would be useful, though I can't think of a
use case at the moment.

> 1. No backwards-compatibility baggage; we can pick the interface which
> is the most Pythonic. That's a protocol based on a dunder, not a
> method.
>

I disagree here, but come to the same conclusion: adding an attribute to
the Sequence ABC will break backward compatibility -- any Sequence subclass
that already has an attribute with that name would break.

We can all argue about what the most Pythonic API is, but the fact is that
Python has both "OO" APIs and "function-based" APIs. So either one could be
acceptable. But when adding a new name, there is a different impact
depending on what namespace it is added to:

A) Adding a reserved word is a Really Big Deal -- only done when absolutely
necessary. (and completely off the table for this one)

B) Adding a name to an ABC is a Big Deal -- it could potentially invalidate
any subclasses of that ABC -- so suddenly subclasses that worked perfectly
fine would be broken. And in the case at hand, numpy arrays do, in fact,
already have a .view method that is not the same thing.

C) Adding a builtin name is a Medium Deal, but not too huge -- existing
code might overwrite it, but that's only an issue if they want to use the
new functionality.

D) Adding a new name to a standard library module is a Small Deal -- no third
parties should be adding stuff to that namespace anyway (and import * is
not recommended). (Not that adding new functionality to the stdlib isn't a
lift -- but I'm only talking about names now.)

E) Adding a new dunder is a Medium Deal -- the dunder names are explicitly
documented as being reserved -- so while folks may (and do) use dunder
names in third party libraries, it's their problem if something breaks
later on. (For instance, numpy uses a lot of dunders -- though AFAICT, they
are all "__array_*", so kinda a numpy namespace.)
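
To make the name-clash point concrete, here is a small sketch. `Grid` is a made-up third-party class (not numpy), and the `seq.view[...]` API used by `middle` is the proposed one under discussion, assumed here purely for illustration.

```python
from collections.abc import Sequence


class Grid(Sequence):
    """Hypothetical third-party sequence that already defines .view()."""

    def __init__(self, cells):
        self._cells = list(cells)

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, index):
        return self._cells[index]

    def view(self, convert):
        # Pre-existing method with an unrelated meaning.
        return [convert(cell) for cell in self._cells]


def middle(seq):
    """Generic code written against an assumed new `seq.view[...]` API."""
    return seq.view[1:-1]


grid = Grid([1, 2, 3, 4])
try:
    middle(grid)
    clash = None
except TypeError as exc:
    clash = exc  # the existing bound method is not subscriptable

print(grid.view(str), type(clash).__name__)
```

A hook spelled `__view__` would not collide with `Grid.view`, since the dunder names are reserved.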

Taking all that into account, if we want to add "something" to Sequence
behavior (in this case a sequence_view object), then adding a dunder is
really the only option -- you'd need a really compelling reason to add a
Sequence method, and since there are quite a few folks that think that's
the wrong approach anyway, we don't have a compelling reason.

So IF a sequence_view is to be added, then a dunder is really the only
option.

Then we need to decide on where to put the view-creating-function (and what
to call it).

I personally would like to see it as a built in, but I suspect we won't get
a lot of support for that on this list.

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-15 Thread Steven D'Aprano
On Thu, May 14, 2020 at 09:47:36AM -0700, Andrew Barnert wrote:
> On May 14, 2020, at 03:01, Steven D'Aprano  wrote:
> > 
> > On Mon, May 11, 2020 at 10:41:06AM -0700, Andrew Barnert via Python-ideas 
> > wrote:
> > 
> >> I think in general people will expect that a slice view on a sequence 
> >> acts like “some kind of sequence”, not like the same kind they’re 
> >> viewing—again, they won’t be surprised if you can’t insert into a 
> >> slice of a list.
> > 
> > o_O
> > 
> > For nearly 30 years, we've been able to insert into a slice of a list. 
> > I'm going to be *really* surprised if that stops working
> 
> Which is exactly why Christopher said from the start of this thread, 
> and everyone else has agreed at every step of the way, that we can’t 
> change the default behavior of slicing, we have to instead add some 
> new way to specifically ask for something different.

Which is why I was so surprised that you suddenly started talking about 
not being able to insert into a slice of a list rather than a view.


> Not only that, but whatever gives 
> you view-slicing must look sufficiently different that you notice the 
> difference—and ideally that gives you something you can look up if you 
> don’t know what it means. I think lst.view[10:20] fits that bill.

Have we forgotten how to look at prior art all of a sudden? Suddenly 
been possessed by the spirits of deceased Java and Ruby programmers 
intent on changing the look and feel of Python to make it "real object 
oriented"? *wink*

We have prior art here:


b'abcd'.memoryview  # No, not this.
memoryview(b'abcd')  # That's the one.

'abcd'.iter  # No, not that either.
iter('abcd')  # That's it.


In fairness, I do have to point out that dict views do use a method 
interface, but:

1. Dict views came with a lot of backwards-compatibility baggage; 
they were initially methods that returned lists; then methods 
that returned iterators were added, then methods that returned 
views were added, and finally in 3.x the view methods were renamed and 
the other six methods were removed.

2. There is only a single builtin mapping object, dict, not like 
sequences where there are lists, tuples, range objects, strings, byte 
strings and bytearrays.

3. Dicts need three kinds of view, keys/items/values, not just one; 
adding three new builtin functions just for dicts is perhaps a bit 
excessive.


So if we're to add a generic sequence view object, none of those factors 
are relevant:

1. No backwards-compatibility baggage; we can pick the interface which 
is the most Pythonic. That's a protocol based on a dunder, not a 
method.

2. At least six builtins, not one.

3. Only one kind of sequence view, not three.


-- 
Steven
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JZ3CU2KSHEUZUYNFFP2DTZUXQU5KSKWD/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Andrew Barnert via Python-ideas
On May 14, 2020, at 11:53, Ricky Teachey  wrote:
> 
>> So that means a view() function (with maybe a different name) -- however, 
>> that brings up the issue of where to put it. I'm not sure that it warrants 
>> being in builtins, but where does it belong? Maybe the collections module? 
>> And I really think the extra import would be a barrier.
>> 
> 
> It occurs to me-- and please quickly shut me down if this is a really dumb 
> idea, I won't be offended-- `memoryview` is already a top-level built-in. I 
> know it has a near completely different meaning with regards to bytes objects 
> than we are talking about with a sequence view object. But could it do double 
> duty as a creator of views for sequences, too?

But bytes and bytearray are Sequences, and maybe other things that support the 
buffer protocol are too.

At first glance, it sounds terrible that the same function gives you a locking 
buffer view for some sequences and an indirect regular sequence view for 
others, and that there’s no way to get the latter for bytes even when you 
explicitly want that. But maybe in practice it wouldn’t be nearly as bad as it 
sounds? I don’t know. It sounds terrible in theory that NumPy arrays are almost 
but not quite Sequences, but in practice I rarely get confused by that. Maybe 
the same would be true here?

There’s also the problem that “memoryview” is kind of a misleading name if you 
apply it to, say, a range instead of a list. But again, I’m not sure how bad 
that would be in practice.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BMB5DAW67NRODTH46NXIZ55D4VDRBO2Y/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Andrew Barnert via Python-ideas
On May 14, 2020, at 10:45, Rhodri James  wrote:
> 
> On 14/05/2020 17:47, Andrew Barnert via Python-ideas wrote:
>> Which is exactly why Christopher said from the start of this thread,
>> and everyone else has agreed at every step of the way, that we can’t
>> change the default behavior of slicing, we have to instead add some
>> new way to specifically ask for something different.
> 
> Erm, did someone actually ask for something different?  As far as I can tell 
> the original thread OP was asking for islice-maker objects, which don't 
> require the behaviour of slicing to change at all.  Quite where the demand 
> for slice views has come from I'm not at all clear.

That doesn’t make any difference here.

If you want slicing sequences to return iterators rather than copies, that 
would break way too much code, so it’s not going to happen. A different 
method/property/class/function that gives you iterators would be fine.

If you want slicing sequences to return views rather than copies, that would 
break way too much code, so it’s not going to happen. A different 
method/property/class/function that gives you views would be fine.

Which is why nobody has proposed changing what list.__getitem__, etc. will do.

As for where views came from: because they do everything iterators do plus 
things they don’t, and in this case they’re about as easy to implement.

It’s really the same thing as dict.items. People wanted a dict.items that 
didn’t copy the whole thing into a giant list. The first suggestion was for an 
iterator. But that would break too much code, so it couldn’t be done until 3.0. 
But it was still so useful that it was worth having before 3.x, so it was added 
to 2.2 with a distinct name, iteritems. But then people realized they could 
have a view just as easily as an iterator, and it would do more, so that’s what 
actually went into 3.0. And that turned out to be so useful that it was worth 
having before 3.x, so, even though iteritems already existed in 2.x, viewitems 
was added alongside it in 2.7. 

I’m just trying to jump to the end here. Some of the issues aren’t the same 
(should it be a function or an attribute, is it worth having custom 
implementations for some builtin types, …), but some of them are, so we can 
learn from the past instead of repeating the same process. We can just build 
the equivalent of viewitems right off the bat, and not even think about 
changing plain slicing (because we never want another 3.0 break).

(Of course there may still be good arguments for why this isn’t the same, or 
for why it should end up differently even if it _is_ the same.)
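
To make the iterator-versus-view distinction concrete, here is a small sketch using only Python 3's dict.items(). The variable names are just for illustration. The view has a length, supports membership tests, and reflects later changes to the dict, while an iterator made from it is exhausted after one pass.

```python
d = {"a": 1, "b": 2}

items_view = d.items()        # a view: no copy of the dict's contents
items_iter = iter(d.items())  # an iterator: single pass only

first_pass = list(items_iter)
second_pass = list(items_iter)  # already exhausted, so this is empty

d["c"] = 3                      # mutate the dict after creating the view

print(len(items_view))          # the view reflects the new key
print(("c", 3) in items_view)   # membership test, which an iterator lacks
print(first_pass, second_pass)
```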
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TLTTZXWFP3QM6WRKEGF246RK6WYJSEG7/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Ricky Teachey
>
> So that means a view() function (with maybe a different name) -- however,
> that brings up the issue of where to put it. I'm not sure that it warrants
> being in builtins, but where does it belong? Maybe the collections module?
> And I really think the extra import would be a barrier.
>
>
It occurs to me-- and please quickly shut me down if this is a really dumb
idea, I won't be offended-- `memoryview` is already a top-level built-in. I
know it has a near completely different meaning with regards to bytes
objects than we are talking about with a sequence view object. But could it
do double duty as a creator of views for sequences, too?

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home
or actually going home." - Happy Chandler
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UB6ATNLI3LPIAC75PDGLFMD6GACEBV4X/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Andrew Barnert via Python-ideas
On May 14, 2020, at 03:35, Steven D'Aprano  wrote:
> 
> On Sun, May 10, 2020 at 09:36:14PM -0700, Andrew Barnert via Python-ideas 
> wrote:
> 
> 
>>> for i in itertools.seq_view(a_list)[::2]:
>>>...
>>> 
>>> I still think I prefer this though:
>>> 
>>> for i in a_list.view[::2]:
>>>...
> 
>> Agreed. A property on sequences would be best,
> 
> Why?

Because the whole point of this is for something to apply slicing syntax to. 
And compare:

lst.view[10:20]
view(lst)[10:20]
view(lst, 10, 20)

The last one is clearly the worst, because it doesn’t let you use slicing 
syntax.

The others are both OK, but the first seems the most readable. I’ll give more 
detailed reasons below. (There may be reasons why it can’t or shouldn’t be 
done, which is why I ranked all of the options in order rather than just 
insisting that we must have the first one or I hate this whole idea.)

> This leads to the same problem that len() solves by being a function, 
> not a method on list and tuple and str and bytes and dict and deque and 
> ... Making views a method or property means that every sequence type 
> needs to implement its own method, or inherit from the same base class, 

But len doesn’t solve that problem at all, and isn’t meant to. It just means 
that every sequence type has to implement __len__ instead of every sequence 
type having to implement len.

Protocols often provide some added functionality. iter() doesn’t just call 
__iter__, it can also fall back to old-style sequence methods, and it has the 
2-arg form. Similarly, str() falls back to __repr__, and has other parameter 
forms, and doubles as the constructor for the string type. And next() even 
changed from being a normal method to a protocol and function, breaking 
backward compatibility, specifically to make it easier to do the 2-arg form.
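
A quick sketch of the extra behavior those protocol functions add, using only builtins and io.StringIO. The classes `OldStyle` and `OnlyRepr` are made up for the example.

```python
import io


class OldStyle:
    """No __iter__: iter() falls back to __getitem__ with 0, 1, 2, ..."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


fallback_items = list(iter(OldStyle()))

# 2-arg iter(): call the callable until it returns the sentinel.
stream = io.StringIO("ab\ncd\n")
lines = list(iter(stream.readline, ""))

# 2-arg next(): a default instead of StopIteration.
exhausted_default = next(iter([]), "empty")

# str() falls back to __repr__ when __str__ is not overridden.
text = str(OnlyRepr())

print(fallback_items, lines, exhausted_default, text)
```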

But len() isn’t like that. There is no fallback, no added behavior, nothing. It 
doesn’t add anything. So why do we have it? Guido’s argument is in the FAQ. It 
starts off with “For some operations, prefix notation just reads better than 
postfix”. He then backs up the general principle that this is sometimes true by 
appeal to math. And then he explains the reasons this is one of those 
operations by arguing that “len” is the most important piece of information 
here so it belongs first.

It’s the same principle here, but the specific answer is different. View-ness 
is not more important than the sequence and the slicing, so it doesn’t call out 
to be fronted. In fact, view-ness is (at least in the user’s mind) strongly 
tied to the slicing, so it calls out to be near the slice.

And it’s not like this is some unprecedented thing. Most of the collection 
types, and corresponding ABCs, have regular methods as well as protocol 
dunders. Is anyone ever confused by having to write xs.index(x) instead of 
index(xs, x)? I don’t think so. In fact, I think the latter would be _more_ 
confusing, because “index” has so many different meanings that “list.index” is 
useful to nail it down. (Notice that we already _have_ a dunder named 
__index__, and it does something totally different…) And the same is true for 
“view”. In fact, everything in your argument is so generic that it acts as an 
argument against not just .index() but against any public methods or attributes 
on anything. Obviously you didn’t intend it that way, but once you actually 
target it so that it argues against .len() but not .index(), I don’t think 
there’s any argument against .view left.

> and that's why in the Java world nobody agrees what method to call to 
> get the length of an object.

Nobody can agree on what function to call in C or PHP even though they’re 
functions rather than methods in those languages.

Everyone can agree on what method to use in C++ and Smalltalk even though 
they’re methods in those languages, just like Java. (In fact, C++ even loosely 
enforces consistency the same way Python loosely does, except at compile time 
instead of run time—if your class doesn’t have a size() method, it doesn’t duck 
type as a collection and therefore can’t be used in templates that want a 
collection.)

Or just look at Python: nobody is confused about how to spell the .index method 
even though it’s a method.

So the problem in Java has nothing to do with methods. (We don’t have to get 
into what’s wrong with Java here; it’s not relevant.)

> So if we are to have a generic view proxy object, as opposed to the very 
> much non-generic dict views, then it ought to be a callable function 

We don’t actually _know_ how generic it can/should be yet. That’s something 
we’ve been discussing in this thread. It might well be a 
quality-of-implementation issue that has different best answers in different 
Pythons. Or it might not. It’s not obvious. Which implies that whatever the 
answer is, it’s not something that people should have to grasp to understand 
the feature.

You wouldn’t want users to base their understanding of iter on 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Rhodri James

On 14/05/2020 17:47, Andrew Barnert via Python-ideas wrote:

> Which is exactly why Christopher said from the start of this thread,
> and everyone else has agreed at every step of the way, that we can’t
> change the default behavior of slicing, we have to instead add some
> new way to specifically ask for something different.


Erm, did someone actually ask for something different?  As far as I can 
tell the original thread OP was asking for islice-maker objects, which 
don't require the behaviour of slicing to change at all.  Quite where 
the demand for slice views has come from I'm not at all clear.


--
Rhodri James *-* Kynesim Ltd
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JHBSBYVEF27GANMTQWIARUUTOASNNPUU/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Andrew Barnert via Python-ideas
On May 14, 2020, at 03:01, Steven D'Aprano  wrote:
> 
> On Mon, May 11, 2020 at 10:41:06AM -0700, Andrew Barnert via Python-ideas 
> wrote:
> 
>> I think in general people will expect that a slice view on a sequence 
>> acts like “some kind of sequence”, not like the same kind they’re 
>> viewing—again, they won’t be surprised if you can’t insert into a 
>> slice of a list.
> 
> o_O
> 
> For nearly 30 years, we've been able to insert into a slice of a list. 
> I'm going to be *really* surprised if that stops working

Which is exactly why Christopher said from the start of this thread, and 
everyone else has agreed at every step of the way, that we can’t change the 
default behavior of slicing, we have to instead add some new way to 
specifically ask for something different.

Well, not _just_ this. There’s also the fact that for 30 years people have been 
using [:] to mean copy, and the fact that for 30 years people have taken small 
slices of giant lists and then expected the giant lists to get collected, and 
so on. But any one of these is enough reason on its own that copy-slicing must 
remain the default behavior you get from lst[10:20]. Not only that, but 
whatever gives you view-slicing must look sufficiently different that you 
notice the difference—and ideally that gives you something you can look up if 
you don’t know what it means. I think lst.view[10:20] fits that bill.
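
For reference, a short sketch of the existing behaviors that have to keep working, using a plain list. The variable names are only for illustration.

```python
lst = list(range(10))

copy = lst[:]            # [:] is the long-standing idiom for a shallow copy
copy[0] = 99             # changing the copy leaves the original untouched

part = lst[2:5]          # a new, independent list
lst[2] = -1              # so later changes to lst are not visible in part

lst[3:3] = ["x", "y"]    # slice assignment inserts into the list in place

print(lst)
print(copy[0], part)
```

Because `part` is an independent copy, dropping the last reference to a giant `lst` lets it be collected, which a view that holds on to the original would prevent.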

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FSAZWPEV3LA3K2CP46GMLABDIOCM7FSL/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Christopher Barker
On Thu, May 14, 2020 at 3:32 AM Steven D'Aprano  wrote:

> On Sun, May 10, 2020 at 09:36:14PM -0700, Andrew Barnert via Python-ideas
> wrote:
> > > for i in itertools.seq_view(a_list)[::2]:
> > > ...
> > >
> > > I still think I prefer this though:
> > >
> > > for i in a_list.view[::2]:
>
> > Agreed. A property on sequences would be best,
>
> Why?
>
> This leads to the same problem that len() solves by being a function,
> not a method on list and tuple and str and bytes and dict and deque and
> ... Making views a method or property means that every sequence type
> needs to implement its own method, or inherit from the same base class,
> and that's why in the Java world nobody agrees what method to call to
> get the length of an object.
>

I'm not a Java guy -- but I'm not sure that's the problem -- it sounds to
me like Java does have clearly defined Interfaces (that's the Java concept,
yes?) for Sequences, or, in this case, "Sized". Granted, len() as a builtin
pre-dates ABCs, but we do have them now, and we do for a reason.

>
> Python has a long history of using protocols and optional dunders for
> things like this:
>
> https://lucumr.pocoo.org/2011/7/9/python-and-pola/


Nice post, and I agree for the most part, but frankly, I don't find it
convincing. For example:

"In Ruby collection objects respond to .size. But because it looks better
almost all of them will also respond .length."

And really? Java and Ruby both have these inconsistencies in the standard
library? WTF?

And why do we not have: "In Python, collection objects respond to the
len() function. But because it looks better, almost all of them will also
have a .size property"? I think it's more about "Python has a long history
of using protocols", which I interpret to mean "people follow standards"
(at least in the standard library!), rather than "you have to use built in
functions for any operation that might be applicable to more than one type".

And despite that history, the ABCs DO have a few regular methods, and a
bunch of mixins.

And frankly, I don't really see the difference in terms of ease of
implementation -- one way all Sequences need to implement .view (or use
the ABC mixin), and the other way they need to implement __view__ (or use
the ABC mixin).

Though as I write this, I realize, of course, that there IS an advantage to
__view__ -- the dunders are a reserved namespace, so no one should have a
custom Sequence that already has a .__view__ dunder. Whereas third party
Sequences *may* already have a .view attribute. Indeed, numpy arrays do,
and it does NOT mean the same thing that this would.

Funny, if I go back to that post, it turns out I didn't find the whole
"Java and Ruby haven't standardized on how to spell length" argument
compelling, but later, he talks about how the dunders are reserved -- and
THAT is, indeed, compelling.

So that means a view() function (with maybe a different name) -- however,
that brings up the issue of where to put it. I'm not sure that it warrants
being in builtins, but where does it belong? Maybe the collections module?
And I really think the extra import would be a barrier.

Going back to the whole functions and protocols vs methods argument, the
fact is that I don't think there IS a clear line between what belongs
where. Let's face it, I don't think any of us would like it if we had to do
something like:

from collections import keys, values

the_keys = keys(a_dict)

And I think we all agree that moving string functionality into str methods
was a really good idea.

In fact, I think that when adding stuff to builtins, particularly ABCs, we
are stuck not with deciding where it best belongs, but simply with the fact
that it can't be a method if it wasn't there near the beginning of Python's life.

> So if we are to have a generic view proxy object, as opposed to the very
> much non-generic dict views,


I still don't think that the "genericness" is the point here. A Sequence
view isn't really any more generic than a MappingView, the difference in
API is that Mappings have had .keys() and .values() and .items() forever,
so it was possible to change what they return without potentially breaking
other implementations. (that was also a py2-py3 change, which allowed more
breakage).

It would be kind of like changing what indexing with a slice meant -- but we
are not advocating that!

Growing a language is a challenge!

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/F4R7WQ25ANSTFHRXKO62JSY23LN57AUT/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Christopher Barker
On Thu, May 14, 2020 at 2:58 AM Steven D'Aprano  wrote:

> On Mon, May 11, 2020 at 10:41:06AM -0700, Andrew Barnert via Python-ideas
> wrote:
>
> > I think in general people will expect that a slice view on a sequence
> > acts like “some kind of sequence”, not like the same kind they’re
> > viewing—again, they won’t be surprised if you can’t insert into a
> > slice of a list.
>
> o_O
>
> For nearly 30 years, we've been able to insert into a slice of a list.
> I'm going to be *really* surprised if that stops working.
>

At this point, we're thinking a sequence view would be immutable anyway.
Even for views on mutable objects. So that's a non-issue.

In numpy, there really isn't a view object at all -- there are simply numpy
arrays, and any array *may* share the data block with another array. But
they are both "proper" arrays.

In [20]: import numpy as np


In [21]: A = np.ones((5,))


In [22]: B = A[:]


In [23]: type(A)

Out[23]: numpy.ndarray

In [24]: type(B)

Out[24]: numpy.ndarray

There is a small distinction: in the above case, A "owns" the data block,
but you can only tell if you poke into the flags:

In [27]: A.flags

Out[27]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [28]: B.flags

Out[28]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

And user code very rarely needs to care about that. That flag is mostly
used to manage the memory, and prevent dangerous operations:

In [30]: A.resize((3,4))

ValueError                                Traceback (most recent call last)
----> 1 A.resize((3,4))

ValueError: cannot resize an array that references or is referenced
by another array in this way.

There are good reasons for ndarrays being able to share data while still
being mutable, but I don't think a "normal" Python sequence_view should be
mutable -- it would lead to a lot of confusion.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O7AEPEGMIXNG3J6AHRSVF2ECQ2G75XW7/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Steven D'Aprano
On Sun, May 10, 2020 at 09:36:14PM -0700, Andrew Barnert via Python-ideas wrote:


> > for i in itertools.seq_view(a_list)[::2]:
> > ...
> > 
> > I still think I prefer this though:
> > 
> > for i in a_list.view[::2]:
> > ...
> 

> Agreed. A property on sequences would be best,

Why?

This leads to the same problem that len() solves by being a function, 
not a method on list and tuple and str and bytes and dict and deque and 
... Making views a method or property means that every sequence type 
needs to implement its own method, or inherit from the same base class, 
and that's why in the Java world nobody agrees what method to call to 
get the length of an object.

Python has a long history of using protocols and optional dunders for 
things like this:

https://lucumr.pocoo.org/2011/7/9/python-and-pola/


So if we are to have a generic view proxy object, as opposed to the very 
much non-generic dict views, then it ought to be a callable function 
which, *if necessary*, delegates to a `__view__` dunder.
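
A minimal sketch of what such a function might look like. Nothing here is an
existing API: `view`, `__view__`, `SeqView` and `Packet` are hypothetical names
used only for illustration, and the fallback view fixes its length when it is
created, which a real design would have to think about more carefully.

```python
from collections.abc import Sequence


class SeqView(Sequence):
    """Hypothetical generic read-only view over part of a sequence."""

    def __init__(self, seq, rng=None):
        self._seq = seq
        # A range of indices into seq; slicing a range is lazy and cheap.
        self._rng = range(len(seq)) if rng is None else rng

    def __len__(self):
        return len(self._rng)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return SeqView(self._seq, self._rng[key])
        return self._seq[self._rng[key]]


def view(obj):
    """Return a view of obj, delegating to __view__ only if the type has one."""
    # Look the dunder up on the type, as is done for other special methods.
    method = getattr(type(obj), "__view__", None)
    if method is not None:
        return method(obj)
    return SeqView(obj)


class Packet:
    """Toy type that opts in to the hypothetical protocol."""

    def __init__(self, payload):
        self.payload = payload

    def __view__(self):
        return memoryview(self.payload)


a_list = list(range(10))
evens = view(a_list)[::2]       # no copy of a_list is made
a_list[0] = 100                 # the view sees later changes
print(list(evens))              # [100, 2, 4, 6, 8]
print(type(view(Packet(b"abc"))).__name__)   # memoryview
```

With this shape, list, tuple, str and third-party sequences all work without
defining anything, and only types that can do better need the dunder.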


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3TLOEJ7HRGVQ75A5IG6QUJTTGNMEUXEG/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-14 Thread Steven D'Aprano
On Mon, May 11, 2020 at 10:41:06AM -0700, Andrew Barnert via Python-ideas wrote:

> I think in general people will expect that a slice view on a sequence 
> acts like “some kind of sequence”, not like the same kind they’re 
> viewing—again, they won’t be surprised if you can’t insert into a 
> slice of a list.

o_O

For nearly 30 years, we've been able to insert into a slice of a list. 
I'm going to be *really* surprised if that stops working.
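
For reference, this is the existing list behaviour being referred to, shown
with made-up values. It is ordinary slice assignment on the list itself, not on
any proposed view object.

```python
letters = ["a", "b", "e"]

letters[2:2] = ["c", "d"]    # assigning to an empty slice inserts
print(letters)               # ['a', 'b', 'c', 'd', 'e']

letters[1:3] = ["X"]         # replace two items with one; the list shrinks
print(letters)               # ['a', 'X', 'd', 'e']

del letters[:2]              # deleting a slice also works in place
print(letters)               # ['d', 'e']
```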



-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Y3ECQDA7P2SYB5CIRWHHLE6YBCUP7RDT/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-11 Thread Christopher Barker
On Mon, May 11, 2020 at 11:38 AM Andrew Barnert  wrote:

> On May 11, 2020, at 10:57, Alex Hall  wrote:
>
> 
> On Mon, May 11, 2020 at 12:50 AM Christopher Barker 
> wrote:
>
>
>> Though it is heading in a different direction than where Andrew was
>> proposing, that this would be about making and using views on sequences,
>> which really wouldn't make sense for any iterator.
>>
>
> The idea is that islice would be the default behaviour and classes could
> override that to return views if they want.
>

I'm still confused about this -- islice returns an iterator that iterates
over the passed-in iterable -- that is standard behavior for most tools in
itertools.

So I can see that it would be nice to have a slice syntax that would work on
all iterables, not just sequences, but I *think* that's a totally different
idea.

Anyway, thanks all for the input. When I get a chance, I'll update my
proposal with the input.

I think I'll go for Andrew's idea of a sequence_view object -- that would
give me my "lazy slice", and other nifty features.

> But maybe doing it _just_ for view slicing, rather than for everything,
> and requiring a wrapper object to use it, is a lot simpler, and useful
> enough on its own.

I'm not quite sure what a "view for slicing" means, but maybe it's what I'm
thinking about. But I would describe what I'm thinking about as a view
object that you can get with slicing syntax.

There are two key parts here -- we *could* have just an iterator with slice
syntax, and also a view without slice syntax, but I'm all for getting them
together.

Again, I welcome PRs on my notes and prototype code:

https://github.com/PythonCHB/islice-pep

I'd particularly welcome text about the motivation and use-cases for a
sequence view object -- my text is all about only the iterating part.

-CHB




-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BSRBJJHJ4VYV3T6TSNRWCYVVXMLOEJBT/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-11 Thread Andrew Barnert via Python-ideas
On May 11, 2020, at 10:57, Alex Hall  wrote:
> 
> 
>> On Mon, May 11, 2020 at 12:50 AM Christopher Barker  
>> wrote:
> 
>  
>> Though it is heading in a different direction than where Andrew was 
>> proposing, that this would be about making and using views on sequences, 
>> which really wouldn't make sense for any iterator.
> 
> The idea is that islice would be the default behaviour and classes could 
> override that to return views if they want.

It is possible to get both, but I don’t think it’s easy.

I think the ultimate unification of these ideas is the “views everywhere” 
design of Swift. Whether you have a sequence or just a collection or just a 
one-shot forward-only iterable, you use the same syntax and the same functions 
to do everything—copy-slicing, view-slicing, chaining, mapping, zipping, etc. 
And the result is always a view with as much functionality as makes sense (so 
filtering a sequence gives you a view that’s a reversible collection, not a 
sequence). So you can view-slice the result of a genexpr the same way you would 
a list, and you just get a forward-only iterable view instead of a full-fledged 
sequence view. I’ve started designing such a thing multiple times, every couple 
years or so, and always realize it’s even more work than I thought and harder 
to fit into Python than I thought and give up.

But maybe doing it _just_ for view slicing, rather than for everything, and 
requiring a wrapper object to use it, is a lot simpler, and useful enough on 
its own.

And that would fit well into the Python way of growing by adding stuff as 
needed, and only trying to come up with a complete and perfect general design 
up front when absolutely necessary.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4CQE7Q4TRJTQF66ZHMCPJMCLCUEXHEAT/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-11 Thread Alex Hall
On Mon, May 11, 2020 at 12:50 AM Christopher Barker 
wrote:

> I'm still confused what you mean by extend to all iterators? You mean that
> you could use slice syntax with anything iterable?
>
> And where does this fit in to the iterable vs iterator continuum?
>
> iterables will return an iterator when iter() is called on them. So are
> you suggesting that another way to get an iterator from an iterable would
> be to pass a slice somehow that would return an iterator off that slice?
>
> so:
>
> for i in an_iterable(a:b:c):
> ...
>
> would work for any iterable? and use an iterator that would iterate as
> specified by the slice?
>

Translate `an_iterable(a:b:c)` to `itertools.islice(an_iterable, a, b, c)`.
From there your questions can be answered by playing with itertools.islice.
It accepts any iterable or iterator and returns an iterator:

```
import itertools

s = itertools.islice([1, 2, 3], 2)
print(s)
assert s is iter(s)
s2 = itertools.islice(s, 1)
print(s2)
```
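
A further small sketch along the same lines, with made-up data: islice works
on a generator, which has no length and cannot be indexed, but for the same
reason it rejects negative indices. That is one place where "slice syntax on
any iterable" would have to differ from slicing a sequence.

```python
import itertools

# A generator: one-shot, no len(), no indexing.
gen = (n * n for n in range(100))

first = list(itertools.islice(gen, 2, 8, 2))
print(first)                      # [4, 16, 36]

# Negative start/stop/step cannot be supported without knowing the length.
rejected = False
try:
    itertools.islice(gen, -3, None)
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```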


> Though it is heading in a different direction than where Andrew was
> proposing, that this would be about making and using views on sequences,
> which really wouldn't make sense for any iterator.
>

The idea is that islice would be the default behaviour and classes could
override that to return views if they want.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DTPLAE5ZQVUTQSTOBKOVPNNMHDZIEPHN/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-11 Thread Andrew Barnert via Python-ideas
On May 10, 2020, at 21:51, Christopher Barker  wrote:
> 
> 
> On Sun, May 10, 2020 at 9:36 PM Andrew Barnert  wrote:
> 
>> However, there is one potential problem with the property I hadn’t thought 
>> of until just now: I think people will understand that mylist.view[2:] is 
>> not mutable, but will they understand that mystr.view[2:] is not a string? 
>> I’m pretty sure that isn’t a problem for seqview(mystr)[2:], but I’m not 
>> sure about mystr.view[2:].
> 
> One more issue around the whole "a string is sequence of strings" thing :-) 
> Of course, it *could* be a string -- not much difference with immutables.
> Though I suppose if you took a large slice of a large string, you probably 
> don't want the copy. But what *would* you want to do with it.

That “string is a sequence of strings” issue, plus the “nothing can duck type 
as a string“ issue.

Here’s an example that I can write in, say, Swift or Rust or even C++, but not 
in Python: I mmap a giant mailbox file, and I can treat that as a string 
without copying it anywhere. I split it into a string for each message—I don’t 
want to copy them all into a list of strings, and ideally I don’t even want to 
copy one at a time into an iterator of strings because some of them can be 
pretty huge; I want a list or iterator of views into substrings of the mmap. 
(This isn’t actually a great example, because even with substring views, the 
mmap can’t be used as a str in the first place, but it has the virtue of being 
a real example of code I’ve actually written.)
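
A rough Python approximation of that idea, under loud assumptions: it works on
bytes rather than str, it uses an in-memory bytes object in place of the mmap,
and the helper name `message_views` and the `b"\nFrom "` separator are made up
for illustration. It shows both what is possible today (zero-copy slices via
memoryview) and the limitation described above (the slices are not strings).

```python
def message_views(buf, sep=b"\nFrom "):
    """Yield one memoryview slice of buf per message, without copying bytes."""
    whole = memoryview(buf)
    start = 0
    while True:
        end = buf.find(sep, start)
        if end == -1:
            yield whole[start:]          # last message runs to the end
            return
        yield whole[start:end + 1]       # keep the trailing newline
        start = end + 1                  # next message starts at "From "


mailbox = b"From alice\nhello\nFrom bob\nbye\n"
views = list(message_views(mailbox))

print(len(views))                 # 2
print(bytes(views[1]))            # b'From bob\nbye\n' (copies only on demand)
print(views[0].obj is mailbox)    # True: the slice refers to the original buffer
print(hasattr(views[0], "split")) # False: a memoryview is not a bytes or str
```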

> but if you had a view of a slice, and it was a proper view, it might be 
> pretty poky for many string operations, so probably just as well not to have 
> them.

I think in general people will expect that a slice view on a sequence acts like 
“some kind of sequence”, not like the same kind they’re viewing—again, they 
won’t be surprised if you can’t insert into a slice of a list. It’s only with 
str that I’m worried they might expect more than we can provide, which sucks 
because str is the one place we _couldn’t_ provide it even if we wanted to.

But maybe I’m wrong and people won’t have this assumption, or will be easily 
cured of it.

___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CMHJKKVH2TQLED2W3KICEIQY43SBX27S/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Ricky Teachey
On Mon, May 11, 2020 at 1:52 AM Ricky Teachey  wrote:

> I have nothing particularly useful to add, only that this is potentially a
> really fantastic idea with a lot of promise, IMO.
>
> It would be nice to have some view objects with a lot of
> functionality that can be sliced not only for efficiency, but for other
> purposes. One might be (note that below I am assuming that slicing a view
> returns another view):
>
> nodes = [(0,0), (1,0), (1,1), (1,0)]
> triangle1 = [view_of_node_idx0, view_of_node_idx1, view_of_node_idx3]
> triangle2 = [view_of_node_idx1, view_of_node_idx2, view_of_node_idx3]
>
> Now if I move the node locations, the triangles reflect the update:
>
> nodes[:] = (1,1), (2,1), (2,2), (2,1)
>

After reading my sent message I decided it probably isn't totally clear:

What I mean here is, using a view to construct other objects, such that
when the viewed object is updated, the objects making use of the views
"see" the update (without having to implement callbacks and observer
patterns and all of that kind of thing).
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CG2V527IADMZ66YKBSZ67NPAI3YIV5ZH/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Ricky Teachey
I have nothing particularly useful to add, only that this is potentially a
really fantastic idea with a lot of promise, IMO.

It would be nice to have some view objects with a lot of functionality that
can be sliced not only for efficiency, but for other purposes. One might be
(note that below I am assuming that slicing a view returns another view):

nodes = [(0,0), (1,0), (1,1), (1,0)]
triangle1 = [view_of_node_idx0, view_of_node_idx1, view_of_node_idx3]
triangle2 = [view_of_node_idx1, view_of_node_idx2, view_of_node_idx3]

Now if I move the node locations, the triangles reflect the update:

nodes[:] = (1,1), (2,1), (2,2), (2,1)

Even tried implementing something like a simple sequence view myself once,
but got stuck trying to reliably slice slices and couldn't decide what it
should mean to return single values from the view (an atomic "slice"? just
return the value?), and there are probably all kinds of subtleties way
above my knowledge level to consider:

from itertools import islice

class SeqView:
    def __init__(self, seq, sl=slice(None)):
        self.seq = seq
        self.sl = sl
    def __repr__(self):
        return f"{self.__class__.__name__}({self.seq}, {self.sl})"
    def __str__(self):
        return f"{self.seq[self.sl]!s}"
    def __getitem__(self, key):
        if isinstance(key, slice):
            # NOTE: the combined-slice argument is missing in the archived
            # message; composing self.sl with key is the unsolved part.
            return self.__class__(self.seq, )
        # even if just returning the value, surely this could be much better?
        return list(islice(self.seq, self.sl.start, self.sl.stop,
                           self.sl.step))[key]
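
One possible way past the "slicing a slice" problem, offered as a sketch and
not as Ricky's code: keep a range of indices instead of a slice object, and let
range do the composition. It assumes a read-only view whose index range is
computed from the sequence's length when the view is created, and it answers
the "single value" question by simply returning the value.

```python
class SeqView:
    """Sketch of a read-only slice view built on range arithmetic."""

    def __init__(self, seq, sl=slice(None)):
        self.seq = seq
        # range handles negative indices, steps and clipping for us.
        self.indices = range(len(seq))[sl]

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, key):
        if isinstance(key, slice):
            new = SeqView(self.seq)
            new.indices = self.indices[key]    # slicing a range gives a range
            return new
        return self.seq[self.indices[key]]     # single index: just the value

    def __repr__(self):
        return f"{type(self).__name__}({list(self)!r})"


nodes = list(range(10))
v = SeqView(nodes, slice(1, None, 2))    # indices 1, 3, 5, 7, 9
w = v[::-1][1:3]                         # indices 7, 5
nodes[7] = "seven"                       # change the underlying list afterwards
print(list(w))                           # ['seven', 5]
print(v[-1])                             # 9
```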

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home
or actually going home." - Happy Chandler
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BRM3ZXEU6W3YUF47PKGDRZTJ6KJNIF3B/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Christopher Barker
On Sun, May 10, 2020 at 9:36 PM Andrew Barnert  wrote:




>> Here is where I think you (Andrew) and I (Chris B.) differ in our goals. My
>> goal here is to have an easily accessible way to use the slice syntax to
>> get an iterable that does not make a copy.
>>
>
> It’s just a small difference in emphasis. I want a way to get a
> non-copying slice, and I’d really like it to be easily accessible—I‘d
> grumble if you didn’t make it a member, but I’d still use it.
>

Hmm -- I wasn't sure how key the "slice" part was -- there are, of course,
other uses for views. But we're on the same page as to preferences.


> However, there is one potential problem with the property I hadn’t thought
> of until just now: I think people will understand that mylist.view[2:] is
> not mutable, but will they understand that mystr.view[2:] is not a string?
> I’m pretty sure that isn’t a problem for seqview(mystr)[2:], but I’m not
> sure about mystr.view[2:].
>

One more issue around the whole "a string is sequence of strings" thing :-)
Of course, it *could* be a string -- not much difference with immutables.
Though I suppose if you took a large slice of a large string, you probably
don't want the copy. But what *would* you want to do with it?

But if you had a view of a slice, and it was a proper view, it might be
pretty poky for many string operations, so probably just as well not to
have them.

> But notice that if you use the Mapping mixin to define the methods for you,
> it does make sure you get the right views. Maybe that’s sort of a precedent
> for what you’re looking to do?


yup -- that does sound like a similar idea.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QASKMRIRBYIOBBYU7LHJPMJ7PLXOYLJ4/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Andrew Barnert via Python-ideas
On May 10, 2020, at 15:39, Christopher Barker  wrote:
> 
> 
>> On Sun, May 10, 2020 at 12:48 PM Andrew Barnert  wrote:
> 
>> Is there any way you can fix the reply quoting on your mail client, or 
>> manually work around it?
> 
> I'm trying -- sorry I've missed a few. It seems more and more "modern" email 
> clients make "interspersed" posting really hard. But I hate bottom posting 
> maybe even more than top posting :-( (gmail seems to have magically gotten 
> worse in this regard recently)

It seems like the one place Google still sees (the remnants of) Yahoo as a 
competitor is who can screw up mailing lists worse.

> It's also interesting to note (from another part of this thread) that slicing 
> isn't part of the Sequence ABC, or any? "official" protocol?

If we still had separate __getitem__ and __getslice__ when ABCs and the idea of 
being clearer about protocols had come along, I’ll bet __getslice__ would have 
been made part of the protocol. But I suppose it’s a little too late for me to 
complain about a change that I think went in even before new-style classes. :)

> I do see this, though not entirely sure what to make of it:
> 
> https://docs.python.org/3/c-api/sequence.html?highlight=sequence

Yeah, the fact that sequences and mappings have identical methods means that 
from Python those two protocols are opt-in rather than automatic, while from C 
you have to be more prepared for errors after checking than with other 
protocols. Annoying, but not using the same syntax and dunders for indexing and 
keying would be a lot more annoying.

> > Also, notice that this is true for all of the existing views, and none of 
> > them try to be un-featureful to avoid it.
> 
> But there is no full featured mapping-view that otherwise acts much like a 
> mapping.

types.MappingProxyType. In most cases, type(self).__dict__ will get you one of 
these.

But of course this is a view of the whole dict, not a subset.
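
A short illustration with made-up data: the proxy is a live, read-only view of
the whole mapping, and a class `__dict__` is one of these.

```python
from types import MappingProxyType

settings = {"debug": False}
proxy = MappingProxyType(settings)

settings["debug"] = True          # change the underlying dict
print(proxy["debug"])             # True: the proxy sees the change

read_only = False
try:
    proxy["debug"] = False        # the proxy itself cannot be modified
except TypeError as exc:
    read_only = True
    print("read-only:", exc)


class C:
    x = 1

print(type(C.__dict__) is MappingProxyType)   # True
```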

> in theory, there *could* be -- if there was some nice way to specify a subset 
> of a mapping without copying the whole thing -- I can't think of one at the 
> moment.

Not in the stdlib, but for a SortedDict type, key-slicing makes total sense, 
and many of them do it—although coming up with a nice API is hard enough that 
they all seem to do it differently. (Obviously d[lo:hi] should be some iterable 
of the values from the keys lo<=key [...]

>> I think the biggest question is actually the API. Making this a function (or 
>> a class that most people think of as a function, like most of itertools) is 
>> easy, but as soon as you say it should be a method or property of sequences, 
>> that’s trickier. You can add it to all the builtin sequence types, but 
>> should other sequences in the stdlib have it? Should Sequence provide it as 
>> a mixin? Should it be part of the sequence protocol, and therefore checked 
>> by Sequence as an ABC (even though that could be a breaking change)?
> 
> Here is where I think you (Andrew) and I (Chris B.) differ in our goals. My 
> goal here is to have an easily accessible way to use the slice syntax to get 
> an iterable that does not make a copy.

It’s just a small difference in emphasis. I want a way to get a non-copying 
slice, and I’d really like it to be easily accessible—I‘d grumble if you didn’t 
make it a member, but I’d still use it.

> While we're at it, getting a sequence view that can provide an iterator, and 
> all sorts of other nifty features, is great. But making it a callable in 
> itertools (or any other module) wouldn't accomplish that goal.
> 
> Hmm, but maybe not that bad: 
> 
> for i in itertools.seq_view(a_list)[::2]:
> ...
> 
> I still think I prefer this though:
> 
> for i in a_list.view[::2]:
> ...

Agreed. A property on sequences would be best, a wrapper object that takes 
slice syntax clearly back in second, and a callable that takes only islice 
syntax a very distant third. So if the first one is possible, I’m all for it.

My slices repo provides the islice API just because it’s easier for slapping 
together a proof of concept of the slicing part, definitely not because I’d 
want that added to the stdlib as-is.

However, there is one potential problem with the property I hadn’t thought of 
until just now: I think people will understand that mylist.view[2:] is not 
mutable, but will they understand that mystr.view[2:] is not a string? I’m 
pretty sure that isn’t a problem for seqview(mystr)[2:], but I’m not sure about 
mystr.view[2:].

> So to all those questions: I say "yes" except maybe: 
> 
> "checked by Sequence as an ABC (even though that could be a breaking change)" 
> -- because, well, breaking changes are "Not good".
> 
> I wonder if there is a way to make something standard, but not quite break 
> things -- hmm.
> 
> For instance: It seems to be possible to have Sequence provide it as a mixin, 
> but not have it checked by Sequence as an ABC?

Actually, now that I think about it, Sequence _never_ checks methods. Most of 
the 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Christopher Barker
On Sat, May 9, 2020 at 1:58 PM Alex Hall  wrote:

>
> https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst
>>
>> And the prototype implementation:
>>
>> https://github.com/PythonCHB/islice-pep/blob/master/islice.py
>>
>
>  I think this is a good idea. For sequences I'm not sure how big the
> benefit is - I get that it's more efficient, but I rarely care that much,
> because most lists are small. Why not extend the proposal to all iterators,
> or at least common ones like generators? That would allow avoiding
> itertools when I have no other choice.
>

I'm still confused what you mean by extend to all iterators? You mean that
you could use slice syntax with anything iterable?

And where does this fit in to the iterable vs iterator continuum?

iterables will return an iterator when iter() is called on them. So are you
suggesting that another way to get an iterator from an iterable would be to
pass a slice somehow that would return an iterator off that slice?

so:

for i in an_iterable(a:b:c):
...

would work for any iterable? and use an iterator that would iterate as
specified by the slice?

That is kind of cool.

Though it is heading in a different direction than where Andrew was
proposing, that this would be about making and using views on sequences,
which really wouldn't make sense for any iterator.

And, of course, adding syntax is a heavier lift (though extending a major
built in protocol is not a small lift by any means)

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JCFTPY7U2T3QZGWRG7K4LOJ3XBIYRVZZ/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Christopher Barker
On Sun, May 10, 2020 at 12:48 PM Andrew Barnert  wrote:

> Is there any way you can fix the reply quoting on your mail client, or
> manually work around it?
>

I'm trying -- sorry I've missed a few. It seems more and more "modern"
email clients make "interspersed" posting really hard. But I hate bottom
posting maybe even more than top posting :-( (gmail seems to have magically
gotten worse in this regard recently)

> If you don’t keep views around, because you’re only using them for more
> efficient one-shot iteration, you might never think about that, but then
> you’ll never notice it to be surprised by it. The dynamic behavior of dict
> views presumably hasn’t ever surprised you in the 12 years it’s worked that
> way.

True -- though the dict views aren't mappings themselves, and thus
*maybe* less useful, but certainly less tempting to use where you might
have otherwise used the original dict or a copy. But sure -- if you have to
go out of your way to get it, then you should know what the implications
are.


>> And you probably don't want to lock the "host" anyway -- that could be
>> very confusing if the view is kept all be somewhere far from the code
>> trying to change the sequence.
>>
>

> Yes. I think memoryview’s locking behavior is a special case, not
> something we’d want to emulate here. I’m guessing many people just never
> use memoryview at all, but when you do, you’re generally thinking about raw
> buffers rather than abstract behavior. (It’s right there in the name…) And
> when you need something more featureful than an invisible hard lock on the
> host, it’s time for numpy. :)
>

Yeah, memoryviews are a pretty special case, I don't think they are really
intended to be used much in "user code" rather than libraries with pretty
special cases.

> The docs explain it reasonably well. See
> https://docs.python.org/3/glossary.html#term-dictionary-view for the
> basic idea,  https://docs.python.org/3/library/stdtypes.html#dict-views for
> the details on the concrete types, and I think the relevant ABCs and data
> model entries are linked from there.


I was surprised to see that there are ABCs for the Mapping Views as well --
that does make it clear.

> The point of collections.abc.Set, and ABCs in general, and the whole
> concept of protocols, is that the set protocol can be implemented by
> different concrete types—set, frozenset, dict_keys, third-party types like
> sortedcontainers.SortedSet or pyobjc.Foundation.NSSet, etc.—that are
> generally completely unrelated to each other, and implemented in different
> ways—a
>

That I knew -- what surprised me was that the "standard" set methods aren't
part of the ABC.

It's also interesting to note (from another part of this thread) that
slicing isn't part of the Sequence ABC, or any? "official" protocol?

I do see this, though not entirely sure what to make of it:

https://docs.python.org/3/c-api/sequence.html?highlight=sequence


Anyway, a Sequence view is simpler, because it could probably simply be an
immutable sequence -- not much need for contemplating every bit of the API.


> It’s really the same thing, it’s just the Sequence protocol rather than the
> Set protocol.

Well, dict_keys is a set, and dict_items is *sometimes* a set, and
dict_values is not a set (but is a Sized Collection).

> If anything, it’s _less_ simple, because for sequences you have to decide
> whether indexing should work with negative indices, extended slices, etc.,
> which the protocol is silent about. But the answer there is pretty
> easy—unless there’s a good reason not to support those things, you want to
> support them.
>

Agreed -- Protocol or not, the point would be for a sequence_view to be as
much like the built in sequences as possible. And as far as my motivation
for all this goes -- getting that nifty slicing behavior is the main point!


> (The only open question is when you’re designing a sequence that you
> expect to be subclassed, but I don’t think we’re designing for subclassing
> here.)
>

nope.

I do see a possible objection here though. Making a small view of a large
sequence would keep that sequence alive, which could be a memory issue.
Which is one reason why slices don't do that by default.

> But I don’t think it’s a problem for offering an alternative that people
> have to explicitly ask for.

Probably not.

> Also, notice that this is true for all of the existing views, and none of
> them try to be un-featureful to avoid it.

But there is no full featured mapping-view that otherwise acts much like a
mapping. in theory, there *could* be -- if there was some nice way to
specify a subset of a mapping without copying the whole thing -- I can't
think of one at the moment. But having a view that really does act like the
original, I think, makes it more tempting to use more broadly.

But again, you will have had to ask for it.

And it could simply be a buyer beware issue. But the more featureful you
make a view, the more likely it is that they will get used and passed
around 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Andrew Barnert via Python-ideas
On May 10, 2020, at 11:09, Christopher Barker  wrote:

Is there any way you can fix the reply quoting on your mail client, or manually 
work around it? I keep reading paragraphs and saying “why is he saying the same 
thing I said” only to realize that you’re not, that’s just a quote from me that 
isn’t marked, up until the last line where it isn’t…

> On Sat, May 9, 2020 at 9:11 PM Andrew Barnert  wrote:
> 
> > That’s no more of a problem for a list slice view than for any of the 
> > existing views. The simplest way to implement a view is to keep a reference 
> > to the underlying object and delegate to it, which is effectively what the 
> > dict views do.
> 
> Fair enough. Though you still could get potentially surprising behavior if 
> the original sequence's length is changed.

I don’t think it’s surprising. When you go out of your way to ask for a dynamic 
view instead of the default snapshot copy, and then you change the list, you’d 
expect the view to change.

If you don’t keep views around, because you’re only using them for more 
efficient one-shot iteration, you might never think about that, but then you’ll 
never notice it to be surprised by it. The dynamic behavior of dict views 
presumably hasn’t ever surprised you in the 12 years it’s worked that way.
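
For anyone who has not looked at this closely, a tiny demonstration with
made-up data of how the existing dict views track their dict:

```python
d = {"a": 1, "b": 2}
keys = d.keys()            # a view, not a snapshot

d["c"] = 3
print(list(keys))          # ['a', 'b', 'c']

del d["a"]
print(len(keys), "a" in keys)   # 2 False
```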

> And you probably don't want to lock the "host" anyway -- that could be very 
> confusing if the view is kept all be somewhere far from the code trying to 
> change the sequence. 

Yes. I think memoryview’s locking behavior is a special case, not something 
we’d want to emulate here. I’m guessing many people just never use memoryview 
at all, but when you do, you’re generally thinking about raw buffers rather 
than abstract behavior. (It’s right there in the name…) And when you need 
something more featureful than an invisible hard lock on the host, it’s time 
for numpy. :)
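
The locking being described looks roughly like this (a sketch with made-up
data): while a memoryview is exported, the bytearray cannot be resized, though
it can still be written in place.

```python
buf = bytearray(b"abcdef")
mv = memoryview(buf)

locked = False
try:
    buf.extend(b"gh")          # resizing is refused while mv exists
except BufferError as exc:
    locked = True
    print("locked:", exc)

buf[0] = ord("A")              # in-place writes are still allowed
print(chr(mv[0]))              # A: the view sees the write

mv.release()                   # drop the export explicitly
buf.extend(b"gh")              # now resizing works again
print(len(buf))                # 8
```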

> I'm still a bit confused about what a dict.* view actually is

The docs explain it reasonably well. See 
https://docs.python.org/3/glossary.html#term-dictionary-view for the basic 
idea,  https://docs.python.org/3/library/stdtypes.html#dict-views for the 
details on the concrete types, and I think the relevant ABCs and data model 
entries are linked from there.

> -- for instance, a dict_keys object pretty much acts like a set, but it isn't 
> a subclass of set, and it has an isdisjoint() method, but not .union or any 
> of the other set methods. But it does have what at a glance looks like pretty 
> complete set of dunders

The point of collections.abc.Set, and ABCs in general, and the whole concept of 
protocols, is that the set protocol can be implemented by different concrete 
types—set, frozenset, dict_keys, third-party types like 
sortedcontainers.SortedSet or pyobjc.Foundation.NSSet, etc.—that are generally 
completely unrelated to each other, and implemented in different ways—a 
dict_keys is a link to the keys table in a dict somewhere, a set or frozenset 
has its own hash table, a SortedSet has a wide-B-tree-like structure, an NSSet 
is a proxy to an ObjC object, etc. If they all had to be subclasses of set, 
they’d be carrying around a set’s hash table but never using it; they’d have to 
be careful to override every method to make sure it never accidentally got used 
(and what would frozenset or dict_keys override add with?), etc.

And if you look at the ABC, union isn’t part of the protocol, but __or__ is, 
and so on.
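A quick check of those claims against the builtin types (a sketch; the dict contents are made up):

```python
from collections.abc import Set

d = {"a": 1, "b": 2}
keys = d.keys()

print(isinstance(keys, Set))      # True: dict_keys implements the Set protocol
print(isinstance(keys, set))      # False: it is not a subclass of the concrete set
print(hasattr(keys, "union"))     # False: union() is not part of the protocol
print(keys | {"c"})               # __or__ is, and here it returns a plain set
print(keys.isdisjoint({"x"}))     # isdisjoint() is one of the Set mixin methods
```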

> Anyway, a Sequence view is simpler, because it could probably simply be an 
> immutable sequence -- not much need for contemplating every bit of the API.

It’s really the same thing, it’s just the Sequence protocol rather than the Set 
protocol.

If anything, it’s _less_ simple, because for sequences you have to decide 
whether indexing should work with negative indices, extended slices, etc., 
which the protocol is silent about. But the answer there is pretty easy—unless 
there’s a good reason not to support those things, you want to support them. 
(The only open question is when you’re designing a sequence that you expect to 
be subclassed, but I don’t think we’re designing for subclassing here.)
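Supporting those things is mostly a matter of normalizing the slice against the current length, which `slice.indices` already does. A small sketch (the helper name `resolve` is hypothetical):

```python
def resolve(length, s):
    """Return the concrete positions that slice `s` selects in a sequence of `length` items."""
    return range(*s.indices(length))

print(list(resolve(10, slice(-3, None))))        # negative start: [7, 8, 9]
print(list(resolve(10, slice(None, None, -4))))  # extended slice: [9, 5, 1]
```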

> I do see a possible objection here though. Making a small view of a large 
> sequence would keep that sequence alive, which could be a memory issue. Which 
> is one reason why slices don't do that by default.

Yes. When you just want to iterate something once, non-lazily, you don’t care 
whether it’s a view or a snapshot, but when you want to keep it around, you do 
care, and you have to decide which one you want. So we certainly can’t change 
the default; that would be a huge but subtle change that would break all kinds 
of code.

But I don’t think it’s a problem for offering an alternative that people have 
to explicitly ask for.

Also, notice that this is true for all of the existing views, and none of them 
try to be un-featureful to avoid it.

> And it could simply be a buyer beware issue. But the more featureful you make 
> a view, the 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Alex Hall
On Sun, May 10, 2020 at 8:20 PM Andrew Barnert  wrote:

> On May 10, 2020, at 02:42, Alex Hall  wrote:
> >
> > - Handling negative indices for sequences (is there any reason we don't
> have that now?)
>
> Presumably partly just to keep it minimal and simple. Itertools is all
> about transforming iterables into other iterables in as generic a way as
> possible. None of the other functions do anything special if given a more
> fully-featured iterable.
>
> But also, negative indexing isn’t actually part of the Sequence protocol.
> (You don’t get negative indexes for free by inheriting Sequence as a mixin,
> nor is it ensured by testing isinstance with Sequence as an ABC.) It’s part
> of the extra stuff that list and the other builtin sequences happen to do.
> You didn’t suggest allowing negative islicing on set even though it could
> just as easily be implemented there, because you don’t expect negative
> indexing as part of the Set protocol (or the Sized Iterable protocol); you
> did expect it as part of the Sequence protocol, but Python’s model
> disagrees.
>

I understand that, but the same could be said about all forms of slicing.
It's not part of the sequence protocol, it's not provided by the ABC, it's
just a nice thing that lists do.

> Maybe practicality beats purity here, and islice should take negative
> indices on any Sequence, or even Sized, input, even though that makes it
> different from other itertools functions, and ignores the fact that it
> could be simulating negative indexing on some types where it’s meaningless.
> But how often have you wanted to call islice with a negative index? How
> horrible is the workaround you had to write instead? I suspect that it’s
> already rare enough of a problem that it’s not worth it, and that any form
> of this proposal would make it even rarer, but I could be wrong.
>

You're right, I don't really care about islice accepting negative indices
in isolation. But it's different in the context of my form of this
proposal, where a certain syntax delegates to islice (or something very
close to it) and we want that syntax to support negative indexing.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CMBW3XLEJUMTKDR3BMMDU777GOTKERXX/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Andrew Barnert via Python-ideas
On May 10, 2020, at 02:42, Alex Hall  wrote:
> 
> - Handling negative indices for sequences (is there any reason we don't have 
> that now?)

Presumably partly just to keep it minimal and simple. Itertools is all about 
transforming iterables into other iterables in as generic a way as possible. 
None of the other functions do anything special if given a more fully-featured 
iterable.

But also, negative indexing isn’t actually part of the Sequence protocol. (You 
don’t get negative indexes for free by inheriting Sequence as a mixin, nor is 
it ensured by testing isinstance with Sequence as an ABC.) It’s part of the 
extra stuff that list and the other builtin sequences happen to do. You didn’t 
suggest allowing negative islicing on set even though it could just as easily 
be implemented there, because you don’t expect negative indexing as part of the 
Set protocol (or the Sized Iterable protocol); you did expect it as part of the 
Sequence protocol, but Python’s model disagrees.
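To make that concrete, here is a sketch of a class that inherits the Sequence mixin and passes the isinstance check but still rejects negative indexes (`Squares` is a made-up example type):

```python
from collections.abc import Sequence

class Squares(Sequence):
    """Hypothetical sequence of the first n squares; handles only 0 <= i < n."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        return i * i

s = Squares(5)
print(isinstance(s, Sequence))   # True
print(list(reversed(s)))         # mixin methods work: [16, 9, 4, 1, 0]
try:
    s[-1]
except IndexError:
    print("negative index not supported")
```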

Maybe practicality beats purity here, and islice should take negative indices 
on any Sequence, or even Sized, input, even though that makes it different from 
other itertools functions, and ignores the fact that it could be simulating 
negative indexing on some types where it’s meaningless. But how often have you 
wanted to call islice with a negative index? How horrible is the workaround you 
had to write instead? I suspect that it’s already rare enough of a problem that 
it’s not worth it, and that any form of this proposal would make it even rarer, 
but I could be wrong.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZGHJJJP43VZI4ZG7PRTIH3GJGTXANJK6/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Christopher Barker
On Sat, May 9, 2020 at 9:11 PM Andrew Barnert  wrote:

> I don’t think it invalidates the basic idea at all, just that it suggests
> the design should be different.
>
> Originally, dict returned lists for keys, values, and items. In 2.2,
> iterator variants were added. In 3.0, the list and iterator variants were
> both replaced with view versions, which were enough of an improvement that
> they were backported to 2.x. Because a view does cover almost all of the
> uses of both a sequence copy and an iterator. And I think the same is true
> here.

Probably yes.

>> I'm inclined to think that it would be a bad idea to have it return a full
>> sequence view object, and not sure it should do anything other than be
>> iterable.
>
> Why? What’s the downside to being able to do more with them for the same
> performance cost and only a little more up-front design work?

I'm not worried about the design work -- writing a PEP is a LOT more work
than writing the code for this kind of thing :-) And I'll bet folks smarter
than me will want to help out with the code part, if this goes anywhere.

>>> And this is important here, because a view is what you ideally _want_.
>>> The reason range, key view, etc. are views rather than iterators isn’t that
>>> it’s easier to implement or explain or anything, it’s that it’s a little
>>> harder to implement and explain but so much more useful that it’s worth it.
>>> It’s something people take advantage of all the time in real code.
>>
>> Maybe -- but "all the time?" I'd venture to say that absolutely the most
>> common thing done with, e.g. dict.keys() is to iterate over it.
>
> Really? When I just want to iterate over a dict’s keys, I iterate the dict
> itself.

True -- I was thinking more of ALL the various "iterables that were
concretized lists in py2" -- dict_keys is actually unique in that dict
itself provides an iterator of the keys. I've seen a lot of code like so:

    for k in dict.keys():
        ...

and

    if k in dict.keys():
        ...

both of which are completely unnecessary. So actually, I'd say that
dict.keys() gets used either less often, or when it's not really needed.
But you're right: given that, when dict_keys is used as it should be, it
would be for other reasons. I'll bet it's kind of rare though.

And dict_items and dict_values are probably most often used as iterables.

> That’s no more of a problem for a list slice view than for any of the
> existing views. The simplest way to implement a view is to keep a reference
> to the underlying object and delegate to it, which is effectively what the
> dict views do.

Fair enough. Though you still could get potentially surprising behavior if
the original sequence's length is changed.

> (You _could_ instead refuse to allow expanding a sequence when there’s a
> live view, as bytearray does with memoryview, but I don’t think that’s
> necessary here. It’s only needed there as a consequence of the fact that the
> buffer protocol is provided in C rather than in Python. For a slice view,
> it would just make things more complicated and less functional for no good
> reason.)

But it would also be, well, weird -- you create a view with a slice of a
given length, and then the underlying sequence is changed, and then your
view object is, well, totally different, it may not even exist (well, be
length-zero, I suppose).

And you probably don't want to lock the "host" anyway -- that could be very
confusing if the view is kept alive somewhere far from the code trying to
change the sequence.

This is all a bit less complicated for the dict views, because none of
them are providing a mapping interface anyway.

The other question is -- should a view of a mutable sequence be mutable
(and mutate the underlying sequence)? That's how numpy arrays work, but it
does require a certain fitness to keep track of.

>>> But just replacing islice is a much simpler task (mainly because the
>>> input has to be a sequence and the output is always a sequence, so the only
>>> complexity that arises is whether you want to allow mutable views into
>>> mutable sequences), and it may well be useful on its own.
>>
>> Agreed. And while yes, dict_keys and friends are not JUST iterators, they
>> also aren't very functional views, either. They are not sequences,
>
> That’s not true. They are very functional -- as functional as reasonably
> makes sense. The only reason they’re not Sequences is that they’re views on
> dicts, so indexing makes little sense, but set operations do -- and they are
> in fact Sets. (Except for values.)
>
>> certainly not mutable sequences.


> Well, yes, but mutating a dict through its views wouldn’t make sense in the
> first place:
>
>     >>> d = {1: 2}
>     >>> k = d.keys()
>     >>> k |= 3

not for keys, but it would at least be possible for dict_items, and even
potentially for dict_values, though yes, that would be really confusing.

> So I think it might be better to leave mutation out of the original
> version anyway unless someone has a need for it (at which point we can use
> the examples to think through the best answers to the design issues).

Yeah, 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-10 Thread Alex Hall
On Sun, May 10, 2020 at 5:00 AM Christopher Barker 
wrote:

> On Sat, May 9, 2020 at 1:58 PM Alex Hall  wrote:
>
>>  I think this is a good idea. For sequences I'm not sure how big the
>> benefit is - I get that it's more efficient, but I rarely care that much,
>> because most lists are small. Why not extend the proposal to all iterators,
>> or at least common ones like generators?
>>
>
> Because the slice syntax, so far at least, only applies to Sequences. And
> in general, you can't use the full slice syntax on iterators anyway (they
> don't have a length).
>

Is that a major problem? Just use the syntax the way you would use
itertools.islice.

Incidentally, I imagine we could allow itertools.islice(iterable, 0, -n) by
iterating ahead by n items, keeping them in memory, and stopping when the
peeking ahead stops. I can understand though that this comes with some
caveats.
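A rough sketch of that look-ahead idea, assuming a hypothetical helper `islice_drop_last` rather than any change to itertools itself:

```python
from collections import deque
from itertools import islice

def islice_drop_last(iterable, start, n):
    """Yield items from position `start` up to, but not including, the last `n` items.

    Hypothetical helper: roughly what islice(iterable, start, -n) could mean.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    it = islice(iterable, start, None)
    buffer = deque(islice(it, n))   # look ahead by n items
    for item in it:
        buffer.append(item)
        yield buffer.popleft()      # safe to emit: n items still follow it

print(list(islice_drop_last(iter(range(10)), 0, 3)))   # [0, 1, 2, 3, 4, 5, 6]
```

The caveats show up directly: n items are held in memory, and the underlying iterator is always consumed to the end, n items further than what gets yielded.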


> and well, iterators are already iterators ... so there isn't an "extend
> the proposal" here at all.
>

I don't know what you're saying here.


> But without thinking about it much, I'm not sure adding slice syntax to
> iterators in general makes sense -- slicing is quite connected to indexing,
> which iterators don't support.
>

Again, it would make exactly as much sense as itertools.islice.


> > That would allow avoiding itertools when I have no other choice.
>
> reading the thread on adding "strict" to zip, I'd say "avoiding itertools"
> is not really a goal of most folks :-)
>

Most of the points you made in your PEP still apply. itertools.islice is
verbose and not particularly readable. I'm not suggesting doing this just
to make things a little more convenient.


>> Perhaps this could come with some new syntax? My first thought was
>> `iterator(1:2)`, the idea being that changing the brackets would give it
>> a lazy iterator feel the same way that changing the brackets on a list
>> comprehension turns it into a generator. But it probably looks too much
>> like a function call.
>>
>
> doesn't just look like one -- it would clash. Remember that "iterable" and
> "iterator" is a protocol -- anything can support it. That would make it
> impossible to have a callable be an iterator
>

`iterator(1:2)` isn't a function call, it isn't valid syntax. The colon
would distinguish an islice from a call. But again, while it's technically
unambiguous, it can still be confusing, and you've proven that point.


>> So maybe we can play with double brackets instead:
>>
>
> What would this new syntax do, regardless of what it was? I'm not sure I
> follow. My idea is about creating a view iterable on sequences, I'm not
> sure what view iterable on an iterable would be?
>

`iterator(1:2)` would compile to roughly `itertools.islice(iterator, 1,
2)`, which would handle the rest at runtime:

- Failing if `iterator` isn't actually an iterator.
- Failing if a negative index is used on something without a length.
- Handling negative indices for sequences (is there any reason we don't
have that now?)
- Possibly deferring to a dunder like `__islice__` if one is defined so
that some classes (e.g. lists) can return a clever view or something if
they want.
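The runtime steps above can be sketched as an ordinary function. Everything here is hypothetical: `lazy_slice` stands in for whatever the syntax would compile to, and `__islice__` is the assumed dunder from the last bullet, not an existing protocol. Unlike the first bullet, this sketch accepts any iterable, as islice does today.

```python
from itertools import islice

def lazy_slice(obj, start=None, stop=None):
    """Hypothetical runtime helper for the proposed `obj(start:stop)` syntax."""
    hook = getattr(type(obj), "__islice__", None)   # assumed dunder, not a real protocol
    if hook is not None:
        return hook(obj, start, stop)
    negative = any(i is not None and i < 0 for i in (start, stop))
    if negative:
        try:
            length = len(obj)
        except TypeError:
            raise ValueError("negative index needs an object with a length") from None
        # Normalize negative positions against the current length.
        start, stop, _ = slice(start, stop).indices(length)
    return islice(obj, start, stop)

print(list(lazy_slice([10, 20, 30, 40, 50], 1, -1)))   # [20, 30, 40]
print(list(lazy_slice(iter(range(100)), 2, 5)))        # [2, 3, 4]
```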

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/77OORLTSODLT73BPP3QENHLYRS3BIPIJ/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Andrew Barnert via Python-ideas
On May 9, 2020, at 19:43, Christopher Barker  wrote:
> 
> On Sat, May 9, 2020 at 1:03 PM Andrew Barnert  wrote:
> > https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst
> 
> I haven’t read the whole thing yet, but one thing immediately jumped out at 
> me:
> 
> > and methods on containers, such as dict.keys return iterators in Python 3, 
> 
> No they don’t. They return views—objects that are collections in their own 
> right (in particular, they’re not one-shot; they can be iterated over and 
> over) but just delegate to another object rather than storing the data.
> 
> Thanks -- that's the kind of thing that led me to say that this is probably 
> not ready for a PEP.
> 
> but I don't think that invalidates the idea at all -- there is debate about 
> what an "islice" should return, but an iterable view would be a good option.

I don’t think it invalidates the basic idea at all, just that it suggests the 
design should be different.

Originally, dict returned lists for keys, values, and items. In 2.2, iterator 
variants were added. In 3.0, the list and iterator variants were both replaced 
with view versions, which were enough of an improvement that they were 
backported to 2.x. Because a view does cover almost all of the uses of both a 
sequence copy and an iterator. And I think the same is true here.

> I'm inclined to think that it would be a bad idea to have it return a full 
> sequence view object, and not sure it should do anything other than be 
> iterable.

Why? What’s the downside to being able to do more with them for the same 
performance cost and only a little more up-front design work?

> > And this is important here, because a view is what you ideally _want_. The 
> > reason range, key view, etc. are views rather than iterators isn’t that 
> > it’s easier to implement or explain or anything, it’s that it’s a little 
> > harder to implement and explain but so much more useful that it’s worth it. 
> > It’s something people take advantage of all the time in real code.
> 
> Maybe -- but "all the time?" I'd venture to say that absolutely the most 
> common thing done with, e.g. dict.keys() is to iterate over it.

Really? When I just want to iterate over a dict’s keys, I iterate the dict 
itself. 

> > For prior art specifically on slicing as a view, rather than just views in 
> > general, see memoryview (which only works on buffers, not all sequences) 
> > and NumPy (which is weird in many ways, but people rely on slicing giving 
> > you a storage-sharing view)
> 
> I am a long-time numpy user, and yes, I very much take advantage of the 
> memory sharing view.
> 
> But I do not think that that would be a good idea for the standard library. 
> numpy slices return a full-fledged numpy array, which shares a data view with 
> its "host" -- this is really helpful for performance reasons -- moving 
> large blocks of data around is expensive, but it's also pretty confusing. And 
> it would be a lot more problematic with, e.g. lists, as the underlying buffer 
> can be reallocated -- numpy arrays are mutable, but not re-sizable, once 
> you've made one its data buffer does not change.

That’s no more of a problem for a list slice view than for any of the existing 
views. The simplest way to implement a view is to keep a reference to the 
underlying object and delegate to it, which is effectively what the dict views 
do.

(Well, did from 2.x to 3.5. The dict improvements in 3.6 opened up an 
optimization opportunity, because in the split layout a dict is effectively a 
wrapper around a keys view and a separate table, so the keys view can refer 
directly to that thing that already exists. But that isn’t relevant here.)

(You _could_ instead refuse to allow expanding a sequence when there’s a live 
view, as bytearray does with memoryview, but I don’t think that’s necessary 
here. It’s only needed there as a consequence of the fact that the buffer protocol 
is provided in C rather than in Python. For a slice view, it would just make 
things more complicated and less functional for no good reason.)
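A minimal sketch of the "keep a reference and delegate" approach for a list slice view. `SliceView` is a hypothetical name for illustration, not an existing stdlib type, and it is deliberately read-only:

```python
from collections.abc import Sequence

class SliceView(Sequence):
    """Hypothetical read-only view of seq[start:stop:step]."""

    def __init__(self, seq, s):
        self._seq = seq      # keep a reference, never copy
        self._slice = s

    def _indices(self):
        # Recomputed on every call, so the view tracks changes in the host's length.
        return range(*self._slice.indices(len(self._seq)))

    def __len__(self):
        return len(self._indices())

    def __getitem__(self, i):
        if isinstance(i, slice):
            raise TypeError("slicing a view is left out of this sketch")
        return self._seq[self._indices()[i]]   # range handles negative i and bounds

data = list(range(10))
view = SliceView(data, slice(2, None, 3))
print(list(view), view[-1])   # [2, 5, 8] 8
data.extend([10, 11])         # grow the host after the view exists
print(list(view))             # [2, 5, 8, 11]
```

No locking of the host is involved; the view simply reflects whatever the list looks like when it is next used.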

> > But just replacing islice is a much simpler task (mainly because the input 
> > has to be a sequence and the output is always a sequence, so the only 
> > complexity that arises is whether you want to allow mutable views into 
> > mutable sequences), and it may well be useful on its own.
> 
> Agreed. And while yes, dict_keys and friends are not JUST iterators, they 
> also aren't very functional views, either. They are not sequences, 

That’s not true. They are very functional—as functional as reasonably makes 
sense. The only reason they’re not Sequences is that they’re views on dicts, so 
indexing makes little sense, but set operations do—and they are in fact Sets. 
(Except for values.)

> certainly not mutable sequences.

Well, yes, but mutating a dict through its views wouldn’t make sense in the 
first place:

>>> d = {1: 2}
>>> k = d.keys()
>>> k |= 3

You’ve told it to 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Christopher Barker
On Sat, May 9, 2020 at 1:58 PM Alex Hall  wrote:

>  I think this is a good idea. For sequences I'm not sure how big the
> benefit is - I get that it's more efficient, but I rarely care that much,
> because most lists are small. Why not extend the proposal to all iterators,
> or at least common ones like generators?
>

Because the slice syntax, so far at least, only applies to Sequences. And
in general, you can't use the full slice syntax on iterators anyway (they
don't have a length).

and well, iterators are already iterators ... so there isn't an "extend the
proposal" here at all.

But without thinking about it much, I'm not sure adding slice syntax to
iterators in general makes sense -- slicing is quite connected to indexing,
which iterators don't support.

> That would allow avoiding itertools when I have no other choice.

reading the thread on adding "strict" to zip, I'd say "avoiding itertools"
is not really a goal of most folks :-)

> You write "This PEP proposes that the sequence protocol be extended". What
> does that mean exactly? I assume you don't want to magically add an
> `islice` property to every class that has `__len__` and `__getitem__`. Will
> you just add it to `collections.abc.Sequence`, the builtins, and the stdlib?
>

Details to be worked out. Python has evolved over the years from
protocols to ABCs, and it's not fully clear yet. Even then, probably yes:
"add it to `collections.abc.Sequence`, the builtins, and the stdlib".

I'm not sure every class that has __len__ and __getitem__ could magically
grow a new method, and I sure wouldn't want them to. In fact, in theory,
you'd want every class that supports slicing to grow this functionality,
but I don't know there is any way to know whether a class, in general,
supports slicing.

> Perhaps this could come with some new syntax? My first thought was
> `iterator(1:2)`, the idea being that changing the brackets would give it
> a lazy iterator feel the same way that changing the brackets on a list
> comprehension turns it into a generator. But it probably looks too much
> like a function call.
>

doesn't just look like one -- it would clash. Remember that "iterable" and
"iterator" is a protocol -- anything can support it. That would make it
impossible to have a callable be an iterator.


> So maybe we can play with double brackets instead:
>

What would this new syntax do, regardless of what it was? I'm not sure I
follow. My idea is about creating a view iterable on sequences, I'm not
sure what view iterable on an iterable would be?

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6CEBGVDDZIYCIWA2QCBH3G3N3IUFIEBP/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Christopher Barker
On Sat, May 9, 2020 at 1:03 PM Andrew Barnert  wrote:
> https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst


> I haven’t read the whole thing yet, but one thing immediately jumped out at
> me:
>
>> and methods on containers, such as dict.keys return iterators in Python 3,
>
> No they don’t. They return views -- objects that are collections in their own
> right (in particular, they’re not one-shot; they can be iterated over and
> over) but just delegate to another object rather than storing the data.

Thanks -- that's the kind of thing that led me to say that this is
probably not ready for a PEP.

but I don't think that invalidates the idea at all -- there is debate about
what an "islice" should return, but an iterable view would be a good option.
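To make the view-versus-iterator distinction concrete, here is a quick check using only builtins (a sketch; the dict contents are made up):

```python
d = {"a": 1, "b": 2}

keys = d.keys()      # a view: a collection in its own right
it = iter(d)         # an iterator: one-shot

print(list(keys), list(keys))   # both passes see every key
print(list(it), list(it))       # the second pass is empty

first, second = iter(keys), iter(keys)
print(first is second)          # False: each iter() call returns a fresh iterator
next(first)
print(next(second))             # 'a': second is unaffected by advancing first
```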

I'm inclined to think that it would be a bad idea to have it return a full
sequence view object, and not sure it should do anything other than be
iterable.

> People also commonly say that range is an iterator instead of a function
> that returns a list in Python 3,

Sure, but I don't say that :-) -- a range object is actually a pretty full
immutable sequence -- which is pretty handy. But when people say that, they
are often being careless, rather than wrong.

At least I'd like to claim that about my saying dict.keys() returns an
iterator ;-) -- the point of that part of the document is that many things
in Py3 do NOT return fully realized copies, like py2 did.
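On the point about range being a fairly complete immutable sequence, a quick check with builtins only:

```python
from collections.abc import Sequence

r = range(0, 20, 2)

print(isinstance(r, Sequence))   # True
print(len(r), r[-1])             # 10 18
print(r[2:5])                    # range(4, 10, 2): slicing returns another range
print(6 in r, 7 in r)            # True False
print(list(r) == list(r))        # True: iterating does not consume it
```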

> And this is important here, because a view is what you ideally _want_.
> The reason range, key view, etc. are views rather than iterators isn’t that
> it’s easier to implement or explain or anything, it’s that it’s a little
> harder to implement and explain but so much more useful that it’s worth it.
> It’s something people take advantage of all the time in real code.

Maybe -- but "all the time?" I'd venture to say that absolutely the most
common thing done with, e.g. dict.keys() is to iterate over it. But yes,
having it be a view with other features is handy.

> And this is pretty easy to implement. I have a quick and dirty version at
> https://github.com/abarnert/slices, but I think I may have a better version
> somewhere with more unit tests.

Thanks -- I'll take a look.

> For prior art specifically on slicing as a view, rather than just views
> in general, see memoryview (which only works on buffers, not all sequences)
> and NumPy (which is weird in many ways, but people rely on slicing giving
> you a storage-sharing view)

I am a long-time numpy user, and yes, I very much take advantage of the
memory sharing view.

But I do not think that that would be a good idea for the standard library.
numpy slices return a full-fledged numpy array, which shares a data view
with its "host" -- this is really helpful for performance reasons --
moving large blocks of data around is expensive, but it's also pretty
confusing. And it would be a lot more problematic with, e.g. lists, as the
underlying buffer can be reallocated -- numpy arrays are mutable, but not
re-sizable, once you've made one its data buffer does not change.

> The reason I never proposed this for the stdlib (even though that would
> allow adding methods directly onto the builtin container types, as your
> proposal does) is that I always want to build a _complete_ view library,
> with replacements for map, zip, enumerate, all of itertools, etc., and with
> enough cleverness to present exactly as much functionality as is possible.

And I have my doubts about it anyway :-)

> But just replacing islice is a much simpler task (mainly because the
> input has to be a sequence and the output is always a sequence, so the only
> complexity that arises is whether you want to allow mutable views into
> mutable sequences), and it may well be useful on its own.

Agreed. And while yes, dict_keys and friends are not JUST iterators, they
also aren't very functional views, either. They are not sequences,
certainly not mutable sequences. And:

> (in particular, they’re not one-shot; they can be iterated over and over)

yes, but they are only a single iterator -- if you call iter() on one you
always get the same one back, and its state is preserved.

So yes, you can iterate over more than once, but iter() only resets after
it's been exhausted before.

In short -- not having thought about it deeply at all, but I'm thinking
that making a SliceIterator very similar to dict_keys and friends would
make a lot of sense.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at 

[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Alex Hall
On Sat, May 9, 2020 at 9:41 PM Christopher Barker 
wrote:

> Funny you should bring this up.
>
> I've been meaning, for literally years, to propose not quite this, but
> adding a "slice iterator" to the sequence protocol.
>
> (though note that one alternative is adding slice syntax to
> itertools.islice)
>
> I even got  so far as to write a draft PEP and prototype.
>
> NOTE: I'm not saying this is ready for a PEP, but it was helpful to use
> the format to collect my thoughts.
>
> https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst
>
> And the prototype implementation:
>
> https://github.com/PythonCHB/islice-pep/blob/master/islice.py
>

 I think this is a good idea. For sequences I'm not sure how big the
benefit is - I get that it's more efficient, but I rarely care that much,
because most lists are small. Why not extend the proposal to all iterators,
or at least common ones like generators? That would allow avoiding
itertools when I have no other choice.

You write "This PEP proposes that the sequence protocol be extended". What
does that mean exactly? I assume you don't want to magically add an
`islice` property to every class that has `__len__` and `__getitem__`. Will
you just add it to `collections.abc.Sequence`, the builtins, and the stdlib?

Perhaps this could come with some new syntax? My first thought was
`iterator(1:2)`, the idea being that changing the brackets would give it
a lazy iterator feel the same way that changing the brackets on a list
comprehension turns it into a generator. But it probably looks too much
like a function call. So maybe we can play with double brackets instead:

```
import itertools

for (l1, r1), (l2, r2) in itertools.product('() {} []'.split(), repeat=2):
    print(f'sequence{l1}{l2}1:2{r2}{r1}')

sequence((1:2))
sequence({1:2})
sequence([1:2])
sequence{(1:2)}
sequence{{1:2}}
sequence{[1:2]}
sequence[(1:2)]
sequence[{1:2}]
sequence[[1:2]]
```

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PV37ISIBDZ63YF44FIBWCFC4XESOA6PL/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Ram Rachum
+1

I like this! I never considered this idea. It's a good combination of
efficiency and elegance.

On Sat, May 9, 2020 at 10:41 PM Christopher Barker 
wrote:

> Funny you should bring this up.
>
> I've been meaning, for literally years, to propose not quite this, but
> adding a "slice iterator" to the sequence protocol.
>
> (though note that one alternative is adding slice syntax to
> itertools.islice)
>
> I even got so far as to write a draft PEP and prototype.
>
> NOTE: I'm not saying this is ready for a PEP, but it was helpful to use
> the format to collect my thoughts.
>
> https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst
>
> And the prototype implementation:
>
> https://github.com/PythonCHB/islice-pep/blob/master/islice.py
>
> I never got around to posting here, as I wasn't quite finished, and was
> waiting 'till I had time to deal with the discussion.
>
> But since it was brought up -- here we go!
>
> If folks have an interest in this, I'd love to get feedback.
>
> -CHB
>
>
>
>
> On Sat, May 9, 2020 at 3:51 AM Chris Angelico wrote:
>
>> On Sat, May 9, 2020 at 8:00 PM Alex Hall wrote:
>> >
>> > On Sat, May 9, 2020 at 11:15 AM Ram Rachum wrote:
>> >>
>> >> Here's an idea I've had. How about instead of this:
>> >>
>> >> itertools.islice(iterable, 7, 20)
>> >>
>> >> We'll just have:
>> >>
>> >> itertools.islice(iterable)[7:20]
>> >>
>> >>
>> >> Advantages:
>> >> 1. More familiar slicing syntax.
>> >> 2. No need to awkwardly use None when you're interested in just
>> >> specifying the end of the slice without specifying the start, i.e.
>> >> islice(x)[:10] instead of islice(x, None, 10)
>> >> 3. Doesn't require breaking backwards compatibility.
>> >>
>> >>
>> >> What do you think?
>> >
>> >
>> > Looking at this, my train of thought was:
>> >
>> > While we're at it, why not allow slicing generators?
>>
>> Bear in mind that islice takes any iterable, not just a generator. I
>> don't think there's a lot of benefit in adding a bunch of methods to
>> generator objects - aside from iteration, the only functionality they
>> have is coroutine-based. There's no point implementing half of
>> itertools on generators, while still needing to keep itertools itself
>> for all other iterables.
>>
>> > And if we do that, what about regular indexing?
>> > But then, what if I do `gen[3]` followed by `gen[1]`? Is it an error?
>> > Does the generator have to store its past values? Or is `gen[1]` the second
>> > item after `gen[3]`? Or wherever the generator last stopped?
>> >
>>
>> It makes no sense to subscript a generator like that.
>>
>> > Well that's probably why I can't index or slice generators - so that
>> > code doesn't accidentally make a mess trying to treat a transient iterator
>> > the way it does a concrete sequence. A generator says "you can only iterate
>> > over me, don't try anything else".
>> >
>> > Which leads us back to your proposal. `islice(iterable)[7:20]` looks
>> > nice, but it also allows `foo(islice(iterable))` where `foo` can do its own
>> > indexing and that's leading to dangerous territory.
>> >
>>
>> If foo can do its own indexing, it needs to either specify that it
>> takes a Sequence, not just an Iterable, or alternatively it needs to
>> coalesce its argument into a list immediately. If it's documented as
>> taking any iterable, it has to just iterate over it, without
>> subscripting.
>>
>> ChrisA
>>
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6SS7FL3XPPEO43RPRYVJPWNKC5J7LMJV/


[Python-ideas] Re: Adding slice Iterator to Sequences (was: islice with actual slices)

2020-05-09 Thread Andrew Barnert via Python-ideas
On May 9, 2020, at 12:38, Christopher Barker wrote:
> 
> https://github.com/PythonCHB/islice-pep/blob/master/pep-xxx-islice.rst

I haven’t read the whole thing yet, but one thing immediately jumped out at me:

> and methods on containers, such as dict.keys return iterators in Python 3, 

No they don’t. They return views—objects that are collections in their own 
right (in particular, they’re not one-shot; they can be iterated over and over) 
but just delegate to another object rather than storing the data.

People also commonly say that range is an iterator instead of a function that 
returns a list in Python 3, and that’s wrong for the same reason.
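
The difference is easy to check interactively. The snippet below is only an
illustration using builtins and collections.abc; the variable names are
arbitrary.

```python
from collections.abc import Iterator, Sequence

d = {"a": 1, "b": 2}
keys = d.keys()

# A view can be iterated over as many times as you like.
first_pass = list(keys)
second_pass = list(keys)
print(first_pass == second_pass)  # True

# A view delegates to the dict, so it reflects later changes.
d["c"] = 3
print(list(keys))  # ['a', 'b', 'c']

# An iterator, by contrast, is one-shot.
it = iter(d)
drained = list(it)
leftover = list(it)
print(leftover)  # []

# Neither the key view nor range is an iterator; range is a Sequence.
r = range(10)
print(isinstance(keys, Iterator))  # False
print(isinstance(r, Iterator))     # False
print(isinstance(r, Sequence))     # True
print(r[2:5])                      # range(2, 5)
```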

And this is important here, because a view is what you ideally _want_. The 
reason range, key view, etc. are views rather than iterators isn’t that it’s 
easier to implement or explain or anything, it’s that it’s a little harder to 
implement and explain but so much more useful that it’s worth it. It’s 
something people take advantage of all the time in real code.

And this is pretty easy to implement. I have a quick and dirty version at 
https://github.com/abarnert/slices, but I think I may have a better version 
somewhere with more unit tests.

For prior art specifically on slicing as a view, rather than just views in 
general, see memoryview (which only works on buffers, not all sequences) and 
NumPy (which is weird in many ways, but people rely on slicing giving you a 
storage-sharing view).

The reason I never proposed this for the stdlib (even though that would allow 
adding methods directly onto the builtin container types, as your proposal 
does) is that I always want to build a _complete_ view library, with 
replacements for map, zip, enumerate, all of itertools, etc., and with enough 
cleverness to present exactly as much functionality as is possible. But just 
replacing islice is a much simpler task (mainly because the input has to be a 
sequence and the output is always a sequence, so the only complexity that 
arises is whether you want to allow mutable views into mutable sequences), and 
it may well be useful on its own.
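
As a rough sketch of the islice-only case (this is not the code in the
repository linked above; `SliceView` and `_from_range` are names made up
here, and the view is deliberately read-only to sidestep the mutability
question):

```python
from collections.abc import Sequence


class SliceView(Sequence):
    """Hypothetical read-only view of seq[start:stop:step]; stores no items."""

    def __init__(self, seq, start=None, stop=None, step=None):
        self._seq = seq
        # Resolve the slice once, against the length at creation time.
        self._range = range(*slice(start, stop, step).indices(len(seq)))

    @classmethod
    def _from_range(cls, seq, rng):
        # Build a view directly from an already-resolved range of indices.
        view = cls.__new__(cls)
        view._seq = seq
        view._range = rng
        return view

    def __len__(self):
        return len(self._range)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing a view gives another view; range does the arithmetic.
            return SliceView._from_range(self._seq, self._range[index])
        # range raises IndexError for out-of-bounds positions.
        return self._seq[self._range[index]]


data = list(range(10))
view = SliceView(data, 2, 8)
print(list(view))        # [2, 3, 4, 5, 6, 7]
print(list(view))        # same again: a view is not one-shot
data[3] = 99
print(view[1])           # 99, because the view shares storage with data
print(list(view[::2]))   # [2, 4, 6]
```

Inheriting from collections.abc.Sequence supplies iteration, `in`,
`reversed`, `index` and `count` for free. One thing the sketch does not
handle is the underlying sequence changing length after the view is
created, since the indices are resolved only once.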

___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YQKKS4RADWU3QOFWFUU6PHS3ZU523T7P/