[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 15:48, Brendan Barnwell  wrote:
>
> On 2022-06-20 22:26, Chris Angelico wrote:
> > Okay, here's a compromise.
> >
> > Go and write a full and detailed specification of the Python-visible
> > semantics of deferred evaluation objects, including how they would be
> > used to implement late-bound argument defaults.
> >
> > Go and actually do some real work on your pet feature, instead of
> > using the vapourware to try to shut down the one I've been working on.
> >
> > Go and actually do something useful instead of just arguing that "it
> > must be possible".
>
> I'm not the person you're replying to, but just a reminder here: there
> is one alternative proposal that already has a fully functioning
> implementation, namely the current behavior.  Your arguments against the
> deferred-evaluation proposal seem to constantly be reiterating that
> there is no concrete deferred-evaluation proposal.  You are right.  But
> your arguments also seem to be insinuating that if there is no such
> proposal, then opposition to the PEP is somehow misguided, and that is
> incorrect.  There doesn't need to be any concrete alternative proposal
> other than "leave everything as it is and wait until we think of
> something better".
>
> It is perfectly valid to oppose your PEP even on the basis that maybe a
> deferred-evaluation proposal has a remote possibility of being better in
> the future --- because it is perfectly valid to oppose your PEP even if
> such a proposal has NO possibility of being better in the future.  There
> is no urgency or need for the behavior described in your PEP.  I am fine
> with the current behavior of Python in this regard.  It is not necessary
> to provide any alternative proposal, concrete or handwavy, to argue that
> the PEP is a bad idea.  I believe the PEP is a bad idea because the
> current behavior of Python is actually better than what it would be if
> the PEP were adopted.  I believe it is better to wait until we think of
> a better idea than to implement this PEP, and, if we never think of a
> better idea, then never change the existing argument-default behavior of
> Python.

I have laid out, multiple times, how a deferred evaluation feature is
completely distinct from late-bound argument defaults. So have others.
Steven continues to assert that, just because it MIGHT be possible to
use them in the implementation, we should stop working on this and
start working on that. He would, of course, be very welcome to work on
deferred evaluation himself, but he chooses to hide behind his own
ignorance of C to avoid doing any such work, and then still argues
that we should stop working on this because, in his opinion solely, it
would be more useful to have deferred evaluation.

And then he calls me a liar for saying in the PEP the same thing that
I've been saying here, yet he won't even write up a full specification
for deferred evaluation.

You are welcome to dislike the PEP on the basis that the existing
language is better than it would be with this feature. I personally
disagree, but that's what opinions are. But to object on the mere
basis that something MIGHT, despite demonstrated evidence, be better?
That is unfair and unhelpful.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7SIYYEJFFQ2MW2RKFGUI3HIEKGRJ6WGF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Brendan Barnwell

On 2022-06-20 22:26, Chris Angelico wrote:

Okay, here's a compromise.

Go and write a full and detailed specification of the Python-visible
semantics of deferred evaluation objects, including how they would be
used to implement late-bound argument defaults.

Go and actually do some real work on your pet feature, instead of
using the vapourware to try to shut down the one I've been working on.

Go and actually do something useful instead of just arguing that "it
must be possible".


	I'm not the person you're replying to, but just a reminder here: there 
is one alternative proposal that already has a fully functioning 
implementation, namely the current behavior.  Your arguments against the 
deferred-evaluation proposal seem to constantly be reiterating that 
there is no concrete deferred-evaluation proposal.  You are right.  But 
your arguments also seem to be insinuating that if there is no such 
proposal, then opposition to the PEP is somehow misguided, and that is 
incorrect.  There doesn't need to be any concrete alternative proposal 
other than "leave everything as it is and wait until we think of 
something better".


	It is perfectly valid to oppose your PEP even on the basis that maybe a 
deferred-evaluation proposal has a remote possibility of being better in 
the future --- because it is perfectly valid to oppose your PEP even if 
such a proposal has NO possibility of being better in the future.  There 
is no urgency or need for the behavior described in your PEP.  I am fine 
with the current behavior of Python in this regard.  It is not necessary 
to provide any alternative proposal, concrete or handwavy, to argue that 
the PEP is a bad idea.  I believe the PEP is a bad idea because the 
current behavior of Python is actually better than what it would be if 
the PEP were adopted.  I believe it is better to wait until we think of 
a better idea than to implement this PEP, and, if we never think of a 
better idea, then never change the existing argument-default behavior of 
Python.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Piotr Duda
On Tue, 21 Jun 2022 at 05:20, Steven D'Aprano wrote:

> The point is, Rob thought (and possibly still does, for all I know) that
> lazy evaluation is completely orthogonal to late-bound defaults. The PEP
> makes that claim too, even though it is not correct. With a couple of
> tweaks that we have to do anyway, and perhaps a change of syntax (and
> maybe not even that!) we can get late-bound defaults *almost* for free
> if we had lazy evaluation.

That depends on the lazy evaluation spec. If lazy expressions ever
become a thing in Python, they might be defined with syntax like
`lazy <expr>`, a rough equivalent of `LazyObject(lambda: <expr>)` that
evaluates the lambda at most once, plus some interpreter tweaks to make
LazyObject transparent to Python code.

So for this code (with `??` replaced by each combination of early/late
binding and non-lazy/lazy):

```
x = []

def f(x, y, z ?? len(x)):
  x.append(y)
  print(z, end = ' ')

x.append(0)
f([1, 2, 3], 4)
x.append(0)
f([1, 2, 3, 4], 4)
```

I expect that:
for `??` = `=` I get `0 0 `
for `??` = `=>` I get `3 4 `
for `??` = `= lazy` I get `1 1 `
for `??` = `=> lazy` I get `4 5 `

That would be completely orthogonal.
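Since neither the `=>` nor the `lazy` syntax exists, the semantics above can only be sketched in current Python. Here is a rough simulation of the `= lazy` row (early binding of a lazy object), using a hand-rolled `LazyObject` as described above; the class name comes from the message, everything else is an assumption about how such an object might behave:

```python
class LazyObject:
    """Evaluates a zero-argument callable at most once, then caches it."""
    _UNSET = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = self._UNSET

    def get(self):
        if self._value is self._UNSET:
            self._value = self._thunk()
        return self._value

x = []

# `= lazy`: the lazy object is built at def time and closes over global x
_lazy_default = LazyObject(lambda: len(x))

def f(x, y, z=_lazy_default):
    x.append(y)
    print(z.get(), end=' ')

x.append(0)
f([1, 2, 3], 4)     # global x has 1 element -> prints 1
x.append(0)
f([1, 2, 3, 4], 4)  # value already cached -> prints 1 again
```

The lambda looks up the module-level `x` (the parameter `x` only shadows it inside the body), and the at-most-once evaluation means the second call reuses the cached value, giving the `1 1 ` output predicted above.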


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 13:17, Steven D'Aprano  wrote:
>
> On Tue, Jun 21, 2022 at 12:13:08AM +1000, Chris Angelico wrote:
>
> > Nice analogy. It doesn't hold up.
> >
> > Consider this function:
> >
> > def f(stuff, max=>len(stuff)):
> > stuff.append(1)
> > print(max)
> >
> > f([1,2,3])
> >
> > How would you use lazy evaluation to *guarantee* the behaviour here?
>
> By "the behaviour" I presume you want `max` evaluated before the body of
> the function is entered, rather than at its point of use.
>
> Same way your implementation does: ensure that the interpreter
> fully evaluates `max` before entering the body of the function.

YES! Which means that guess what! It's NOT the same as having a
default which is a deferred evaluation object! It would be *buggy
behaviour* if you set the default to be a deferred evaluation object,
and the interpreter evaluated it on entering the body.

> > The only way I can imagine doing it is basically the same as I'm
> > doing: that late-bound argument defaults *have special syntax and
> > meaning to the compiler*. If they were implemented with some sort of
> > lazy evaluation object, they would need (a) access to the execution
> > context, so you can't just use a function;
>
> Obviously you can't just compile the default expression as a function
> *and do nothing else* and have late bound defaults magically appear from
> nowhere.
>
> Comprehensions are implemented as functions. Inside comprehensions, the
> walrus operator binds to the caller's scope, not the comprehension scope.
>
> >>> def frob(items):
> ... thunk = ((w:=len(items)) for x in (None,))
> ... next(thunk)
> ... return ('w' in locals(), w)
> ...
> >>> frob([1, 2, 3, 4, 5])
> (True, 5)
>
> That seems to be exactly the behaviour needed for lazy evaluation
> thunks, except of course we don't need all the other goodies that
> generators provide (e.g. send and throw methods).

GO AND IMPLEMENT IT. I'm done arguing this. Write the code. You'll
find it's a LOT more problematic than you claim.

> One obvious difference is that currently if we moved that comprehension
> into the function signature, it would use the `items` from the
> surrounding scope (because of early binding). It has to be set up in
> such a way that items comes from the correct scope too.
>
> If we were willing to give up fast locals, I think that the normal LEGB
> lookup will do the trick. That works for locals inside classes, so I
> expect it should work here too.

I wouldn't want to give up fast locals.

> > (b) guaranteed evaluation on function entry,
>
> If that's the behaviour that people prefer, sure. Functions would need
> to know which parameters were:
>
> 1. defined with a lazy default;
> 2. and not passed an argument by the caller (i.e. actually using
>the default)
>
> and for that subset of parameters, evaluate them, before entering the
> body of the function. That's kinda what you already do, isn't it?

But what if you wanted the default to actually be a deferred
evaluation object? BAM, buggy behaviour, according to your spec.

> > (c) the ability to put it in the function header.
>
> Well sure. But if we have syntax for a lazily evaluated expression it
> would be an expression, right? So we can put it anywhere an expression
> can go. Like parameter defaults in a function header.

Yes. See? You could do it as a completely separate proposal, like
we've been saying.

> The point is, Rob thought (and possibly still does, for all I know) that
> lazy evaluation is completely orthogonal to late-bound defaults. The PEP
> makes that claim too, even though it is not correct. With a couple of
> tweaks that we have to do anyway, and perhaps a change of syntax (and
> maybe not even that!) we can get late-bound defaults *almost* for free
> if we had lazy evaluation.

GO AND WRITE THE CODE.

> That suggests that the amount of work to get *both* is not that much
> more than the work needed to get just one. Why have a car that only
> drives to the mall on Thursdays when you can get a car that can drive
> anywhere, anytime, and use it to drive to the mall on Thursday as well?

GO AND WRITE THE CODE.

> > Please stop arguing this point. It is a false analogy and until you
> > can demonstrate *with code* that there is value in doing it, it is a
> > massive red herring.
>
> You can make further debate moot at any point by asking Python-Dev for a
> sponsor for your PEP as it stands right now. If you think your PEP is
> as strong as it can possibly be, you should do that.
>
> (You probably want to fix the broken ReST first.)
>
> Chris, you have been involved in the PEP process for long enough, as
> both a participant of discussions and writer of PEPs, that you know damn
> well that there is no requirement that all PEPs must have a working
> implementation before being accepted, let alone being *considered* by
> the community.
>
> Yes, we're all very impressed that you are a competent C programmer who
> can 

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 13:07, Steven D'Aprano  wrote:
>
> On Tue, Jun 21, 2022 at 03:15:32AM +0100, Rob Cliffe wrote:
>
> > Why do people keep obscuring the discussion of a PEP which addresses
> > Problem A by throwing in discussion of the (unrelated) Problem B?
> > (Chris, and I, have stated, ad nauseam, that these *are* unrelated
> > problems.
>
> Chris says:
>
> "Even if Python does later on grow a generalized lazy evaluation
> feature, it will only change the *implementation* of late-bound
> argument defaults, not their specification."
>
> So you are mistaken that they are unrelated.
>

*facepalm*

I'm not offering you a way to put C code in your Python function defaults.

However, there is a large amount of C code in the implementation of
them, at least in my reference implementation.

So I guess the features of late-bound defaults and C code in function
defaults aren't unrelated either, and I should stop working on this
and start working on that.

Seriously? Are you unable to distinguish implementation from
specification? What are you even doing on this mailing list?

ChrisA


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread David Mertz, Ph.D.
It seems like this is all an occasion to use itertools.tee() ... but with
an awareness that implicit caching uses memory.

On Mon, Jun 20, 2022, 11:36 PM Steven D'Aprano  wrote:

> On Sun, Jun 19, 2022 at 01:34:35AM +0100, Rob Cliffe via Python-ideas
> wrote:
>
> > To me, the natural implementation of slicing on a non-reusable iterator
> > (such as a generator) would be that you are not allowed to go backwards
> > or even stand still:
> > mygen[42]
> > mygen[42]
> > ValueError: Element 42 of iterator has already been used
>
> How does a generic iterator, including generators, know whether or not
> item 42 has already been seen?
>
> islice for generators is really just a thin wrapper around an iterator
> that operates something vaguely like this:
>
> for i in range(start):
> next(iterator)  # throw the result away
> for i in range(start, end):
> yield next(iterator)
>
> It doesn't need to keep track of the last index seen, it just blindly
> advances through the iterator, with some short-cuts for the sake of
> efficiency.
>
>
> --
> Steve


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 01:34:35AM +0100, Rob Cliffe via Python-ideas wrote:

> To me, the natural implementation of slicing on a non-reusable iterator 
> (such as a generator) would be that you are not allowed to go backwards 
> or even stand still:
>     mygen[42]
>     mygen[42]
> ValueError: Element 42 of iterator has already been used

How does a generic iterator, including generators, know whether or not 
item 42 has already been seen?

islice for generators is really just a thin wrapper around an iterator 
that operates something vaguely like this:

for i in range(start):
next(iterator)  # throw the result away
for i in range(start, end):
yield next(iterator)

It doesn't need to keep track of the last index seen, it just blindly 
advances through the iterator, with some short-cuts for the sake of 
efficiency.
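The sketch above can be made runnable. This is a minimal stand-in, not the real `itertools.islice` (which also supports `step`, handles exhaustion gracefully, and is implemented in C); like the sketch, it keeps no record of which indices have been consumed, it just blindly advances:

```python
from itertools import islice

# Minimal islice-like wrapper: discard items up to `start`, yield up to `end`.
def islice_like(iterator, start, end):
    for _ in range(start):
        next(iterator)            # throw the result away
    for _ in range(start, end):
        yield next(iterator)

mygen = (n * n for n in range(100))
print(list(islice_like(mygen, 3, 6)))  # [9, 16, 25]
# The generator has simply advanced; a second slice continues from item 6:
print(list(islice(mygen, 0, 2)))       # [36, 49]
```

Note that nothing here could detect "element 42 has already been used": the wrapper has no index state at all, which is Steven's point.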


-- 
Steve


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Tue, Jun 21, 2022 at 12:13:08AM +1000, Chris Angelico wrote:

> Nice analogy. It doesn't hold up.
> 
> Consider this function:
> 
> def f(stuff, max=>len(stuff)):
> stuff.append(1)
> print(max)
> 
> f([1,2,3])
> 
> How would you use lazy evaluation to *guarantee* the behaviour here?

By "the behaviour" I presume you want `max` evaluated before the body of 
the function is entered, rather than at its point of use.

Same way your implementation does: ensure that the interpreter 
fully evaluates `max` before entering the body of the function.


> The only way I can imagine doing it is basically the same as I'm
> doing: that late-bound argument defaults *have special syntax and
> meaning to the compiler*. If they were implemented with some sort of
> lazy evaluation object, they would need (a) access to the execution
> context, so you can't just use a function; 

Obviously you can't just compile the default expression as a function 
*and do nothing else* and have late bound defaults magically appear from 
nowhere.

Comprehensions are implemented as functions. Inside comprehensions, the 
walrus operator binds to the caller's scope, not the comprehension scope.

>>> def frob(items):
... thunk = ((w:=len(items)) for x in (None,))
... next(thunk)
... return ('w' in locals(), w)
... 
>>> frob([1, 2, 3, 4, 5])
(True, 5)

That seems to be exactly the behaviour needed for lazy evaluation 
thunks, except of course we don't need all the other goodies that 
generators provide (e.g. send and throw methods).

One obvious difference is that currently if we moved that comprehension 
into the function signature, it would use the `items` from the 
surrounding scope (because of early binding). It has to be set up in 
such a way that items comes from the correct scope too.

If we were willing to give up fast locals, I think that the normal LEGB 
lookup will do the trick. That works for locals inside classes, so I 
expect it should work here too.


> (b) guaranteed evaluation on function entry,

If that's the behaviour that people prefer, sure. Functions would need 
to know which parameters were:

1. defined with a lazy default;
2. and not passed an argument by the caller (i.e. actually using 
   the default)

and for that subset of parameters, evaluate them, before entering the 
body of the function. That's kinda what you already do, isn't it?

One interesting feature here is that you don't have to compile the 
default expressions into the body of the function. You can stick them in 
the code object, as distinct, introspectable thunks with a useful repr. 
Potentially, the only extra code that needs go inside the function body 
is a single byte-code to instantiate the late-bound defaults.

Even that might not need to go in the function body, it could be part of 
the CALL_FUNCTION and CALL_FUNCTION_KW op codes (or whatever we use).


> (c) the ability to put it in the function header.

Well sure. But if we have syntax for a lazily evaluated expression it 
would be an expression, right? So we can put it anywhere an expression 
can go. Like parameter defaults in a function header.

The point is, Rob thought (and possibly still does, for all I know) that 
lazy evaluation is completely orthogonal to late-bound defaults. The PEP 
makes that claim too, even though it is not correct. With a couple of 
tweaks that we have to do anyway, and perhaps a change of syntax (and 
maybe not even that!) we can get late-bound defaults *almost* for free 
if we had lazy evaluation.

That suggests that the amount of work to get *both* is not that much 
more than the work needed to get just one. Why have a car that only 
drives to the mall on Thursdays when you can get a car that can drive 
anywhere, anytime, and use it to drive to the mall on Thursday as well?

> Please stop arguing this point. It is a false analogy and until you
> can demonstrate *with code* that there is value in doing it, it is a
> massive red herring.

You can make further debate moot at any point by asking Python-Dev for a 
sponsor for your PEP as it stands right now. If you think your PEP is 
as strong as it can possibly be, you should do that.

(You probably want to fix the broken ReST first.)

Chris, you have been involved in the PEP process for long enough, as 
both a participant of discussions and writer of PEPs, that you know damn 
well that there is no requirement that all PEPs must have a working 
implementation before being accepted, let alone being *considered* by 
the community.

Yes, we're all very impressed that you are a competent C programmer who 
can write an initial implementation of your preferred design. But your 
repeated gate-keeping efforts to shut down debate by wrongly insisting 
that only a working implementation may be discussed is completely out of 
line, and I think you know it.

Being a C programmer with a working knowledge of the CPython internals 
is not, and never has been, a prerequisite for raising

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Tue, Jun 21, 2022 at 03:15:32AM +0100, Rob Cliffe wrote:

> Why do people keep obscuring the discussion of a PEP which addresses 
> Problem A by throwing in discussion of the (unrelated) Problem B?
> (Chris, and I, have stated, ad nauseam, that these *are* unrelated 
> problems.

Chris says:

"Even if Python does later on grow a generalized lazy evaluation
feature, it will only change the *implementation* of late-bound
argument defaults, not their specification."

So you are mistaken that they are unrelated.

Chris could end this debate (and start a whole new one!) by going to the 
Python-Dev mailing list and asking for a sponsor, and if he gets one, 
for the Steering Council to make a ruling on the PEP. He doesn't *need* 
consensus on Python-Ideas. (Truth is, we should not expect 100% 
agreement on any new feature.)

But any arguments, questions and criticisms here which aren't resolved 
will just have to be re-hashed when the core devs and the Steering 
Council read the PEP. They can't be swept under the carpet.


-- 
Steve


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Rob Cliffe via Python-ideas

On 18/06/2022 15:51, Stephen J. Turnbull wrote:

  > This raises another choice: should lazy defaults be evaluated before
  > entering the body of the function, or at the point where the parameter
  > is used? Which would be more useful?



Both are potentially useful.
Yes, both *ARE* potentially useful. *ABSOLUTELY*.  I don't think anyone 
would deny that.  Certainly not I.
Let's call "I want late-bound defaults and Python doesn't have them" 
Problem A.
Let's call "I want my default value not to be evaluated until needed" 
Problem B.
(Of course, you may not consider either to be a problem, but let's 
assume that you think that at least one of them is.)

Chris is offering a PEP and an implementation which addresses Problem A.
If Problem B is a problem for you, you can (currently) use a sentinel 
value, and explicitly evaluate the default when you want it.  Not so 
terrible.  In fact arguably, more often than not, better, because it's 
explicit.  And more flexible (you can evaluate a different expression, 
or in a different scope, in different places in the function body).  
Chris/PEP 671 is not attempting to provide a better way of doing that.  
He is not offering Deferred Evaluation Objects (DEOs).  Maybe in 5 years 
or so someone will  offer an implementation of that, and everyone can be 
happy. 😁
Meanwhile, *Chris is offering a solution of Problem A*.  He is *NOT 
addressing Problem B* - someone else is welcome to try that.
Let's be honest: PEP 671 does not allow you to do anything you can't 
already do.  What it adds is some more convenience, some more concision, 
and arguably (and I *would* argue, in appropriate cases) some more 
readability.
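For reference, the sentinel workaround mentioned above is the standard idiom today; the names here are illustrative only:

```python
# A module-private sentinel marks "no argument supplied"; the real
# default is computed inside the body, at call time.
_MISSING = object()

def insert_sorted(item, target=_MISSING):
    if target is _MISSING:
        target = []          # fresh list per call, unlike `target=[]`
    target.append(item)
    target.sort()
    return target

print(insert_sorted(3))          # [3] -- a new list each call
print(insert_sorted(3))          # [3] again, not [3, 3]
print(insert_sorted(2, [1, 3]))  # [1, 2, 3]
```

PEP 671's `target=>[]` would fold the first two lines of the body into the signature; the sentinel version stays explicit and, as Rob notes, more flexible.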


Why do people keep obscuring the discussion of a PEP which addresses 
Problem A by throwing in discussion of the (unrelated) Problem B?
(Chris, and I, have stated, ad nauseam, that these *are* unrelated 
problems.  If you don't agree, I can only ask you to consider the 
implementations necessary to solve each.  If that doesn't change your 
mind, I have to throw my hands in the air and say "We'll have to agree 
to differ".)


*Surely solving one "problem" is better than dithering about which 
"problem" to solve.*


I've been accused of trying to censor this thread, but really - I'm just 
frustrated when people are invited to comment on PEP 671, and they don't 
comment on PEP 671, but on something else.


BTW Thank you Stephen Turnbull, for your measured comments to this thread.
Best wishes
Rob Cliffe


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 11:07, Rob Cliffe via Python-ideas
 wrote:
>
>
>
> On 20/06/2022 17:39, Jeremiah Paige wrote:
>
> On Sat, Jun 18, 2022 at 5:42 PM Rob Cliffe via Python-ideas 
>  wrote:
>>
>> To me, the natural implementation of slicing on a non-reusable iterator
>> (such as a generator) would be that you are not allowed to go backwards
>> or even stand still:
>>  mygen[42]
>>  mygen[42]
>> ValueError: Element 42 of iterator has already been used
>
>
> I agree that indexing an iterator such that it could only go forward feels 
> like a reasonable and useful feature in python, but I disagree about the 
> ValueError. To me the above produces two values: the 43rd and 85th elements 
> produced by mygen. Anything else is a bizarre error waiting to arise at 
> obscure times. What if this iterator is passed to another function? Used in a 
> loop? Now this information about what index has been used has to be carried 
> around and checked on every access.
>
> Oh, OK, I have no problem with that (except shouldn't it be the 43rd and 86th 
> elements?).  I guess which interpretation is more useful depends on the use 
> case.

I think this confusion is exactly why arbitrary iterators shouldn't be
indexable like this. Slicing them is a maybe, but even there, it's
hard to explain that mygen[3:] is a destructive operation on mygen
(rather than, as it is with sequences, a copy). It wouldn't be hard to
create a "lazy caching sequence-like view" to an iterable, which would
never reset its base index, but within the iterator itself, it's
inevitably going to cause a lot of confusion.
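A "lazy caching sequence-like view" of the kind Chris describes could be sketched like this (a minimal illustration, not any proposed API; negative indices and slices are deliberately unsupported):

```python
# Indexing never consumes the view destructively: items pulled from the
# underlying iterator are cached, so the base index never resets.
class CachedView:
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = []

    def __getitem__(self, index):
        # Pull from the iterator only as far as needed, caching as we go.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                raise IndexError(index) from None
        return self._cache[index]

view = CachedView(n * n for n in range(10))
print(view[4])  # 16
print(view[2])  # 4 -- going "backwards" is fine; the item was cached
```

The trade-off is the one David Mertz raises about itertools.tee(): the caching is implicit, and the memory cost grows with the highest index touched.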

ChrisA


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Rob Cliffe via Python-ideas



On 20/06/2022 17:39, Jeremiah Paige wrote:
On Sat, Jun 18, 2022 at 5:42 PM Rob Cliffe via Python-ideas 
 wrote:


To me, the natural implementation of slicing on a non-reusable
iterator
(such as a generator) would be that you are not allowed to go
backwards
or even stand still:
 mygen[42]
 mygen[42]
ValueError: Element 42 of iterator has already been used


I agree that indexing an iterator such that it could only go forward 
feels like a reasonable and useful feature in python, but I disagree 
about the ValueError. To me the above produces two values: the 43rd 
and 85th elements produced by mygen. Anything else is a bizarre error 
waiting to arise at obscure times. What if this iterator is passed to 
another function? Used in a loop? Now this information about what 
index has been used has to be carried around and checked on every access.
Oh, OK, I have no problem with that (except shouldn't it be the 43rd and 
86th elements?).  I guess which interpretation is more useful depends on 
the use case.

Best wishes
Rob Cliffe


Regards,
Jeremiah


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 10:13, Rob Cliffe via Python-ideas
 wrote:
>
> On 19/06/2022 04:42, David Mertz, Ph.D. wrote:
>
> On Sat, Jun 18, 2022, 9:21 PM Rob Cliffe
>>
>> Sorry again, but IMO discussing any model except one where late-bound 
>> defaults are evaluated at function call time is just adding FUD.
>
>
> It's definitely rude to repeatedly state that anyone whose opinion is 
> different from yours is "adding FUD" and doesn't belong in the thread.
>
> I was not talking about people whose opinion was different from mine.  I was 
> talking about people who obscured the discussion of a proposal by talking 
> about a different proposal.  And that, IMO, would be rude if it were done 
> deliberately, though I accept that it wasn't.
>
>
> The topic of "late binding in function signatures"  simply isn't *orthogonal* 
> to "late binding in the general sense." Yes, they are distinct, but very 
> closely adjacent.
>
> We disagree about that.  Please consider the *IMPLEMENTATIONS* of each.  I 
> respectfully suggest that you may conclude that they are not so close after 
> all.
>
> PS  In my support may I quote from a post from Chris:
>
> [Steven D'Aprano] Chris may choose to reject this generalised lazy evaluation 
> idea, but if
> so it needs to go into a Rejected Ideas section. Or he may decide that
> actually having a generalised lazy evaluation idea is *brilliant* and
> much nicer than making defaults a special case.
>
> [Chris] It's an almost completely orthogonal proposal. I used to have a
> reference to it in the PEP but removed it because it was unhelpful.
>

Since it appears to matter to people, I've readded a mention of it.
It's just freshly pushed so you might not see it instantly, but within
a few minutes (or browse the source code on GitHub), you should see
deferred evaluation mentioned in PEP 671.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OVN6E37EB73TZIW7EPKJWFDAGR47CQO2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Rob Cliffe via Python-ideas

On 19/06/2022 04:42, David Mertz, Ph.D. wrote:

On Sat, Jun 18, 2022, 9:21 PM Rob Cliffe

Sorry again, but IMO discussing any model except one where
late-bound defaults are evaluated at function call time is just
adding FUD.


It's definitely rude to repeatedly state that anyone whose opinion is 
different from yours is "adding FUD" and doesn't belong in the thread.
I was not talking about people whose opinion was different from mine.  I 
was talking about people who obscured the discussion of a proposal by 
talking about a different proposal.  And that, IMO, would be rude if it 
were done deliberately, though I accept that it wasn't.



The topic of "late binding in function signatures"  simply isn't 
*orthogonal* to "late binding in the general sense." Yes, they are 
distinct, but very closely adjacent.
We disagree about that.  Please consider the *IMPLEMENTATIONS* of 
each.  I respectfully suggest that you may conclude that they are not so 
close after all.


PS  In my support may I quote from a post from Chris:

[Steven D'Aprano] Chris may choose to reject this generalised lazy 
evaluation idea, but if

so it needs to go into a Rejected Ideas section. Or he may decide that
actually having a generalised lazy evaluation idea is *brilliant* and
much nicer than making defaults a special case.

[Chris] It's an almost completely orthogonal proposal. I used to have a
reference to it in the PEP but removed it because it was unhelpful.


Rob Cliffe
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LZVHCT42ISSJSOHTC6KAENAZJD33C4ZT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Rob Cliffe via Python-ideas

On 19/06/2022 04:42, David Mertz, Ph.D. wrote:

On Sat, Jun 18, 2022, 9:21 PM Rob Cliffe

Sorry again, but IMO discussing any model except one where
late-bound defaults are evaluated at function call time is just
adding FUD.


It's definitely rude to repeatedly state that anyone whose opinion is 
different from yours is "adding FUD" and doesn't belong in the thread.
I was not talking about people whose opinion was different from mine.  I 
was talking about people who obscured the discussion of a proposal by 
talking about a different proposal.  And that, IMO, would be rude if it 
were done deliberately, though I accept that it wasn't.


Stephen, and Steven, and Paul, and I all perfectly well understand 
what "evaluated at function call time" means.
I should jolly well hope so too.  I certainly did not intend to suggest 
that any of you or anyone else do not understand it.  And I can't see 
anything in any of my posts that suggests that I did intend that.  Do 
you think that I did?  If so, why?  (Please quote where appropriate.)  
If I did somehow suggest that, I sincerely apologise.


It's a way to spell `if arg is sentinel: arg = ...` using slightly 
fewer characters, and moving an expression from the body to the signature.
Yes, if you want to simplify a bit, basically it is.  But it avoids the 
trap of the sentinel value being a possible parameter value. And it 
would answer a number of Stack Overflow posts on the lines of "Why 
doesn't this work [as I expected]?"  I don't think that anyone, 
including Chris, would say that it allows you to do something that you 
can't do already (though I might be wrong, but I believe Python is 
already Turing-complete 😁).  The virtue of the PEP is that it adds some 
convenience and some clarity and some concision. (Concision *is* a 
virtue, ceteris paribus - which often they are not.)
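For comparison, the sentinel idiom the PEP aims to shorten can be sketched like this (the function and sentinel names are invented for illustration; the `=>` spelling is only PEP 671's proposal, not valid in any released Python):

```python
_MISSING = object()  # private sentinel: unlike None, no ordinary value collides with it

def tail(lst, n=_MISSING):
    """Return the last n items of lst; by default, all of them."""
    if n is _MISSING:      # the late-bound default, emulated in the body
        n = len(lst)
    return lst[-n:] if n else []

# PEP 671 would move that logic into the signature (proposed syntax only):
#     def tail(lst, n=>len(lst)): ...
result_default = tail([1, 2, 3])     # n is computed from lst at call time
result_explicit = tail([1, 2, 3], 2)
```

Note that a caller who gets hold of `_MISSING` can still pass it explicitly, which is the sentinel-value trap mentioned above.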


I'm still -1 because I don't think the purpose alone is close to worth 
the cost of new syntax... And especially not using sigils that are 
confusing to read in code.
You complain about sigils.  Do you accept my point that more **words** 
(and words that can, perhaps a trifle unkindly, be classed as 
boilerplate rather than genuine content) can also make stuff harder to read?


The topic of "late binding in function signatures"  simply isn't 
*orthogonal* to "late binding in the general sense." Yes, they are 
distinct, but very closely adjacent.
We disagree about that.  Please consider the *IMPLEMENTATIONS* of 
each.  I respectfully suggest that you may conclude that they are not so 
close after all.

Best wishes
Rob Cliffe
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G734UNUDJLGFV362LDE4WIUI7AL4DWEA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Mike Miller



On 2022-06-20 03:34, Paul Moore wrote:

For the record, I think the islice solution is sufficient for this
case. But I have needed this sort of thing occasionally, and islice



The post above sums it up for me.  We have next() for one to a few, islice for 
several to zillions, and a for-enumerate-break also for several to zillions. 
Cases handled, with or without import.
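A quick sketch of those three idioms side by side, assuming a plain range() as the iterator:

```python
from itertools import islice

it = iter(range(100))

# next() for one to a few items
a = next(it)                  # 0
b = next(it)                  # 1

# islice for a contiguous run of several
chunk = list(islice(it, 5))   # [2, 3, 4, 5, 6]

# for-enumerate-break for "process at most N" -- note the break still
# consumes the item it breaks on (10 here)
squares = []
for i, x in enumerate(it):
    if i == 3:
        break
    squares.append(x * x)     # squares of 7, 8, 9
```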


The parameter to next() sounds like a reasonable thing to add, however; it 
doesn't seem like it would hurt anything but the use of islice.


If any syntax is chosen, I hope it won't include "*", as that definitely says 
"unpack" to me; that's what I say when reading it (without a space afterward).


-Mike
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7JFQJHCUCSOA7UCJJJLDP6YHXVQMKCIS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Paul Moore
On Mon, 20 Jun 2022 at 18:42, Steve Jorgensen  wrote:
>
> Steven D'Aprano wrote:
> > Okay, I'm convinced.
> > If we need this feature (and I'm not convinced about that part), then it
> > makes sense to keep the star and write it as `spam, eggs, *... = items`.
>
> I thought about that, but to me, there are several reasons to not do that and 
> to have the ellipsis mean multiple rather than prepending * for that:
> 1. In common usage outside of programming, the ellipsis means a continuation 
> and not just a single additional thing.
> 2. Having `*...` mean any number of things implies that `...` means a single 
> thing, and I don't think there is a reason to match 1 thing but not assign it 
> to a variable. It is also already fine to repeat `_` in the left side 
> expression.
> 3. I am guessing (though I could be wrong) that support for `*...` would be a 
> bigger change and more complicated in the Python source code.

Also, while I can't speak for others, I found that when writing
examples for posts here, the "*" in "*..." has too strong of a
connection with "consume", and I *still* naturally read *... as
"consume the rest" (even though it's not currently valid syntax, and
the rules for what it *does* mean would be clear and unambiguous, etc
etc). So for me at least, any syntax that uses a * would be too easy
to misread.
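The "consume" reading of the star is easy to demonstrate, along with the islice spelling that takes exactly two items and leaves the iterator alone:

```python
from itertools import islice

# Today, a star on the left-hand side always means "consume the rest":
items = iter(range(10))
a, b, *rest = items           # exhausts the iterator into rest

# The proposal wants the opposite -- take two, leave the iterator
# untouched -- which currently needs islice:
items2 = iter(range(10))
c, d = islice(items2, 2)      # consumes exactly two items
leftover = next(items2)       # 2: the remainder is still available
```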

Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EMZZUSIJYNGV4KM75MF66UPHF3K6DOH7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Steve Jorgensen
Steve Jorgensen wrote:
> Jonathan Slenders wrote:
> > Hi everyone,
> > Today was the 3rd time I came across a situation where it was needed to
> > retrieve all the positions of the line endings (or beginnings) in a very
> > long python string as efficiently as possible. First time, it was needed in
> > prompt_toolkit, where I spent a crazy amount of time looking for the most
> > performant solution. Second time was in a commercial project where
> > performance was very critical too. Third time is for the Rich/Textual
> > project from Will McGugan. (See:
> > https://twitter.com/willmcgugan/status/1537782771137011715 )
> > The problem is that the `str` type doesn't expose any API to efficiently
> > find all \n positions. Every Python implementation is either calling
> > `.index()` in a loop and collecting the results or running a regex over the
> > string and collecting all positions.
> > For long strings, depending on the implementation, this results in a lot of
> > overhead due to either:
> > - calling Python functions (or any other Python instruction) for every \n
> > character in the input. The amount of executed Python instructions is O(n)
> > here.
> > - Copying string data into new strings.
> > 
> > The fastest solution I've been using for some time, does this (simplified):
> > `accumulate(chain([0], map(len, text.splitlines(True))))`. The performance
> > is great here, because the amount of Python instructions is O(1).
> > Everything is chained in C-code thanks to itertools. Because of that, it
> > can outperform the regex solution with a factor of ~2.5. (Regex isn't slow,
> > but iterating over the results is.)
> > The bad things about this solution are, however:
> > - Very cumbersome syntax.
> > - We call `splitlines()` which internally allocates a huge amount of
> > strings, only to use their lengths. That is still much more overhead than a
> > simple for-loop in C would be.
> > Performance matters here, because for these kind of problems, the list of
> > integers that gets produced is typically used as an index to quickly find
> > character offsets in the original string, depending on which line is
> > displayed/processed. The bisect library helps too to quickly convert any
> > index position of that string into a line number. The point is, that for
> > big inputs, the amount of Python instructions executed is not O(n), but
> > O(1). Of course, some of the C code remains O(n).
> > So, my ask here.
> > Would it make sense to add a `line_offsets()` method to `str`?
> > Or even `character_offsets(character)` if we want to do that for any
> > character?
> > Or `indexes(...)/indices(...)` if we would allow substrings of arbitrary
> > lengths?
> > Thanks,
> > Jonathan
> > I presume there is some reason that `re.findall` did not work or was not 
> > optimal?

I just saw your reply elsewhere in the conversation that says

> That requires a more complex regex pattern. I was actually using:
> re.compile(r"\n|\r(?!\n)")
> And then the regex becomes significantly slower than the splitlines() 
> solution, which is still much slower than it has to be.
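The two approaches quoted above can be compared directly. A small sketch using a string with mixed universal line endings (the splitlines recipe and the regex are both taken from the thread):

```python
import re
from itertools import accumulate, chain

text = "one\ntwo\r\nthree\rfour\n"

# splitlines-based recipe: O(1) Python-level instructions, but it
# allocates every substring just to take its length
offsets_split = list(accumulate(chain([0], map(len, text.splitlines(True)))))

# regex from the thread: \n, or \r not followed by \n (universal newlines)
newline = re.compile(r"\n|\r(?!\n)")
offsets_re = [0] + [m.end() for m in newline.finditer(text)]
# both yield the start offset of every line (plus the final end-of-text)
```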
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JGY2YNOCKZ2KS7BMQMNCEY3YHIRJC3UL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Steve Jorgensen
Jonathan Slenders wrote:
> Hi everyone,
> Today was the 3rd time I came across a situation where it was needed to
> retrieve all the positions of the line endings (or beginnings) in a very
> long python string as efficiently as possible. First time, it was needed in
> prompt_toolkit, where I spent a crazy amount of time looking for the most
> performant solution. Second time was in a commercial project where
> performance was very critical too. Third time is for the Rich/Textual
> project from Will McGugan. (See:
> https://twitter.com/willmcgugan/status/1537782771137011715 )
> The problem is that the `str` type doesn't expose any API to efficiently
> find all \n positions. Every Python implementation is either calling
> `.index()` in a loop and collecting the results or running a regex over the
> string and collecting all positions.
> For long strings, depending on the implementation, this results in a lot of
> overhead due to either:
> - calling Python functions (or any other Python instruction) for every \n
> character in the input. The amount of executed Python instructions is O(n)
> here.
> - Copying string data into new strings.
> The fastest solution I've been using for some time, does this (simplified):
> `accumulate(chain([0], map(len, text.splitlines(True))))`. The performance
> is great here, because the amount of Python instructions is O(1).
> Everything is chained in C-code thanks to itertools. Because of that, it
> can outperform the regex solution with a factor of ~2.5. (Regex isn't slow,
> but iterating over the results is.)
> The bad things about this solution are, however:
> - Very cumbersome syntax.
> - We call `splitlines()` which internally allocates a huge amount of
> strings, only to use their lengths. That is still much more overhead than a
> simple for-loop in C would be.
> Performance matters here, because for these kind of problems, the list of
> integers that gets produced is typically used as an index to quickly find
> character offsets in the original string, depending on which line is
> displayed/processed. The bisect library helps too to quickly convert any
> index position of that string into a line number. The point is, that for
> big inputs, the amount of Python instructions executed is not O(n), but
> O(1). Of course, some of the C code remains O(n).
> So, my ask here.
> Would it make sense to add a `line_offsets()` method to `str`?
> Or even `character_offsets(character)` if we want to do that for any
> character?
> Or `indexes(...)/indices(...)` if we would allow substrings of arbitrary
> lengths?
> Thanks,
> Jonathan

I presume there is some reason that `re.findall` did not work or was not 
optimal?
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PO3V3XXHZL7CF4YCD635AF57OYG2RORC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Steve Jorgensen
Steven D'Aprano wrote:
> Okay, I'm convinced.
> If we need this feature (and I'm not convinced about that part), then it 
> makes sense to keep the star and write it as `spam, eggs, *... = items`.

I thought about that, but to me, there are several reasons to not do that and 
to have the ellipsis mean multiple rather than prepending * for that:
1. In common usage outside of programming, the ellipsis means a continuation 
and not just a single additional thing.
2. Having `*...` mean any number of things implies that `...` means a single 
thing, and I don't think there is a reason to match 1 thing but not assign it 
to a variable. It is also already fine to repeat `_` in the left side 
expression.
3. I am guessing (though I could be wrong) that support for `*...` would be a 
bigger change and more complicated in the Python source code.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2YGDMCGY5NBMIO57F6M7K3HP6HRYKTWZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Chris Angelico
On Tue, 21 Jun 2022 at 01:44, MRAB  wrote:
>
> On 2022-06-20 15:05, Chris Angelico wrote:
> > On Mon, 20 Jun 2022 at 21:11, Jonathan Fine  wrote:
> >>
> >> Hi
> >>
> >> Some have liked adding a new syntax
> >> a, b, ... = iterable
> >> to mean consume two items from the iterable. However,
> >>a, b, Ellipsis = iterable
> >> has a different meaning (at least in Python 3.8). It redefines Ellipsis. 
> >> (As an explicit constant, '...' can be redefined.)
> >
> > To clarify: The syntactic token '...' will always refer to the special
> > object Ellipsis (at least back as far as Python 3.4 - can't remember
> > when it became available in all contexts), but the name Ellipsis can
> > be rebound. So even though, in many contexts, "x = Ellipsis" and "x =
> > ..." will have the same net effect, they are distinct (one is a name
> > lookup and the other is a constant), and they're definitely different
> > in assignment.
> >
> > (Though it wouldn't surprise me if a future Python release adds
> > Ellipsis to the set of non-assignable names, with None/True/False.)
> >
> >> The syntax
> >>   a, b, ... = iterable
> >> so to speak fills a gap in existing syntax, as the construct is at present 
> >> invalid. I actually like gaps in syntax, for the same reason that I like a 
> >> central reservation in a highway. The same goes for the hard shoulder / 
> >> breakdown lane.
> >>
> >> The title of this thread includes the phrase 'Stop Iterating' (capitals 
> >> added). This suggests the syntax
> >>   a, b, StopIterating = iterable
> >> where StopIterating is a new keyword that can be used only in this context.
> >>
> >> I'd like to know what others think about this suggestion.
> >>
> >
> > Hard no. That is currently-legal syntax, and it's also clunky. I'd
> > much rather the "assign to ..." notation than a weird new soft keyword
> > that people are going to think is a typo for StopIteration.
> >
> > It's worth noting that the proposed syntax has a slight distinction
> > from the normal asterisk notation, in that it makes perfect sense to
> > write this:
> >
> > a, *_, b = thing
> >
> > but does not make sense to write this:
> >
> > a, ..., b = thing
> >
> > as the "don't iterate over this thing" concept doesn't work here.
> > (Supporting this would require some way to reverse the iterator, and
> > that's not a language guarantee.)
> >
> It could be taken to mean "consume but discard", leaving 'a' bound to
> the first item and 'b' bound to the last item, but then:
>
> a, ... = thing
>
> would have to leave 'a' bound to the first item and the iterator exhausted.
>
> In fact, use of ... would always have to exhaust the iterator, which, I
> think, would not be very useful.
>
> Best not to go that way.

Yeah. "Consume but discard" is spelled *_, so we don't need this. The
whole point of this is to NOT consume it.
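The literal-versus-name distinction clarified in the quoted exchange can be checked in current Python (a quick sketch; rebinding Ellipsis is legal but ill-advised):

```python
# The token '...' always denotes the Ellipsis singleton, but the *name*
# Ellipsis is an ordinary binding (unlike None/True/False) -- which is why
# "a, b, Ellipsis = iterable" is valid today and simply rebinds the name:
x = ...
a, b, Ellipsis = [1, 2, 3]
rebound = Ellipsis            # now 3, not the singleton
still_singleton = (x is ...)  # the literal is unaffected by the rebinding
del Ellipsis                  # drop the shadowing global; the builtin is back
restored = (Ellipsis is ...)
```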

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NJFEUDVYWP5QIAB4WS7P6IWJDW7TEBM2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Jeremiah Paige
On Sat, Jun 18, 2022 at 5:42 PM Rob Cliffe via Python-ideas <
python-ideas@python.org> wrote:

> To me, the natural implementation of slicing on a non-reusable iterator
> (such as a generator) would be that you are not allowed to go backwards
> or even stand still:
>  mygen[42]
>  mygen[42]
> ValueError: Element 42 of iterator has already been used


I agree that indexing an iterator such that it could only go forward feels
like a reasonable and useful feature in python, but I disagree about the
ValueError. To me the above produces two values: the 43rd and 85th elements
produced by mygen. Anything else is a bizarre error waiting to arise at
obscure times. What if this iterator is passed to another function? Used in
a loop? Now this information about what index has been used has to be
carried around and checked on every access.

Regards,
Jeremiah
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/K2CBKLZT76AS6MFYBMU3YCP4BP26P2IP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Mathew Elman
Chris Angelico wrote:
> On Mon, 20 Jun 2022 at 21:11, Mathew Elman mathew.el...@ocado.com wrote:
> > This I like - it seems very intuitive, almost like an irreversible io 
> > stream.
> > I don't know if there would be cases where this would lead to unexpected 
> > bugs, but without looking into it it seems nice.
> > Question: What would be the natural behaviour for negative indices? Raising 
> > an error?
> > Please quote the person and text that you're responding to (and then
> add your response underneath). Otherwise we have to guess which
> (sub)proposal it is that you like.
> ChrisA

Oops, I thought I had, it was this:

> To me, the natural implementation of slicing on a non-reusable iterator (such 
> as a generator) would be that you are not allowed to go backwards or even 
> stand still:
> mygen[42]
> mygen[42]
> ValueError: Element 42 of iterator has already been used (Apologies if I 
> don't know the difference between an iterator and an iterable; y'all know 
> what I mean.)
> You still get a useful feature that you didn't have before. Expecting a 
> generator (or whatever) to cache some its values in case you wanted a slice 
> of them opens up a huge can of worms and is surely best forgotten.  (100Gb 
> generator anyone?)  Well, maybe caching ONE value (the last one accessed) is 
> reasonable, so you could stand still but not go backwards.  But it's still 
> adding overhead.
> Best wishes
> Rob Cliffe
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LL5FBAT53QJRKGLQEY3H3GD6JGXTB4QE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread MRAB

On 2022-06-20 16:12, Christopher Barker wrote:
> Hmm - I’m a bit confused about how you handle mixed / multiple line 
> endings. If you use splitlines(), then it will remove the line endings, 
> so if there are two-char line endings, then you’ll get off by one 
> errors, yes?
> 
> I would think you could look for “\n”, and get the correct answer ( with 
> extraneous “\r”s in the substrings…
> 
> -CHB

How about something like .split, but returning the spans instead of the 
strings?


On Mon, Jun 20, 2022 at 5:04 PM Christopher Barker wrote:


If you are working with bytes, then numpy could be perfect— not a
small dependency of course, but it should work, and work fast.

And a cython method would be quite easy to write, but of course
substantially harder to distribute :-(

-CHB

On Sun, Jun 19, 2022 at 5:30 PM Jonathan Slenders
 wrote:

Thanks all for all the responses! That's quite a bit to think about.

A couple of thoughts:

1. First, I do support a transition to UTF-8, so I understand we
don't want to add more methods that deal with character offsets.
(I'm familiar with how strings work in Rust.) However, does that
mean we won't be using/exposing any offset at all, or will it
become possible to slice using byte offsets?

2. The commercial application I mentioned where this is critical
is actually using bytes instead of str. Sorry for not mentioning
earlier. We were doing the following:
     list(accumulate(chain([0], map(len, text.splitlines(True)))))
where text is a bytes object. This is significantly faster than
a binary regex for finding all universal line endings. This
application is an asyncio web app that streams Cisco show-tech
files (often several gigabytes) from a file server over HTTP;
stores them chunk by chunk into a local cache file on disk; and
builds an index of byte offsets in the meantime by running the
above expression over every chunk. That way the client web app
can quickly load the lines from disk as the user scrolls through
the file. A very niche application indeed, so use of Cython
would be acceptable in this particular case. I published the
relevant snippet here to be studied:

https://gist.github.com/jonathanslenders/59ddf8fe2a0954c7f1865fba3b151868


It does handle an interesting edge case regarding UTF-16.

3. The code in prompt_toolkit can be found here:

https://github.com/prompt-toolkit/python-prompt-toolkit/blob/master/src/prompt_toolkit/document.py#L209


(It's not yet using 'accumulate' there, but for the rest it's
the same.) Also here, universal line endings support is
important, because the editing buffer can in theory contain a
mix of line endings. It has to be performant, because it
executes on every key stroke. In this case, a more complex data
structure could probably solve performance issues here, but it's
really not worth the complexity that it introduces in every text
manipulation (like every key binding). Also try using the "re"
library to search over a list of lines or anything that's not a
simple string.

4. I tested on 3.11.0b3. Using the splitlines() approach is
still 2.5 times faster than re. Imagine if splitlines() doesn't
have to do the work to actually create the substrings, but only
has to return the offsets, that should be even much faster and
not require so much memory. (I have an benchmark that does it
one chunk at a time, to prevent using too much memory:

https://gist.github.com/jonathanslenders/bfca8e4f318ca64e718b4085a737accf


)

So talking about bytes. Would it be acceptable to have a
`bytes.line_offsets()` method instead? Or
`bytes.splitlines(return_offsets=True)`? Because byte offsets
are okay, or not? `str.splitlines(return_offsets=True)` would be
very nice, but I understand the concerns.

It's somewhat frustrating here knowing that for `splitlines()`,
the information is there, already computed, just not immediately
accessible. (without having Python do lots of unnecessary work.)

Jonathan


On Sun, 19 Jun 2022 at 15:34, Jonathan Fine wrote:

Hi

This is a nice problem, well presented. Here's four comments
/ questions.

1. How does the introduction of faster CPython in Python
3.11 af

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread MRAB

On 2022-06-20 15:05, Chris Angelico wrote:

On Mon, 20 Jun 2022 at 21:11, Jonathan Fine  wrote:


Hi

Some have liked adding a new syntax
a, b, ... = iterable
to mean consume two items from the iterable. However,
   a, b, Ellipsis = iterable
has a different meaning (at least in Python 3.8). It redefines Ellipsis. (As an 
explicit constant, '...' can be redefined.)


To clarify: The syntactic token '...' will always refer to the special
object Ellipsis (at least back as far as Python 3.4 - can't remember
when it became available in all contexts), but the name Ellipsis can
be rebound. So even though, in many contexts, "x = Ellipsis" and "x =
..." will have the same net effect, they are distinct (one is a name
lookup and the other is a constant), and they're definitely different
in assignment.

(Though it wouldn't surprise me if a future Python release adds
Ellipsis to the set of non-assignable names, with None/True/False.)


The syntax
  a, b, ... = iterable
so to speak fills a gap in existing syntax, as the construct is at present 
invalid. I actually like gaps in syntax, for the same reason that I like a 
central reservation in a highway. The same goes for the hard shoulder / 
breakdown lane.

The title of this thread includes the phrase 'Stop Iterating' (capitals added). 
This suggests the syntax
  a, b, StopIterating = iterable
where StopIterating is a new keyword that can be used only in this context.

I'd like to know what others think about this suggestion.



Hard no. That is currently-legal syntax, and it's also clunky. I'd
much rather the "assign to ..." notation than a weird new soft keyword
that people are going to think is a typo for StopIteration.

It's worth noting that the proposed syntax has a slight distinction
from the normal asterisk notation, in that it makes perfect sense to
write this:

a, *_, b = thing

but does not make sense to write this:

a, ..., b = thing

as the "don't iterate over this thing" concept doesn't work here.
(Supporting this would require some way to reverse the iterator, and
that's not a language guarantee.)

It could be taken to mean "consume but discard", leaving 'a' bound to 
the first item and 'b' bound to the last item, but then:


a, ... = thing

would have to leave 'a' bound to the first item and the iterator exhausted.

In fact, use of ... would always have to exhaust the iterator, which, I 
think, would not be very useful.


Best not to go that way.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5NXRQSQVZWLIOU64E3CHH4H57VATAOLU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Christopher Barker
Hmm - I’m a bit confused about how you handle mixed / multiple line
endings. If you use splitlines(), then it will remove the line endings, so
if there are two-char line endings, then you’ll get off by one errors, yes?
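(A quick check of that concern: the `True` in the thread's splitlines recipe is the keepends flag, which retains each ending so the lengths add up exactly, whatever mix of endings the text uses:)

```python
text = "alpha\r\nbeta\ngamma\r"

# splitlines() drops the line endings, so summed lengths under-count
# two-character \r\n endings -- the off-by-one worry:
without_ends = sum(map(len, text.splitlines()))            # 14, but len(text) is 18

# keepends=True retains each ending, so lengths (and hence accumulated
# offsets) come out exact:
with_ends = sum(map(len, text.splitlines(keepends=True)))  # 18
```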

I would think you could look for “\n”, and get the correct answer ( with
extraneous “\r”s in the substrings…

-CHB

On Mon, Jun 20, 2022 at 5:04 PM Christopher Barker 
wrote:

> If you are working with bytes, then numpy could be perfect— not a small
> dependency of course, but it should work, and work fast.
>
> And a cython method would be quite easy to write, but of course
> substantially harder to distribute :-(
>
> -CHB
>
> On Sun, Jun 19, 2022 at 5:30 PM Jonathan Slenders 
> wrote:
>
>> Thanks all for all the responses! That's quite a bit to think about.
>>
>> A couple of thoughts:
>>
>> 1. First, I do support a transition to UTF-8, so I understand we don't
>> want to add more methods that deal with character offsets. (I'm familiar
>> with how strings work in Rust.) However, does that mean we won't be
>> using/exposing any offset at all, or will it become possible to slice using
>> byte offsets?
>>
>> 2. The commercial application I mentioned where this is critical is
>> actually using bytes instead of str. Sorry for not mentioning earlier. We
>> were doing the following:
>> list(accumulate(chain([0], map(len, text.splitlines(True)))))
>> where text is a bytes object. This is significantly faster than a binary
>> regex for finding all universal line endings. This application is an
>> asyncio web app that streams Cisco show-tech files (often several
>> gigabytes) from a file server over HTTP; stores them chunk by chunk into a
>> local cache file on disk; and builds an index of byte offsets in the
>> meantime by running the above expression over every chunk. That way the
>> client web app can quickly load the lines from disk as the user scrolls
>> through the file. A very niche application indeed, so use of Cython would
>> be acceptable in this particular case. I published the relevant snippet
>> here to be studied:
>> https://gist.github.com/jonathanslenders/59ddf8fe2a0954c7f1865fba3b151868
>> It does handle an interesting edge case regarding UTF-16.
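Written out in full on a small example, the quoted expression computes cumulative byte offsets of line starts:

```python
from itertools import accumulate, chain

text = b"foo\r\nbar\nbaz"
# splitlines(True) keeps the line endings, so each length is the exact
# span of that line; the running sum gives the offset of each line start,
# with the final entry equal to the total length.
offsets = list(accumulate(chain([0], map(len, text.splitlines(True)))))
print(offsets)  # [0, 5, 9, 12]
```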
>>
>> 3. The code in prompt_toolkit can be found here:
>> https://github.com/prompt-toolkit/python-prompt-toolkit/blob/master/src/prompt_toolkit/document.py#L209
>> (It's not yet using 'accumulate' there, but for the rest it's the same.)
>> Also here, universal line endings support is important, because the editing
>> buffer can in theory contain a mix of line endings. It has to be
>> performant, because it executes on every key stroke. In this case, a more
>> complex data structure could probably solve performance issues here, but
>> it's really not worth the complexity that it introduces in every text
>> manipulation (like every key binding). Also try using the "re" library to
>> search over a list of lines or anything that's not a simple string.
>>
>> 4. I tested on 3.11.0b3. Using the splitlines() approach is still 2.5
>> times faster than re. Imagine if splitlines() doesn't have to do the work
>> to actually create the substrings, but only has to return the offsets, that
>> should be even much faster and not require so much memory. (I have a
>> benchmark that does it one chunk at a time, to prevent using too much
>> memory:
>> https://gist.github.com/jonathanslenders/bfca8e4f318ca64e718b4085a737accf
>> )
>>
>> So talking about bytes. Would it be acceptable to have a
>> `bytes.line_offsets()` method instead? Or
>> `bytes.splitlines(return_offsets=True)`? Because byte offsets are okay, or
>> not? `str.splitlines(return_offsets=True)` would be very nice, but I
>> understand the concerns.
>>
>> It's somewhat frustrating here knowing that for `splitlines()`, the
>> information is there, already computed, just not immediately accessible.
>> (without having Python do lots of unnecessary work.)
>>
>> Jonathan
>>
>>
>> Le dim. 19 juin 2022 à 15:34, Jonathan Fine  a
>> écrit :
>>
>>> Hi
>>>
>>> This is a nice problem, well presented. Here's four comments / questions.
>>>
>>> 1. How does the introduction of faster CPython in Python 3.11 affect the
>>> benchmarks?
>>> 2. Is there an across-the-board change that would speedup this
>>> line-offsets task?
>>> 3. To limit splitlines memory use (at small performance cost), chunk the
>>> input string into say 4 kb blocks.
>>> 4. Perhaps anything done here for strings should also be done for bytes.
>>>
>>> --
>>> Jonathan

[Python-ideas] Re: Add a line_offsets() method to str

2022-06-20 Thread Christopher Barker
If you are working with bytes, then numpy could be perfect— not a small
dependency of course, but it should work, and work fast.

And a cython method would be quite easy to write, but of course
substantially harder to distribute :-(

-CHB

On Sun, Jun 19, 2022 at 5:30 PM Jonathan Slenders 
wrote:

> Thanks all for all the responses! That's quite a bit to think about.
>
> A couple of thoughts:
>
> 1. First, I do support a transition to UTF-8, so I understand we don't
> want to add more methods that deal with character offsets. (I'm familiar
> with how strings work in Rust.) However, does that mean we won't be
> using/exposing any offset at all, or will it become possible to slice using
> byte offsets?
>
> 2. The commercial application I mentioned where this is critical is
> actually using bytes instead of str. Sorry for not mentioning earlier. We
> were doing the following:
> list(accumulate(chain([0], map(len, text.splitlines(True)))))
> where text is a bytes object. This is significantly faster than a binary
> regex for finding all universal line endings. This application is an
> asyncio web app that streams Cisco show-tech files (often several
> gigabytes) from a file server over HTTP; stores them chunk by chunk into a
> local cache file on disk; and builds an index of byte offsets in the
> meantime by running the above expression over every chunk. That way the
> client web app can quickly load the lines from disk as the user scrolls
> through the file. A very niche application indeed, so use of Cython would
> be acceptable in this particular case. I published the relevant snippet
> here to be studied:
> https://gist.github.com/jonathanslenders/59ddf8fe2a0954c7f1865fba3b151868
> It does handle an interesting edge case regarding UTF-16.
>
> 3. The code in prompt_toolkit can be found here:
> https://github.com/prompt-toolkit/python-prompt-toolkit/blob/master/src/prompt_toolkit/document.py#L209
> (It's not yet using 'accumulate' there, but for the rest it's the same.)
> Also here, universal line endings support is important, because the editing
> buffer can in theory contain a mix of line endings. It has to be
> performant, because it executes on every key stroke. In this case, a more
> complex data structure could probably solve performance issues here, but
> it's really not worth the complexity that it introduces in every text
> manipulation (like every key binding). Also try using the "re" library to
> search over a list of lines or anything that's not a simple string.
>
> 4. I tested on 3.11.0b3. Using the splitlines() approach is still 2.5
> times faster than re. Imagine if splitlines() doesn't have to do the work
> to actually create the substrings, but only has to return the offsets, that
> should be even much faster and not require so much memory. (I have a
> benchmark that does it one chunk at a time, to prevent using too much
> memory:
> https://gist.github.com/jonathanslenders/bfca8e4f318ca64e718b4085a737accf
> )
>
> So talking about bytes. Would it be acceptable to have a
> `bytes.line_offsets()` method instead? Or
> `bytes.splitlines(return_offsets=True)`? Because byte offsets are okay, or
> not? `str.splitlines(return_offsets=True)` would be very nice, but I
> understand the concerns.
>
> It's somewhat frustrating here knowing that for `splitlines()`, the
> information is there, already computed, just not immediately accessible.
> (without having Python do lots of unnecessary work.)
>
> Jonathan
>
>
> Le dim. 19 juin 2022 à 15:34, Jonathan Fine  a
> écrit :
>
>> Hi
>>
>> This is a nice problem, well presented. Here's four comments / questions.
>>
>> 1. How does the introduction of faster CPython in Python 3.11 affect the
>> benchmarks?
>> 2. Is there an across-the-board change that would speedup this
>> line-offsets task?
>> 3. To limit splitlines memory use (at small performance cost), chunk the
>> input string into say 4 kb blocks.
>> 4. Perhaps anything done here for strings should also be done for bytes.
>>
>> --
>> Jonathan
>> ___
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/AETGT5HDF3QOFODOWKB4X45ZE4CZ7Y3M/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/FZ7V4FFKR45YLQDHTD2JZYEWZ5HEI3P2/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI 

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Jonathan Fine
Hi

Some of us might believe that a currently legal syntax should only
exceptionally be given a new meaning, even if there is no evidence
whatsoever that this legal syntax is actually in use. My own belief is more
pragmatic. If there's very strong evidence that the syntax is not in use,
I'm happy to consider changing the meaning.

I wrote:

> The title of this thread includes the phrase 'Stop Iterating' (capitals
> added). This suggests the syntax
>   a, b, StopIterating = iterable
> where StopIterating is a new keyword that can be used only in this context.
>

In response Chris wrote:

> Hard no. That is currently-legal syntax, and it's also clunky.


Although
  a, b, StopIterating = iterable
is currently legal syntax, I believe that no-one has ever used it in Python
before today. My evidence is this search, which gives 25 pages.
https://www.google.com/search?q=%22stopiterating%22+python&nfpr=1

These pages found by this search do match "StopIterating", but do not
provide an example of their use in Python.
https://stackoverflow.com/questions/19892204/send-method-using-generator-still-trying-to-understand-the-send-method-and
https://julia-users.narkive.com/aD1Uin0y/implementing-an-iterator-which-conditionally-skips-elements

The following are copies of the stackoverflow page.
https://mlink.in/qa/?qa=810675/
https://www.796t.com/post/MmFubjI=.html
https://qa.1r1g.com/sf/ask/1392454311/

-- 
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QHQ2JKYWOF7IMN3QBQXKSR3AU74HC3FL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Chris Angelico
On Mon, 20 Jun 2022 at 21:19, Steven D'Aprano  wrote:
> > (4) The guarantee that a late-bound default WILL be executed at
> > function call time, can be useful, even essential (it could be
> > time-dependent or it could depend on the values - default or otherwise -
> > of other parameters whose values might be changed in the function
> > body).
>
> Okay. But a generalised lazy evaluation mechanism can be used to
> implement PEP 671 style evaluation.
>
> Let me see if I can give a good analogy... generalised lazy evaluation
> is like having a car that can drive anywhere there is a road, at any
> time of the day or night. Late-bound defaults is like having a car that
> can only drive to the local mall and back, and only on Thursdays.
>
> That's okay if you want to drive to the local mall on Thursdays, but if
> you could only have one option, which would be more useful?
>

Nice analogy. It doesn't hold up.

Consider this function:

def f(stuff, max=>len(stuff)):
    stuff.append(1)
    print(max)

f([1,2,3])

How would you use lazy evaluation to *guarantee* the behaviour here?

The only way I can imagine doing it is basically the same as I'm
doing: that late-bound argument defaults *have special syntax and
meaning to the compiler*. If they were implemented with some sort of
lazy evaluation object, they would need (a) access to the execution
context, so you can't just use a function; (b) guaranteed evaluation
on function entry, regardless of when - if ever - it gets referred to;
and (c) the ability to put it in the function header. The only one of
those that overlaps with lazy evaluation is (c).
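For contrast, today's sentinel idiom — the workaround the PEP aims to replace — reproduces that guaranteed-on-entry behaviour (`_MISSING` is a name made up for illustration):

```python
_MISSING = object()  # hypothetical module-private sentinel

def f(stuff, max=_MISSING):
    # Compute the default on function entry, before the body mutates
    # 'stuff' -- the guarantee the "max=>len(stuff)" spelling provides.
    if max is _MISSING:
        max = len(stuff)
    stuff.append(1)
    return max

result = f([1, 2, 3])  # default bound before the append, so result is 3
```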

Please stop arguing this point. It is a false analogy and until you
can demonstrate *with code* that there is value in doing it, it is a
massive red herring.

Even if Python does later on grow a generalized lazy evaluation
feature, it will only change the *implementation* of late-bound
argument defaults, not their specification.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2KA7SZAEWNH55HNEMK6H3UHWC5TZTS2Y/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Chris Angelico
On Mon, 20 Jun 2022 at 21:11, Jonathan Fine  wrote:
>
> Hi
>
> Some have liked adding a new syntax
> a, b, ... = iterable
> to mean consume two items from the iterable. However,
>a, b, Ellipsis = iterable
> has a different meaning (at least in Python 3.8). It redefines Ellipsis. (As 
> an explicit constant, '...' can be redefined.)

To clarify: The syntactic token '...' will always refer to the special
object Ellipsis (at least back as far as Python 3.4 - can't remember
when it became available in all contexts), but the name Ellipsis can
be rebound. So even though, in many contexts, "x = Ellipsis" and "x =
..." will have the same net effect, they are distinct (one is a name
lookup and the other is a constant), and they're definitely different
in assignment.

(Though it wouldn't surprise me if a future Python release adds
Ellipsis to the set of non-assignable names, with None/True/False.)

> The syntax
>   a, b, ... = iterable
> so to speak fills a gap in existing syntax, as the construct is at present 
> invalid. I actually like gaps in syntax, for the same reason that I like a 
> central reservation in a highway. The same goes for the hard shoulder / 
> breakdown lane.
>
> The title of this thread includes the phrase 'Stop Iterating' (capitals 
> added). This suggests the syntax
>   a, b, StopIterating = iterable
> where StopIterating is a new keyword that can be used only in this context.
>
> I'd like to know what others think about this suggestion.
>

Hard no. That is currently-legal syntax, and it's also clunky. I'd
much rather the "assign to ..." notation than a weird new soft keyword
that people are going to think is a typo for StopIteration.

It's worth noting that the proposed syntax has a slight distinction
from the normal asterisk notation, in that it makes perfect sense to
write this:

a, *_, b = thing

but does not make sense to write this:

a, ..., b = thing

as the "don't iterate over this thing" concept doesn't work here.
(Supporting this would require some way to reverse the iterator, and
that's not a language guarantee.)

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SYZK4GBLMCKWQK4Z5OEJWGYORBWPBU6K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Chris Angelico
On Mon, 20 Jun 2022 at 21:11, Mathew Elman  wrote:
>
> This I like - it seems very intuitive, almost like an irreversible io stream.
>
> I don't know if there would be cases where this would lead to unexpected 
> bugs, but without looking into it it seems nice.
>
> Question: What would be the natural behaviour for negative indices? Raising 
> an error?

Please quote the person and text that you're responding to (and then
add your response underneath). Otherwise we have to guess which
(sub)proposal it is that you like.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NAW6SYGL4IAV5YSJAWHIKRG7WOLNGHUW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 02:21:16AM +0100, Rob Cliffe via Python-ideas wrote:

> Sorry, but I think all this talk about lazy evaluation is a big red herring:
>     (1) Python is not Haskell or Dask.

Python is not Haskell, but we stole list comprehensions and pattern 
matching from it. Python steals concepts from many languages.

And Python might not be Dask, but Dask is Python.

https://www.dask.org/


>     (2) Lazy evaluation is something Python doesn't have,

Python has lazily evaluated sequences (potentially infinite sequences) 
via generators and iterators. We also have short-circuit evaluation, 
which is a form of lazy evaluation. There may be other examples as well.

We may also get lazy importing soon:

https://peps.python.org/pep-0690/

At last one of Python's direct competitors in the scientific community, 
R, has lazy evaluation built in.


> and would be 
> a HUGE amount of work for Chris (or anyone) to implement

I don't know how hard it is to implement lazy evaluation, but speaking 
with the confidence of the ignorant, I expect not that hard if you don't 
care too much about making it super efficient. A lazy expression, or 
thunk, is basically just a zero-argument function that the interpreter 
knows to call.

If you don't care about getting Haskell levels of efficiency, that's 
probably pretty simple to implement.
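A minimal sketch of such a thunk, assuming the memoizing zero-argument-function model described above:

```python
class Thunk:
    """Minimal sketch: a zero-argument function the consumer knows to
    call, with the result cached after the first forcing."""
    _UNSET = object()

    def __init__(self, fn):
        self._fn = fn
        self._result = Thunk._UNSET

    def force(self):
        # Evaluate at most once; later calls return the cached result.
        if self._result is Thunk._UNSET:
            self._result = self._fn()
            self._fn = None  # drop the closure once evaluated
        return self._result

calls = []
t = Thunk(lambda: calls.append(1) or 6 * 7)
value = t.force()
value2 = t.force()  # cached: the lambda runs only once
```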

Rewriting Python from the ground up to be completely lazy like Haskell 
would be a huge amount of work. Adding some sort of optional and 
explicit laziness, like R and F# and other languages use, would possibly 
be little more work than just adding late-bound defaults.

Maybe.


> And in the unlikely event 
> that Chris (or someone) DID implement it, I expect there would be a 
> chorus of "No, no, that's not how (I think) it should work at all".

The idea is that you plan your feature's semantics before writing an 
implementation. Even if you plan to "write one to throw away", and do 
exploratory coding, you should still have at least a vague idea of the 
desired semantics before you write a single line of code.


>     (3)  Late-bound defaults that are evaluated at function call time, 
> as per PEP 671, give you an easy way of doing something that at present 
> needs one of a number of workarounds (such as using sentinel values) all 
> of which have their drawbacks or awkward points.

Yes, we've read the PEP thank you :-)

Late-bound defaults also have their own drawbacks. It is not a question 
of whether this PEP has any advantages. It clearly does! The question is 
where the balance of pros versus cons falls.


>     (4) The guarantee that a late-bound default WILL be executed at 
> function call time, can be useful, even essential (it could be 
> time-dependent or it could depend on the values - default or otherwise - 
> of other parameters whose values might be changed in the function 
> body).

Okay. But a generalised lazy evaluation mechanism can be used to 
implement PEP 671 style evaluation.

Let me see if I can give a good analogy... generalised lazy evaluation 
is like having a car that can drive anywhere there is a road, at any 
time of the day or night. Late-bound defaults is like having a car that 
can only drive to the local mall and back, and only on Thursdays.

That's okay if you want to drive to the local mall on Thursdays, but if 
you could only have one option, which would be more useful?

> Sure, I appreciate that there are times when you might want to 
> defer the evaluation because it is expensive and might not be needed, but:
>     (5) If you really want deferred evaluation of a parameter default, 
> you can achieve that by explicitly evaluating it, *at the point you want 
> it*, in the function body.  Explicit is better than implicit.

That's not really how lazy evaluation works or why people want it.

The point of lazy evaluation is that computations are transparently and 
automatically delayed until you actually need them. Lazy evaluation is 
kind of doing the same thing for CPUs as garbage collection does for 
memory. GC kinda sorta lets you pretend you have infinite memory (so 
long as you don't actually try to use it all at once...). Lazy 
evaluation kinda sorta lets you pretend your CPU is infinitely fast (so 
long as you don't try to actually do too much all at once).

If you think about the differences between generators and lists, that 
might help. A generator isn't really like a list that you just evaluate 
a few lines later. It's a completely different way of thinking about 
code, and often (but not always) better.


> IMO lazy evaluation IS a different, orthogonal proposal.

Late-bound defaults is a very small subset of lazy evaluation.

But yes, lazy evaluation is a different, bigger concept.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived 

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Jonathan Fine
Hi

Some have liked adding a new syntax
a, b, ... = iterable
to mean consume two items from the iterable. However,
   a, b, Ellipsis = iterable
has a different meaning (at least in Python 3.8). It redefines Ellipsis.
(As an explicit constant, '...' can be redefined.)

The syntax
  a, b, ... = iterable
so to speak fills a gap in existing syntax, as the construct is at present
invalid. I actually like gaps in syntax, for the same reason that I like a
central reservation in a highway. The same goes for the hard shoulder /
breakdown lane.

The title of this thread includes the phrase 'Stop Iterating' (capitals
added). This suggests the syntax
  a, b, StopIterating = iterable
where StopIterating is a new keyword that can be used only in this context.

I'd like to know what others think about this suggestion.

-- 
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OJXWLMYQX4LX2GOMVHFPWZYFQECAKSS2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Mathew Elman
This I like - it seems very intuitive, almost like an irreversible io stream.

I don't know if there would be cases where this would lead to unexpected bugs, 
but without looking into it it seems nice. 

Question: What would be the natural behaviour for negative indices? Raising an 
error?
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BNVFN7B5TYH6RTYD3GZI7AFIJRJVX4I5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Paul Moore
On Mon, 20 Jun 2022 at 11:08, Steven D'Aprano  wrote:

> But that's basically islice. So:
>
> # Its okay to put reusable helper functions in a module.
> # Not everything has to be syntax.
> first, second, third = itertools.islice(items, 3)
>
> I think that we have a working solution for this problem; the only
> argument is whether or not that problem is common enough, or special
> enough, or the solution clunky enough, to justify a syntax solution.

I think there's a lot of people (I'm not one of them) who prefer
working with syntax rather than functions for "basic operations". Of
course, what's "basic" is up for debate, but Lucas Wiman commented
earlier "I tend to like syntax over methods for handling basic data
types", and while I don't necessarily agree, I can see how people
gravitate towards asking for syntax when built in data types are
involved.

In this case, there's also the need to explicitly state the count,
which can be inferred from the LHS when using syntax, but not in a
function call. And the (perceived or real?) performance issue with
"function calls are slow".

Ultimately, this type of proposal is mostly decided by a judgement on
"what do we want the language to look like", which attracts subjective
comments like "Python isn't Perl", or "it's a natural extension of
existing syntax", or "it's more readable". But no-one here has the
authority to declare what is or is not "Pythonic" - that authority is
with the steering council. So we do our best to reach some sort of
group consensus, and dump the hard questions on the SC (via a PEP).

My sense is that a lot more people are coming to Python these days
with an expectation that syntax-based solutions are OK, and the "old
guard" (like myself!) are pushing more for the "not everything has to
be syntax" arguments. Maybe I'm not sufficiently self-aware, and when
I was newer to Python I too liked the idea of adding syntax more. I
honestly can't remember (I did love list comprehensions when they were
added, so I clearly wasn't always against syntax!). But I do think
that the broad question of "should Python have more complex syntax" is
probably a more fundamental debate that we won't resolve here.

For the record, I think the islice solution is sufficient for this
case. But I have needed this sort of thing occasionally, and islice
didn't immediately come to mind - so I have sympathy with the
discoverability argument. If a syntax like "a, b, *... =
some_iterator" existed, I suspect I'd use it. But picking a syntax
that *didn't* mislead me into assuming the iterator was fully consumed
would be hard - I thought *... was OK, but writing it just now I
realised I had to remind myself that it didn't consume everything, to
the point where I'd probably add a comment if I was writing the code.

Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MQDFG66K4FA3TBOQ32N2WESYGLGXXQTN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 11:03:45PM -0700, Jeremiah Paige wrote:

> What if next grew a new argument? Changing the signature of a builtin is a
> big change, but surely not bigger than new syntax? If we could ask for the
> number  of items returned the original example might look like
> 
> >>> first, second = next(iter(items), count=2)

There are times where "Not everything needs to be a one liner" applies.

# You can skip the first line if you know items is already an iterator.
it = iter(items)
first, second, third = (next(it) for i in range(3))

That's crying out to be made into a helper function. Otherwise our 
one-liner is:

# It's okay to hate me for this :-)
first, second, third = (
    lambda obj: (it := iter(obj)) and (next(it) for i in range(3))
)(items)

But that's basically islice. So:

# It's okay to put reusable helper functions in a module.
# Not everything has to be syntax.
first, second, third = itertools.islice(items, 3)

I think that we have a working solution for this problem; the only 
argument is whether or not that problem is common enough, or special 
enough, or the solution clunky enough, to justify a syntax solution.
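Packaged as the kind of reusable helper suggested above (`take` is a hypothetical name), with a length check so short input fails loudly instead of mis-unpacking:

```python
from itertools import islice

def take(iterable, n):
    # Wrap the islice idiom: pull exactly n items, leave the rest unread.
    items = tuple(islice(iterable, n))
    if len(items) != n:
        raise ValueError(f"expected {n} items, got {len(items)}")
    return items

it = iter(range(10))
first, second, third = take(it, 3)
# The remainder of the iterator is untouched: next(it) would yield 3.
```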


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3G4VAU3UWM7OXUO2VRYM2I2BKZBIHBIW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Lucas Wiman
Some background. PEP 3132 (https://peps.python.org/pep-3132/) lists the
following:
>
> Possible changes discussed were:
>
>
>- Only allow a starred expression as the last item in the exprlist.
>This would simplify the unpacking code a bit and allow for the starred
>expression to be assigned an iterator. This behavior was rejected because
>it would be too surprising.
>
> This seems to be a reference to this message:
https://mail.python.org/pipermail/python-3000/2007-May/007299.html

Guido van Rossum said the following (
https://mail.python.org/pipermail/python-3000/2007-May/007378.html):

> The important use case in Python for the proposed semantics is when
> you have a variable-length record, the first few items of which are
> interesting, and the rest of which is less so, but not unimportant.
> (If you wanted to throw the rest away, you'd just write a, b, c =
> x[:3] instead of a, b, c, *d = x.)
>

There was also discussion about retaining the type of the object on the
RHS, e.g.:
c, *rest = "chair"  # c="c", rest="hair"
it = iter(range(10))
x, *rest = it  # x=0, rest is a reference to `it`

That proposal was rejected because the types were too confusing, e.g.:
header, *lines = open("some-file", "r")  # lines is an iterator
header, *lines, footer = open("some-file", "r")  # lines is a list

van Rossum later said (
https://mail.python.org/pipermail/python-3000/2007-May/007391.html):

> From an implementation POV, if you have an unknown object on
> the RHS, you have to try slicing it before you try iterating over it;
> this may cause problems e.g. if the object happens to be a defaultdict
> -- since x[3:] is implemented as x[slice(None, 3, None)], the
> defaultdict will give you its default value. I'd much rather define
> this in terms of iterating over the object until it is exhausted,
> which can be optimized for certain known types like lists and tuples.
>

It seems like these objections don't apply in this case, if we define a
syntax that explicitly says not to assign anything. There is no
inconsistency in the types there. E.g. in the proposal here:

header, *... = open("some-file", "r")
header, *..., footer = open("some-file", "r")

It's clear that to compute what the footer is, you would need to iterate
over the whole file, whereas you don't in the first one.
So historically, the idea here was discussed and rejected, but for a reason
which does not apply in this case.
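The asymmetry can be made concrete with current tools (a sketch, not the proposed syntax):

```python
from collections import deque

lines = iter(["header\n", "a\n", "b\n", "footer\n"])

# "header, *... = f" needs only one item; the rest stays unread:
header = next(lines)

# "header, *..., footer = f" must drain the whole iterator to find the end:
footer = deque(lines, maxlen=1)[0]
```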

===

Regarding utility, there are many somewhat ugly ways of doing this with
method calls, especially from itertools. I tend to prefer syntax over
methods for handling basic data types. This is partly because it's more
readable: almost any method which takes more than one positional argument
introduces cognitive load, because you have to remember what the order of
the arguments is and what they mean. You can add keyword arguments to
improve readability, but then it's more characters and you have to remember
the name or have it autocompleted. So if there is a simple way to support a
use case with built-in syntax, it can improve the utility of the language.

Like honestly, does anyone remember the arguments to `islice`? I'm fairly
sure I've had to look it up every single time I've ever used it. For
iterator-heavy code, this might be multiple times on the same day. For the
`next(iterator, [default], count=1)` proposal, it's very easy to write
incorrect code that might look correct, e.g. `next(iterator, 3)`. Does 3
refer to the count or the default? If you've written Python for years, it's
clear, but less clear to a novice.
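
To make that concrete, `islice` reuses one positional slot for two
different meanings, which is exactly the kind of call that is easy to
misread:

```python
from itertools import islice

data = range(10)

# With two arguments, the second is *stop*:
print(list(islice(data, 3)))     # [0, 1, 2]

# With three arguments, the second becomes *start* and the third *stop*:
print(list(islice(data, 3, 6)))  # [3, 4, 5]
```

The same literal `3` means "stop" in one call and "start" in the other.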

There are efficiency arguments too: method calls are expensive, whereas
bytecode calls can be much more optimized. If you're already using
iterators, efficiency is probably relevant:
>>> import dis
>>> from itertools import islice
>>> def first_two_islice(it):
...     return tuple(islice(it, 2))
...
>>> def first_two_destructuring(it):
...     x, y, *rest = it
...     return x, y
...
>>> dis.dis(first_two_islice)
  2           0 LOAD_GLOBAL              0 (tuple)
              2 LOAD_GLOBAL              1 (islice)
              4 LOAD_FAST                0 (it)
              6 LOAD_CONST               1 (2)
              8 CALL_FUNCTION            2
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
>>> dis.dis(first_two_destructuring)
  2           0 LOAD_FAST                0 (it)
              2 UNPACK_EX                2
              4 STORE_FAST               1 (x)
              6 STORE_FAST               2 (y)
              8 STORE_FAST               3 (rest)

  3          10 LOAD_FAST                1 (x)
             12 LOAD_FAST                2 (y)
             14 BUILD_TUPLE              2
             16 RETURN_VALUE

The latter requires no expensive CALL_FUNCTION operations, though it does
currently allocate rest pointlessly.
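
That pointless allocation is easy to demonstrate: star-unpacking eagerly
drains the iterator into a list even when `rest` is never used afterwards:

```python
it = iter(range(1_000_000))
x, y, *rest = it              # consumes the entire iterator
print(len(rest))              # a ~million-element list was built
print(next(it, "exhausted"))  # nothing is left to iterate
```

A `*...` form could skip both the list and the exhaustion, leaving the
iterator usable after the assignment.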

Personally, I think the main use case would be for handling large lists in
a memory efficient and readable manner. Currently using *_ mean

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Stephen J. Turnbull
@Chris

My bottom line, as I wrote before, is that even if this were
introduced, I probably will continue to default to

def foo(arg=None):
    if arg is None:
        arg = default

in my own code until I start seeing "def foo(arg=>default)" in a lot
of code I read.  Since Mailman generally supports about 4 Python
versions, that means I won't see it in Mailman until 2027 or so.

But I'm not George Bush to say "Read my lips: no new (syn)taxes!"

Unless somebody comes up with some new really interesting use case, I
think the suggestion somebody (sorry to somebody!) made earlier to
"Just Do It" and submit to the SC is the right one.  Both David and I
are convinced that there is value-added in late binding for new
mutables and defaults that are computed from actual arguments, even if
we're not convinced it's enough.  The proposal has plenty of fans, who
*are* convinced and *will* use it.  I don't see a prospect for that
new really interesting use case, at least not here on Python-Ideas;
the discussion is just variations on the same themes.  On the other
hand, a PEP under consideration may get a little more interest from
the Python-Dev crowd, and obviously the SC itself.  They may have use
cases or other improvements to offer.

"Now is better than never."  The SC will let you know if the companion
koan is applicable. ;-)

@Chris You may or may not want to read my variations on the themes. ;-)

Chris Angelico writes:

 > > In the numeric stuff, if I have:
 > >
 > > newarray = (A @ B) | (C / D) + (E - F)
 > >
 > >
 > > That's @, |, /, +, and -.  So 5 operators, and 25 "complexity
 > > points".  If I added one more operator, 36 "complexity points"
 > > seems reasonable.  And if I removed one of those operators, 16
 > > "complexity points" feels about right.
 > 
 > For my part, I would say that it's quite the opposite. This is three
 > parenthesized tokens, each of which contains two things combined in a
 > particular way. That's six 'things' combined in particular ways.
 > Cognitive load is very close to this version:
 > 
 > newarray = (A * B) + (C * D) + (E * F)

I don't have the studies offhand, but "7 plus or minus 2" is famous
enough, google that and you'll find plenty.  I'll bet you even find
"cognitive complexity of mathematical formulae" in the education
literature.  (And if not, we should sue all the Departments of
Education in the world for fraud. ;-)

I do have the words: "this is a sum of binary products".  This
basically reduces the cognitive complexity to two concepts plus a scan
of the list of variables.  Given that they're actually in alphabetical
order, "first variable is A" is enough to reproduce the expression.

That's much simpler than trying to describe David's 5-operator case
with any degree of specificity.  Or even just try to reproduce his
formula without a lot of effort to memorize it!  Also, just from the
regularity of the form and its expression as an algebraic formula, I
can deduce that almost certainly A, C, and E have the same type, and
B, D, and F have the same type, and very likely those two types are
the same.  Not so for the five-operator case, where I would be
surprised if fewer than three types were involved.

Of course, this type information is probably redundant.  I probably
remember not only the types, but lots of other attributes of A
through F.  But this kind of redundancy is good!  It reinforces my
understanding of the expression and the program that surrounds it.

 > even though this uses a mere two operators. It's slightly more, but
 > not multiplicatively so. (The exact number of "complexity points" will
 > depend on what A through F represent, but the difference between "all
 > multiplying and adding" and "five distinct operators" is only about
 > three points.)

That may be true for you, but it's definitely not true for my
economics graduate students.

 > > Sure, knowing what `hi` defaults to *could be useful*.  I'm sure
 > > if I used that function I would often want to know... and also
 > > often just assume the default is "something sensible."  I just
 > > don't think that "could be useful" as a benefit is nearly as
 > > valuable as the cost of a new sigil and a new semantics adding to
 > > the cognitive load of Python.
 > 
 > Yes, but "something sensible" could be "len(stuff)", "len(stuff)-1",
 > or various other things. Knowing exactly which of those will tell you
 > exactly how to use the function.

@David:  I find the "hi=len(stuff)" along with the "lst=[]" examples
fairly persuasive (maybe moves me to +/- 0).
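
For reference, the idiom under discussion: a late-bound `hi=len(a)`
default is emulated today with a None sentinel, as in this sketch of a
bisection routine (illustrative, not the actual stdlib code):

```python
def bisect_right(a, x, lo=0, hi=None):
    # Late-bound default emulated with a sentinel: hi effectively
    # defaults to len(a), evaluated at call time.
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

print(bisect_right([1, 3, 5, 7], 4))  # 2
```

Under the PEP this signature would instead read
`def bisect_right(a, x, lo=0, hi=>len(a)):`, making the real default
visible in the header.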

@Chris:  It would be a lot more persuasive if you had a plausible
explicit list of "various other things".  Even "len(stuff) - 1" is
kind of implausible, given Python's consistent 0-based indexing
and closed-open ranges (yeah, I know some people like to use the
largest value in the range rather than the least upper bound not
in the range, but I consider that bad style in Python, and they
denote the same semantics).  And