Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Chris Angelico
On Thu, Apr 12, 2018 at 11:59 AM, Steven D'Aprano  wrote:
> On Thu, Apr 12, 2018 at 12:09:38AM +1000, Chris Angelico wrote:
>
> [...]
>> >> Consider a very common use-case: an OS-provided
>> >> Python interpreter whose files are all owned by 'root'. Those will be
>> >> distributed with .pyc files for performance, but you don't want to
>> >> deprive the users of help() and anything else that needs docstrings
>> >> etc. So... are the docstrings lazily loaded or eagerly loaded?
>> >
>> > What relevance is that they're owned by root?
>>
>> You have to predict in advance what you'll want to have in your pyc
>> files. Can't create them on the fly.
>
> How is that different from the situation right now?

If the files aren't owned by root (more specifically, if they're owned
by you, and you can write to the pycache directory), you can do
everything at runtime. Otherwise, you have to do everything at
installation time.

>> > What semantic change do you expect?
>> >
>> > There's an implementation change, of course, but that's Serhiy's problem
>> > to deal with and I'm sure that he has considered that. There should be
>> > no semantic change. When you access obj.__doc__, then and only then are
>> > the compiled docstrings for that module read from the disk.
>>
>> In other words, attempting to access obj.__doc__ can actually go and
>> open a file. Does it need to check if the file exists as part of the
>> import, or does it go back to sys.path?
>
> That's implementation, so I don't know, but I imagine that the module
> object will have a link pointing directly to the expected file on disk.
> No need to search the path, you just go directly to the expected file.
> Apart from handling the case when it doesn't exist, in which case the
> docstring or annotations get set to None, it should be relatively
> straight-forward.
>
> That link could be an explicit pathname:
>
> /path/to/__pycache__/foo.cpython-33-doc.pyc
>
> or it could be implicitly built when required from the "master" .pyc
> file's path, since the differences are likely to be deterministic.

Referencing a path name requires that each directory in it be opened.
Checking to see if the file exists requires, at absolute best, one
more stat call, and that's assuming you have an open handle to the
directory.

>> If the former, you're right
>> back with the eager loading problem of needing to do 2-4 times as many
>> stat calls;
>
> Except that's not eager loading. When you open the file on demand, it
> might never be opened at all. If it is opened, it is likely to be a long
> time after interpreter startup.

I have no idea what you mean here. Eager loading != opening the file
on demand. Eager statting != opening on demand. If you're not going to
hold open handles to heaps of directories, you have to reference
everything by path name.

>> > As for the in-memory data structures of objects themselves, I imagine
>> > something like the __doc__ and __annotations__ slots pointing to a table
>> > of strings, which is not initialised until you attempt to read from the
>> > table. Or something -- don't pay too much attention to my wild guesses.
>> >
>> > The bottom line is, is there some reason *aside from performance* to
>> > avoid this? Because if the performance is worse, I'm sure Serhiy will be
>> > the first to dump this idea.
>>
>> Obviously it could be turned into just a performance question, but in
>> that case everything has to be preloaded
>
> You don't need to preload things to get a performance benefit.
> Preloading things that you don't need immediately and may never need at
> all, like docstrings, annotations and line numbers, is inefficient.

Right, and if you DON'T preload everything, you have a potential
semantic difference. Which is exactly what you were asking me, and I
was answering.

> So let's look at a few common scenarios:
>
>
> 1. You run a script. Let's say that the script ends up loading, directly
> or indirectly, 200 modules, none of which need docstrings or annotations
> during runtime, and the script runs to completion without needing to
> display a traceback. You save loading 200 sets of docstrings,
> annotations and line numbers ("metadata" for brevity) so overall the
> interpreter starts up quicker and the script runs faster.
>
>
> 2. You run the same script, but this time it raises an exception and
> displays a traceback. So now you have to load, let's say, 20 sets of
> line numbers, which is a bit slower, but that doesn't happen until the
> exception is raised and the traceback printed, which is already a slow
> and exceptional case so who cares if it takes an extra few milliseconds?
> It is still an overall win because of the 180 sets of metadata you
> didn't need to load.

Does this loading happen when the exception is constructed or when
it's printed? How much can you do with an exception without triggering
the loading of metadata? Is it now possible for the mere formatting of
a traceback to fail because of 

Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Steven D'Aprano
On Thu, Apr 12, 2018 at 12:09:38AM +1000, Chris Angelico wrote:

[...]
> >> Consider a very common use-case: an OS-provided
> >> Python interpreter whose files are all owned by 'root'. Those will be
> >> distributed with .pyc files for performance, but you don't want to
> >> deprive the users of help() and anything else that needs docstrings
> >> etc. So... are the docstrings lazily loaded or eagerly loaded?
> >
> > What relevance is that they're owned by root?
> 
> You have to predict in advance what you'll want to have in your pyc
> files. Can't create them on the fly.

How is that different from the situation right now?


> > What semantic change do you expect?
> >
> > There's an implementation change, of course, but that's Serhiy's problem
> > to deal with and I'm sure that he has considered that. There should be
> > no semantic change. When you access obj.__doc__, then and only then are
> > the compiled docstrings for that module read from the disk.
> 
> In other words, attempting to access obj.__doc__ can actually go and
> open a file. Does it need to check if the file exists as part of the
> import, or does it go back to sys.path? 

That's implementation, so I don't know, but I imagine that the module 
object will have a link pointing directly to the expected file on disk. 
No need to search the path, you just go directly to the expected file. 
Apart from handling the case when it doesn't exist, in which case the 
docstring or annotations get set to None, it should be relatively 
straight-forward.

That link could be an explicit pathname:

/path/to/__pycache__/foo.cpython-33-doc.pyc

or it could be implicitly built when required from the "master" .pyc 
file's path, since the differences are likely to be deterministic.
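
As a minimal sketch of the "implicitly built" option, assuming the docstring file differs from the master .pyc only by a "-doc" tag before the extension (the function name `doc_pyc_path` and that naming rule are illustrative assumptions, not anything Serhiy has proposed):

```python
from pathlib import PurePosixPath


def doc_pyc_path(master_pyc, tag="-doc"):
    """Hypothetical helper: derive the docstring-file path from the master .pyc path."""
    master = PurePosixPath(master_pyc)
    # foo.cpython-33.pyc -> foo.cpython-33-doc.pyc (same directory, no path search)
    return str(master.with_name(master.stem + tag + master.suffix))


print(doc_pyc_path("/path/to/__pycache__/foo.cpython-33.pyc"))
```

Building the name is pure string work; whether the file actually exists is only discovered when something tries to open it.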


> If the former, you're right
> back with the eager loading problem of needing to do 2-4 times as many
> stat calls;

Except that's not eager loading. When you open the file on demand, it 
might never be opened at all. If it is opened, it is likely to be a long 
time after interpreter startup.


> > As for the in-memory data structures of objects themselves, I imagine
> > something like the __doc__ and __annotations__ slots pointing to a table
> > of strings, which is not initialised until you attempt to read from the
> > table. Or something -- don't pay too much attention to my wild guesses.
> >
> > The bottom line is, is there some reason *aside from performance* to
> > avoid this? Because if the performance is worse, I'm sure Serhiy will be
> > the first to dump this idea.
> 
> Obviously it could be turned into just a performance question, but in
> that case everything has to be preloaded

You don't need to preload things to get a performance benefit. 
Preloading things that you don't need immediately and may never need at 
all, like docstrings, annotations and line numbers, is inefficient.

I fear that you have completely failed to understand the (potential) 
performance benefit here.

The point, or at least *a* point, of the exercise is to speed up 
interpreter startup by deferring some of the work until it is needed. 
When you defer work, the pluses are that it reduces startup time, and 
sometimes you can avoid doing it at all; the minus is that if you do end 
up needing to do it, you have to do a little bit extra.

So let's look at a few common scenarios:


1. You run a script. Let's say that the script ends up loading, directly 
or indirectly, 200 modules, none of which need docstrings or annotations 
during runtime, and the script runs to completion without needing to 
display a traceback. You save loading 200 sets of docstrings, 
annotations and line numbers ("metadata" for brevity) so overall the 
interpreter starts up quicker and the script runs faster.


2. You run the same script, but this time it raises an exception and 
displays a traceback. So now you have to load, let's say, 20 sets of 
line numbers, which is a bit slower, but that doesn't happen until the 
exception is raised and the traceback printed, which is already a slow 
and exceptional case so who cares if it takes an extra few milliseconds? 
It is still an overall win because of the 180 sets of metadata you 
didn't need to load.


3. You have a long-running server application which runs for days or 
weeks between restarts. Let's say it loads 1000 modules, so you get 
significant savings during start up (let's say, hypothetically shaving 
off 2 seconds from a 30 second start up time), but over the course of 
the week it ends up eventually loading all 1000 sets of metadata. Since 
that is deferred until needed, it doesn't happen all at once, but spread 
out a little bit at a time.

Overall, you end up doing four times as many file system operations, but 
since they're amortized over the entire week, not startup, it is still a 
win.

(And remember that this extra cost only applies the first time a 
module's metadata is needed. It isn't a cost you keep paying over and 
over again.)
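
The "pay once, on first use" behaviour described above can be sketched roughly as follows. This is only an illustration of deferred loading in general: `LazyMetadata` and `fake_disk_read` are made-up names, and the real mechanism (if any) would live inside the interpreter rather than in a Python class.

```python
class LazyMetadata:
    """Hypothetical holder that reads a module's metadata on first access only."""

    def __init__(self, loader):
        self._loader = loader  # callable standing in for "read the file from disk"
        self._table = None     # not initialised until the first read
        self.loads = 0         # how many times the loader has actually run

    def get(self, key):
        if self._table is None:
            self._table = self._loader()
            self.loads += 1
        # Entries that are missing come back as None, like an absent docstring.
        return self._table.get(key)


def fake_disk_read():
    # Stand-in for reading and unmarshalling a separate metadata file.
    return {"spam": "Return some spam."}


meta = LazyMetadata(fake_disk_read)
first = meta.get("spam")   # triggers the one and only load
second = meta.get("spam")  # served from the in-memory table
```

If `get` is never called, the loader never runs, which is the saving in scenario 1.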

We're (hopefully!) not 

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Thu, Apr 12, 2018 at 10:44 AM, Ethan Furman  wrote:
> On 04/11/2018 04:46 PM, Chris Angelico wrote:
>
>> For myself, I've been back and forth a bit about whether "as" or ":="
>> is the better option. Both of them have problems. Both of them create
>> edge cases that could cause problems. Since the problems caused by
>> ":=" are well known from other languages (and are less serious than
>> they would be if "=" were the operator), I'm pushing that form.
>> However, the 'as' syntax is a close contender (unlike most of the
>> other contenders), so if someone comes up with a strong argument in
>> its favour, I could switch.
>
>
> While I strongly prefer "as", if it can't be made to work in the grammar
> then that option is pretty much dead, isn't it?  In which case, I'll take
> ":=".
>

It can; and in fact, I have a branch where I had exactly that (with
the SLNB functionality as well):

https://github.com/Rosuav/cpython/tree/statement-local-variables

But it creates enough edge cases that I was swayed by the pro-:= lobby.

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Nathan Schneider
On Wed, Apr 11, 2018 at 1:49 PM, Brendan Barnwell 
wrote:

> On 2018-04-11 05:23, Clint Hepner wrote:
>
>> I find the assignments make it difficult to pick out what the final
>> expression looks like.
>>
>
> I strongly agree with this, and for me I think this is enough to
> push me to -1 on the whole proposal.  For me the classic example case is
> still the quadratic formula type of thing:
>
> x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2
>
> It just doesn't seem worth it to me to create an expression-level
> assignment unless it can make things like this not just less verbose but at
> the same time more readable.  I don't consider this more readable:
>
> x1, x2 = (-b + sqrt(D := b**2 - 4*a*c))/2, (-b - sqrt(D))/2
> 
>

I'd probably write this as:

x1, x2 = [(-b + s*sqrt(b**2 - 4*a*c))/(2*a) for s in (1,-1)]

Agreed that the PEP doesn't really help for this use case, but I don't
think it has to. The main use cases in the PEP seem compelling enough to me.

Nathan


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Thu, Apr 12, 2018 at 3:49 AM, Brendan Barnwell  wrote:
> On 2018-04-11 05:23, Clint Hepner wrote:
>>
>> I find the assignments make it difficult to pick out what the final
>> expression looks like.
>
>
> I strongly agree with this, and for me I think this is enough to
> push me to -1 on the whole proposal.  For me the classic example case is
> still the quadratic formula type of thing:
>
> x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2
>
> It just doesn't seem worth it to me to create an expression-level
> assignment unless it can make things like this not just less verbose but at
> the same time more readable.  I don't consider this more readable:
>
> x1, x2 = (-b + sqrt(D := b**2 - 4*a*c))/2, (-b - sqrt(D))/2
>
> . . . because having to put the assignment inline creates a visual
> asymmetry, when for me the entire goal of an expression-level statement is
> to make the symmetry between such things MORE obvious.  I want to be able to
> write:
>
> x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2 ...
>
> . . . where "..." stands for "the part of the expression where I define the
> variables I'm re-using in multiple places in the expression".

What if you want to use it THREE times?

roots = [((-b + sqrt(D))/2/a, (-b - sqrt(D))/2/a) for a,b,c in
triangles if (D := b**2 - 4*a*c) >= 0]

Now it's matching again, without any language changes. (I've
reinstated the omitted division by 'a', in case anyone's confused by
the translation. It has no bearing on the PEP discussion.) Same if
you're using an if statement.
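
For anyone who wants to try the comprehension above, here is a self-contained sketch. The `triangles` data is invented for illustration, and it needs an interpreter that implements the `:=` operator as proposed in the PEP.

```python
from math import sqrt

# Made-up (a, b, c) coefficient triples; the middle one has no real roots.
triangles = [(1, -3, 2), (1, 0, 1), (1, 2, 1)]

# D is bound once in the filter clause and reused twice in the result expression.
roots = [((-b + sqrt(D))/2/a, (-b - sqrt(D))/2/a)
         for a, b, c in triangles
         if (D := b**2 - 4*a*c) >= 0]

print(roots)
```

The filter clause runs before the result expression for each item, so `D` is always bound by the time the two `sqrt(D)` calls are evaluated.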

> The new proposal does at least have the advantage that it would help
> with things like this:
>
> while x := some_function_call():
>     # do stuff
>
> So maybe I'm -0.5 rather than -1.  But it's not just that this
> proposal "could be used to create ugly code".  It's that using it for
> expression-internal assignments WILL create ugly code, and there's no way to
> avoid it.  I just don't see how this proposal provides any way to make
> things like the quadratic formula example above MORE readable.

I don't think it's as terrible as you're saying. You've picked a
specific example that is ugly; okay. This new syntax is not meant to
*replace* normal assignment, but to complement it. There are times
when it's much better to use the existing syntax.

ChrisA


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Thu, Apr 12, 2018 at 1:22 AM, Nick Coghlan  wrote:
>> # Similar to the boolean 'or' but checking for None specifically
>> x = "default" if (eggs := spam().ham) is None else eggs
>>
>> # Even complex expressions can be built up piece by piece
>> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
>>
>
> Leading with these kinds of examples really doesn't help to sell the
> proposal, since they're hard to read, and don't offer much, if any,
> benefit over the status quo where assignments (and hence the order of
> operations) need to be spelled out as separate lines.
>
> Instead, I'd suggest going with the kinds of examples that folks
> tend to bring up when requesting this capability:

Cool, thanks. I've snagged these (and your other examples) and
basically tossed them into the PEP unchanged.

>> The name ``prefix`` is thus searched for at global scope, ignoring the class
>> name. Under the proposed semantics, this name will be eagerly bound, being
>> approximately equivalent to::
>>
>> class X:
>>     names = ["Fred", "Barney", "Joe"]
>>     prefix = "> "
>>     def <listcomp>(prefix=prefix):
>>         result = []
>>         for name in names:
>>             result.append(prefix + name)
>>         return result
>>     prefixed_names = <listcomp>()
>
> "names" would also be eagerly bound here.

Yep, that was a clerical error on my part, now corrected.

>> Frequently Raised Objections
>> ============================
>
> There needs to be a subsection here regarding the need to call `del`
> at class and module scope, just as there is for loop iteration
> variables at those scopes.

Hmm, I'm not sure I follow. Are you saying that this is an objection
to assignment expressions, or an objection to them not being
statement-local? If the latter, it's really more about "rejected
alternative proposals".

>> This could be used to create ugly code!
>> ---------------------------------------
>>
>> So can anything else.  This is a tool, and it is up to the programmer to use it
>> where it makes sense, and not use it where superior constructs can be used.
>
> This argument will be strengthened by making the examples used in the
> PEP itself more attractive, as well as proposing suitable additions to
> PEP 8, such as:
>
> 1. If either assignment statements or assignment expressions can be
> used, prefer statements
> 2. If using assignment expressions would lead to ambiguity about
> execution order, restructure to use statements instead

Fair enough. Also adding that chained assignment expressions should
generally be avoided.

Thanks for the recommendations!

ChrisA


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Brendan Barnwell

On 2018-04-11 11:05, David Mertz wrote:

How about this, Brendan?

_, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2

I'm not sure I love this, but I don't hate it.


That's clever, but why bother?  I can already do this with existing 
Python:

D = b**2 - 4*a*c
x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2

	If the new feature encourages people to do something like your example 
(or my earlier examples with the D definition inline in the expression 
for x1), then I'd consider that another mark against it.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 19:05, David Mertz  wrote:
> How about this, Brendan?
>
> _, x1, x2 = (D := b**2 - 4*a*c), (-b + sqrt(D))/2, (-b - sqrt(D))/2
>
> I'm not sure I love this, but I don't hate it.

Seriously, how is this in any way better than

D = b**2 - 4*a*c
x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2

?

There are good use cases for this feature, but this simply isn't one.

Paul


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Kirill Balunov
2018-04-11 18:01 GMT+03:00 Kirill Balunov :

>
>
> 2018-04-11 16:50 GMT+03:00 Chris Angelico :
>
>>
>> Can you give an example of how your syntax is superior to the more
>> general option of simply allowing "as" bindings in any location?
>>
>>
> This is not my syntax :) And not even my idea. I just do not understand,
> and am even a little skeptical about, allowing "as" bindings in **any
> location** with global scoping. All the examples in this thread and the
> previous ones, as well as almost all PEP's examples show how this feature
> will be useful in `if`, `while` statements and comprehension/generator
> expressions. And it excellently solves this problem.  This feature
> increases the capabilities of these statements and also positively affects
> the readability of the code and it seems to me that everyone understands
> what this means in this context without ambiguity in their meaning in
> `while` or `with` statements.
>
> The remaining examples (general ones) are far-fetched, and I do not have
> much desire to discuss them :)  These include:
>
> lambda: x := lambda y: z := a := 0
> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
> and others of these kind...
>
> Thus, I do not understand why we should solve such a general and complex
> problem when this syntax is convenient only in specific cases.  In addition,
> the concept of Statement-Local Name Bindings was discussed previously, which
> I basically like (and it fits the above idea).  In this version, it was
> abandoned completely, but it is unclear for what reasons.
>
> p.s.: Maybe someone has use-cases outside `if`, `while` and
> comprehensions, but so far no one has demonstrated them.
>
>
I see that what I wrote was very vague, so I'll reply to my own message to
add some specifics. In general, I think this feature is missing from the
language, and thank you for trying to fix that! In my opinion it is meaningful
only in certain constructs such as `while`, `if`, `elif` and maybe
comprehensions/generators. As a general form usable "anywhere" it can be
_useful_, but it makes the code unreadable and hard to follow while giving
little benefit. What I find nice to have:

Extend while statement syntax:

while (input("> ") as command) != "quit":
    print("You entered:", command)


Extend if statement syntax:



if re.search(pat, text) as match:
    print("Found:", match.group(0))

if (re.search(pat, text) as match) is not None:
    print("Found:", match.group(0))

`elif` clauses should be extended to support this as well.

Extend comprehensions syntax:

# Since comprehensions have an if clause
[y for x in data if (f(x) as y) is not None]

# Also this form without `if` clause

[(y, x/y) with f(x) as y for x in range(5)]

Extend ternary expression syntax:

data = y/x if (f(x) as y) > 0 else 0

I think that is all. And it seems to me that it covers 99% of all the
use-cases of this feature. In my own world I would like them to make a
local _statement_ binding (but this is certainly a very controversial
point). I even like that this syntax matches the `with` and `except`
statement syntax, although it has different semantics. But I do not think
that anyone will have trouble understanding this.

With kind regards,
-gdg


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Ethan Furman

On 04/10/2018 10:32 PM, Chris Angelico wrote:


Title: Assignment Expressions


Thank you, Chris, for doing all this!

---

Personally, I'm likely to only use this feature in `if` and `while` statements; if the syntax was easier to read inside 
longer expressions then I might use this elsewhere -- but as has been noted by others, the on-the-spot assignment 
creates asymmetries that further clutter the overall expression.


As Paul noted, I don't think parentheses should be mandatory if the parser 
itself does not require them.

For myself, I prefer the EXPR as NAME variant for two reasons:

- puts the calculation first, which is what we are used to seeing in if/while 
statements; and
- matches already existing expression-level assignments (context managers, 
try/except blocks)

+0.5 from me.

--
~Ethan~


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Brendan Barnwell

On 2018-04-11 05:23, Clint Hepner wrote:

I find the assignments make it difficult to pick out what the final expression 
looks like.


	I strongly agree with this, and for me I think this is enough to push 
me to -1 on the whole proposal.  For me the classic example case is 
still the quadratic formula type of thing:


x1, x2 = (-b + sqrt(b**2 - 4*a*c))/2, (-b - sqrt(b**2 - 4*a*c))/2

	It just doesn't seem worth it to me to create an expression-level 
assignment unless it can make things like this not just less verbose but 
at the same time more readable.  I don't consider this more readable:


x1, x2 = (-b + sqrt(D := b**2 - 4*a*c))/2, (-b - sqrt(D))/2

. . . because having to put the assignment inline creates a visual 
asymmetry, when for me the entire goal of an expression-level statement 
is to make the symmetry between such things MORE obvious.  I want to be 
able to write:


x1, x2 = (-b + sqrt(D))/2, (-b - sqrt(D))/2 ...

. . . where "..." stands for "the part of the expression where I define 
the variables I'm re-using in multiple places in the expression".


	The new proposal does at least have the advantage that it would help 
with things like this:


while x := some_function_call():
    # do stuff

	So maybe I'm -0.5 rather than -1.  But it's not just that this proposal 
"could be used to create ugly code".  It's that using it for 
expression-internal assignments WILL create ugly code, and there's no 
way to avoid it.  I just don't see how this proposal provides any way to 
make things like the quadratic formula example above MORE readable.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown


Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Terry Reedy

On 4/11/2018 4:26 AM, Petr Viktorin wrote:

Currently in Fedora, we ship *both* optimized and non-optimized pycs to 
make sure both -O and non--O will work nicely without root privileges. 
So splitting the docstrings into a separate file would be, for us, a 
benefit in terms of file size.


Currently, the Windows installer has an option to pre-compile stdlib 
modules.  (At least it does if one does an all-users installation.)  If 
one selects this, it creates normal, -O, and -OO versions of each. 
Since, like most people, I never run with -O or  -OO, replacing this 
redundancy with 1 segmented file or 2 non-redundant files might be a win 
for most people.
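
To see the redundancy being discussed, the standard library can report the cache file name used for each optimization level. This is a small sketch; the source path is hypothetical and the file does not need to exist.

```python
import importlib.util

source = "/usr/lib/python3/spam.py"  # hypothetical path, used only to build names

# "" is the normal .pyc, 1 corresponds to -O, and 2 to -OO.
names = [importlib.util.cache_from_source(source, optimization=opt)
         for opt in ("", 1, 2)]

for name in names:
    print(name)
```

Each level gets its own file, which is why a pre-compiled installation carries up to three copies of much the same bytecode per module.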


--
Terry Jan Reedy



Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

2018-04-11 Thread MRAB

On 2018-04-11 04:15, Mike Miller wrote:

If anyone is interested I came across this same subject on a blog post and
discussion on HN today:

- https://www.hillelwayne.com/post/equals-as-assignment/


It says "BCPL also introduced braces as a means of defining blocks.".

That bit is wrong, unless "braces" is being used as a generic term. BCPL 
used $( and $).



Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 16:22, Nick Coghlan  wrote:
> Similar to my suggestion above, you may also want to consider making
> this example a filtered comprehension in order to show the proposal in
> its best light:
>
> results = [(x, y, x/y) for x in input_data if (y := f(x) )]

Agreed, this is a *much* better motivating example.

>> This could be used to create ugly code!
>> ---------------------------------------
>>
>> So can anything else.  This is a tool, and it is up to the programmer to use it
>> where it makes sense, and not use it where superior constructs can be used.
>
> This argument will be strengthened by making the examples used in the
> PEP itself more attractive, as well as proposing suitable additions to
> PEP 8, such as:
>
> 1. If either assignment statements or assignment expressions can be
> used, prefer statements
> 2. If using assignment expressions would lead to ambiguity about
> execution order, restructure to use statements instead

+1 on explicitly suggesting additions to PEP 8. Bonus points for PEP 8
additions that can be automatically checked by linters/style checkers
(For example "avoid chained assignment expressions").

Paul


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Nick Coghlan
On 11 April 2018 at 15:32, Chris Angelico  wrote:
> Wholesale changes since the previous version. Statement-local name
> bindings have been dropped (I'm still keeping the idea in the back of
> my head; this PEP wasn't the first time I'd raised the concept), and
> we're now focusing primarily on assignment expressions, but also with
> consequent changes to comprehensions.

Thanks for putting this revised version together! You've already
incorporated my feedback on semantics, so my comments below are mostly
about the framing of the proposal in the context of the PEP itself.

> Syntax and semantics
> ====================
>
> In any context where arbitrary Python expressions can be used, a **named
> expression** can appear. This can be parenthesized for clarity, and is of
> the form ``(target := expr)`` where ``expr`` is any valid Python expression,
> and ``target`` is any valid assignment target.
>
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that the target is assigned
> that value.
>
> # Similar to the boolean 'or' but checking for None specifically
> x = "default" if (eggs := spam().ham) is None else eggs
>
> # Even complex expressions can be built up piece by piece
> y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
>

Leading with these kinds of examples really doesn't help to sell the
proposal, since they're hard to read, and don't offer much, if any,
benefit over the status quo where assignments (and hence the order of
operations) need to be spelled out as separate lines.

Instead, I'd suggest going with the kinds of examples that folks
tend to bring up when requesting this capability:

# Handle a matched regex
if (match := pattern.search(data)) is not None:
    ...

# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
    ...

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]

All three of those examples share the common characteristic that
there's no ambiguity about the order of operations, and the latter two
aren't amenable to simply being split out into separate assignment
statements due to the fact they're part of a loop.
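
The three patterns above can be exercised with a small runnable sketch. The sample data and the helpers `read_next_item` and `f` are invented stand-ins, and the code assumes an interpreter that implements `:=` as proposed.

```python
import re

# Handle a matched regex
pattern = re.compile(r"\d+")
data = "order 42 shipped"
if (match := pattern.search(data)) is not None:
    found = match.group(0)

# Loop until a sentinel value is returned
items = iter([3, 1, 4, None, 9])


def read_next_item():
    # Hypothetical reader: hands back the next item, or None when exhausted.
    return next(items, None)


seen = []
while (value := read_next_item()) is not None:
    seen.append(value)


# Share a subexpression between the filter clause and the output
def f(x):
    # Hypothetical transform: doubles odd numbers, rejects even ones.
    return x * 2 if x % 2 else None


filtered_data = [y for x in range(6) if (y := f(x)) is not None]
```

In each case the binding happens exactly where the value is tested, so there is only one possible order of evaluation.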

A good proposal should have readers nodding to themselves and thinking
"I could see myself using that construct, and being happy about doing
so", rather than going "Eugh, my eyes, what did I just read?" :)

> The name ``prefix`` is thus searched for at global scope, ignoring the class
> name. Under the proposed semantics, this name will be eagerly bound, being
> approximately equivalent to::
>
> class X:
>     names = ["Fred", "Barney", "Joe"]
>     prefix = "> "
>     def <listcomp>(prefix=prefix):
>         result = []
>         for name in names:
>             result.append(prefix + name)
>         return result
>     prefixed_names = <listcomp>()

"names" would also be eagerly bound here.

> Recommended use-cases
> =====================
>
> Simplifying list comprehensions
> -------------------------------
>
> These list comprehensions are all approximately equivalent::
>
>     # Calling the function twice
>     stuff = [[f(x), x/f(x)] for x in range(5)]
>
>     # External helper function
>     def pair(x, value): return [value, x/value]
>     stuff = [pair(x, f(x)) for x in range(5)]
>
>     # Inline helper function
>     stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
>
>     # Extra 'for' loop - potentially could be optimized internally
>     stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
>
>     # Iterating over a genexp
>     stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
>
>     # Expanding the comprehension into a loop
>     stuff = []
>     for x in range(5):
>         y = f(x)
>         stuff.append([y, x/y])
>
>     # Wrapping the loop in a generator function
>     def g():
>         for x in range(5):
>             y = f(x)
>             yield [y, x/y]
>     stuff = list(g())
>
>     # Using a mutable cache object (various forms possible)
>     c = {}
>     stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
>
>     # Using a temporary name
>     stuff = [[y := f(x), x/y] for x in range(5)]

The example using the PEP syntax could be listed first in its own
section, and then the others given as "These are the less obvious
alternatives that this new capability aims to displace".

Similar to my suggestion above, you may also want to consider making
this example a filtered comprehension in order to show the proposal in
its best light:

    results = [(x, y, x/y) for x in input_data if (y := f(x))]
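
A small sketch of why the filtered form shows the proposal at its best, assuming the `:=` semantics that later shipped in Python 3.8. The function `f`, its call counter, and `input_data` are hypothetical, chosen only so the number of calls can be observed.

```python
call_count = 0


def f(x):
    """Hypothetical expensive function; counts how often it is called."""
    global call_count
    call_count += 1
    return x - 2


input_data = [1, 2, 3, 4]

# Without a named expression, f(x) is spelled out (and evaluated) up to
# three times per item: once in the filter and twice in the output.
call_count = 0
repeated = [(x, f(x), x / f(x)) for x in input_data if f(x)]
calls_repeated = call_count

# With a named expression, f(x) is evaluated exactly once per item.
call_count = 0
results = [(x, y, x / y) for x in input_data if (y := f(x))]
calls_named = call_count
```

The truthiness filter also drops the item where `f(x)` is `0`, so the division in the output never sees a zero divisor.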

> Capturing condition values
> --------------------------
>
> Assignment expressions can be used to good effect in the header of
> an ``if`` or ``while`` statement::

Similar to the comprehension section, I think 

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Kirill Balunov
2018-04-11 16:50 GMT+03:00 Chris Angelico :

>
> Can you give an example of how your syntax is superior to the more
> general option of simply allowing "as" bindings in any location?
>
>
This is not my syntax :) And not even my idea. I just do not understand,
and am even a little skeptical about, allowing "as" bindings in **any
location** with global scoping. All the examples in this thread and the
previous ones, as well as almost all of the PEP's examples, show how this
feature will be useful in `if` and `while` statements and in
comprehension/generator expressions. And it solves this problem excellently.
This feature increases the capabilities of these statements and also
positively affects the readability of the code, and it seems to me that
everyone understands what this means in this context without ambiguity in
their meaning in `while` or `with` statements.

The remaining examples (general ones) are far-fetched, and I do not have
much desire to discuss them :)  These include:

    lambda: x := lambda y: z := a := 0
    y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

and others of this kind...

Thus, I do not understand why we should solve such a general and complex
problem when this syntax is convenient only in specific cases.  In addition,
the concept of Statement-Local Name Bindings was discussed previously, which
I basically like (and it fits the above idea).  In this version, it was
abandoned completely, but it is unclear for what reasons.

p.s.: Maybe someone has use-cases outside `if`, `while` and comprehensions,
but so far no one has demonstrated them.

With kind regards,
-gdg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin

2018-04-11 Thread Peter O'Connor
>
> It's worth adding a reminder here that "having more options on the
> market" is pretty directly in contradiction to the Zen of Python -
> "There should be one-- and preferably only one --obvious way to do
> it".


I've got to start minding my words more.  By "options on the market" I
meant it more in a "candidates for the job" sense.  As in, in the end we'd
select just one, which would in retrospect (or if you're Dutch) seem like
the obvious choice.  Not that "everyone who uses Python should have more
ways to do this".

My reason for starting this is that there isn't "one obvious way" to do
this type of operation now (as the diversity of the exponential-moving-average
"zoo" attests).

--

Let's look at a task where there is "one obvious way"

Suppose someone asks: "How can I build a list of squares of the first 100
odd numbers [1, 9, 25, 49, ...] in Python?"  The answer is now obvious -
few people would do this:

    list_of_odd_squares = []
    for i in range(100):
        list_of_odd_squares.append((i*2+1)**2)

or this:

    def iter_odd_squares(n):
        for i in range(n):
            yield (i*2+1)**2

    list_of_odd_squares = list(iter_odd_squares(100))

Because it's just more clean, compact, readable and "obvious" to do:

    list_of_odd_squares = [(i*2+1)**2 for i in range(100)]

Maybe I'm being presumptuous, but I think most Python users would agree.

---

Now let's switch our task to computing the exponential moving average of a
list.  This is a stand-in for a HUGE range of tasks that involve carrying
some state-variable forward while producing values.

Some would do this:

    smooth_signal = []
    average = 0
    for x in signal:
        average = (1-decay)*average + decay*x
        smooth_signal.append(average)

Some would do this:

    def moving_average(signal, decay, initial=0):
        average = initial
        for x in signal:
            average = (1-decay)*average + decay*x
            yield average

    smooth_signal = list(moving_average(signal, decay=decay))

Lovers of one-liners like Serhiy would do this:

    smooth_signal = [average for average in [0] for x in signal
                     for average in [(1-decay)*average + decay*x]]

Some would scoff at the cryptic one-liner and do this:

    def update_moving_average(avg, x, decay):
        return (1-decay)*avg + decay*x

    smooth_signal = list(itertools.accumulate(
        itertools.chain([0], signal),
        func=functools.partial(update_moving_average, decay=decay)))

And others would scoff at that and make a class, or use coroutines.

--

There've been many suggestions in this thread (all documented here:
https://github.com/petered/peters_example_code/blob/master/peters_example_code/ways_to_skin_a_cat.py)
and that's good, but it seems clear that people do not agree on an
"obvious" way to do things.

I claim that if

    smooth_signal = [average := (1-decay)*average + decay*x for x in signal
                     from average=0.]

were allowed, it would become the "obvious" way.

Chris Angelico's suggestions are close to this and have the benefit of
requiring no new syntax in a PEP 572 world:

    smooth_signal = [(average := (1-decay)*average + decay*x)
                     for average in [0] for x in signal]

or

    smooth_signal = [(average := (1-decay)*(average or 0) + decay*x)
                     for x in signal]

or

    average = 0
    smooth_signal = [(average := (1-decay)*average + decay*x)
                     for x in signal]

But they all have oddities that detract from their "obviousness" and the
oddities stem from there not being a built-in way to initialize.  In the
first, there is the odd "for average in [0]" initializer.  The second
relies on a hidden "average = None" which is not obvious at all, and the
third has the problem that the initial value is bound to the defining scope
instead of belonging to the generator.  All seem to have oddly redundant
brackets whose purpose is not obvious, but maybe there's a good reason for
that.
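
To make the comparison concrete, here is a sketch that wraps three of the spellings in functions and checks that they agree. It assumes Python 3.8 or later (for `:=` and for the `initial=` argument of `itertools.accumulate`); the function names and the sample signal are made up for this illustration.

```python
from itertools import accumulate


def ema_loop(signal, decay, initial=0.0):
    """Reference version: explicit loop carrying the running average."""
    smooth_signal = []
    average = initial
    for x in signal:
        average = (1 - decay) * average + decay * x
        smooth_signal.append(average)
    return smooth_signal


def ema_walrus(signal, decay, initial=0.0):
    """Comprehension version: := rebinds `average` in the enclosing function."""
    average = initial
    return [(average := (1 - decay) * average + decay * x) for x in signal]


def ema_accumulate(signal, decay, initial=0.0):
    """accumulate() version; the initial value is emitted first, so drop it."""
    def step(avg, x):
        return (1 - decay) * avg + decay * x
    return list(accumulate(signal, step, initial=initial))[1:]


signal = [1.0, 2.0, 3.0, 4.0]
smooth_signal = ema_walrus(signal, decay=0.5)
```

Putting the initial value inside a function keeps it out of the module namespace, which addresses the scoping complaint about the third form at the cost of a `def`.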

If people are happy with these solutions and still see no need for the
initialization syntax, we can stop this, but as I see it there is a "hole"
in the language that needs to be filled.

On Wed, Apr 11, 2018 at 3:55 AM, Paul Moore  wrote:

> On 11 April 2018 at 04:41, Steven D'Aprano  wrote:
> >> > But in a way that more intuitively expresses the intent of the code, it
> >> > would be great to have more options on the market.
> >>
> >> It's worth adding a reminder here that "having more options on the
> >> market" is pretty directly in contradiction to the Zen of Python -
> >> "There should be one-- and preferably only one --obvious way to do
> >> it".
> >
> > I'm afraid I'm going to (mildly) object here. At least you didn't
> > misquote the Zen as "Only One Way To Do It" :-)
> >
> > The Zen here is not a prohibition against there being multiple ways to
> > do something -- how could it, given that Python is a general purpose
> > 

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Thu, Apr 12, 2018 at 12:11 AM, Paul Moore  wrote:
> On 11 April 2018 at 14:54, Chris Angelico  wrote:
>> Sure, if you're just assigning zero to everything. But you could do
>> that with a statement. What about this:
>>
>> q = {
>>     lambda: x := lambda y: z := a := 0,
>> }
>>
>> Yes, it's an extreme example, but look at all those colons and tell me
>> if you can figure out what each one is doing.
>
>     lambda: x := (lambda y: (z := (a := 0)))
>
> As I say, it's the only *possible* parsing. It's ugly, and it
> absolutely should be parenthesised, but there's no need to make the
> parentheses mandatory. (And actually, it didn't take me long to add
> those parentheses, it's not *hard* to parse correctly - for a human).

Did you pick up on the fact that this was actually in a set? With very
small changes, such as misspelling "lambda" at the beginning, this
actually becomes a dict display. How much of the expression do you
need to see before you can be 100% sure of the parsing? Could you do
this if fed tokens one at a time, with permission to look no more than
one token ahead?

ChrisA


Re: [Python-ideas] Add more information in the header of pyc files

2018-04-11 Thread Nick Coghlan
On 11 April 2018 at 02:54, Antoine Pitrou  wrote:
> On Tue, 10 Apr 2018 19:29:18 +0300
> Serhiy Storchaka wrote:
>>
>> A bugfix release can fix bugs in bytecode generation. See for example
>> issue27286. [1]  The part of issue33041 backported to 3.7 and 3.6 is
>> another example. [2]  There were other examples of compatible changes to
>> the bytecode. Without bumping the magic number these fixes can just not have
>> any effect if existing pyc files were generated by older compilers. But
>> bumping the magic number in a bugfix release can lead to rebuilding
>> every pyc file (even unaffected by the fix) in distributions.
>
> Sure, but I don't think rebuilding every pyc file is a significant
> problem.  It's certainly less error-prone than cherry-picking which
> files need rebuilding.

And we need to handle the old bytecode format in the eval loop anyway,
or else we'd be breaking compatibility with bytecode-only files, as
well as introducing a significant performance regression for
non-writable bytecode caches (if we were to ignore them).

It's a subtle enough problem that I think the `compileall --force`
option is a safer way of handling it, even if it regenerates some pyc
files that could have been kept.

For the "stable file signature" aspect, does that need to be
specifically the first *four* bytes? One of the benefits of PEP 552
leaving those four bytes alone is that it meant that a lot of magic
number checking code didn't need to change. If the stable marker could
be placed later (e.g. after the PEP 552 header), then we'd similarly
have the benefit that code checking the PEP 552 headers wouldn't need
to change, at the expense of folks having to read 20 bytes to see the
new signature byte (which shouldn't be a problem, given that the `file`
utility defaults to reading up to 1 MiB from files it is trying to identify).
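
For reference, a sketch of reading the existing PEP 552 header, which is what any later signature would sit behind. It assumes Python 3.7 or later; `read_pyc_header` is a helper invented for this illustration, and the temporary `example.py` exists only to have something to compile.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Split the 16-byte PEP 552 header into its fields."""
    with open(pyc_path, "rb") as fh:
        raw = fh.read(16)
    magic = raw[:4]                           # version-specific magic number
    (flags,) = struct.unpack("<I", raw[4:8])  # PEP 552 bit field
    if flags & 0b1:
        # Hash-based pyc: the remaining 8 bytes are a hash of the source.
        return {"magic": magic, "flags": flags, "source_hash": raw[8:16]}
    # Timestamp-based pyc: source mtime and source size, 4 bytes each.
    mtime, size = struct.unpack("<II", raw[8:16])
    return {"magic": magic, "flags": flags, "mtime": mtime, "size": size}


with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "example.py")
    with open(source_path, "w") as fh:
        fh.write("x = 1\n")
    # Ask for a timestamp-based pyc explicitly so the layout is predictable.
    pyc_path = py_compile.compile(
        source_path,
        cfile=os.path.join(tmp, "example.pyc"),
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    header = read_pyc_header(pyc_path)
    source_size = os.path.getsize(source_path)
```

Code that only checks `header["magic"]` against `importlib.util.MAGIC_NUMBER` is the kind of magic number checking that PEP 552 left untouched.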

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 14:54, Chris Angelico  wrote:
> Sure, if you're just assigning zero to everything. But you could do
> that with a statement. What about this:
>
> q = {
>     lambda: x := lambda y: z := a := 0,
> }
>
> Yes, it's an extreme example, but look at all those colons and tell me
> if you can figure out what each one is doing.

    lambda: x := (lambda y: (z := (a := 0)))

As I say, it's the only *possible* parsing. It's ugly, and it
absolutely should be parenthesised, but there's no need to make the
parentheses mandatory. (And actually, it didn't take me long to add
those parentheses, it's not *hard* to parse correctly - for a human).

Paul


Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 4:06 PM, Steven D'Aprano  wrote:
> On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote:
>
> [...]
>> > Yes, it will double the number of files. Actually quadruple it, if the
>> > annotations and line numbers are in separate files too. But if most of
>> > those extra files never need to be opened, then there's no cost to them.
>> > And whatever extra cost there is, is amortized over the lifetime of the
>> > interpreter.
>>
>> Yes, if they are actually not needed. My question was about whether
>> that is truly valid.
>
> We're never really going to know the effect on performance without
> implementing and benchmarking the code. It might turn out that, to our
> surprise, three quarters of the std lib relies on loading docstrings
> during startup. But I doubt it.
>
>
>> Consider a very common use-case: an OS-provided
>> Python interpreter whose files are all owned by 'root'. Those will be
>> distributed with .pyc files for performance, but you don't want to
>> deprive the users of help() and anything else that needs docstrings
>> etc. So... are the docstrings lazily loaded or eagerly loaded?
>
> What relevance is that they're owned by root?

You have to predict in advance what you'll want to have in your pyc
files. Can't create them on the fly.

>> If eagerly, you've doubled the number of file-open calls to initialize
>> the interpreter.
>
> I do not understand why you think this is even an option. Has Serhiy
> said something that I missed that makes this seem to be on the table?
> That's not a rhetorical question -- I may have missed something. But I'm
> sure he understands that doubling or quadrupling the number of file
> operations during startup is not an optimization.
>
>
>> (Or quadrupled, if you need annotations and line
>> numbers and they're all separate.) If lazily, things are a lot more
>> complicated than the original description suggested, and there'd need
>> to be some semantic changes here.
>
> What semantic change do you expect?
>
> There's an implementation change, of course, but that's Serhiy's problem
> to deal with and I'm sure that he has considered that. There should be
> no semantic change. When you access obj.__doc__, then and only then are
> the compiled docstrings for that module read from the disk.

In other words, attempting to access obj.__doc__ can actually go and
open a file. Does it need to check if the file exists as part of the
import, or does it go back to sys.path? If the former, you're right
back with the eager loading problem of needing to do 2-4 times as many
stat calls; if the latter, it's semantically different in that a
change to sys.path can influence something that normally is preloaded.

> As for the in-memory data structures of objects themselves, I imagine
> something like the __doc__ and __annotations__ slots pointing to a table
> of strings, which is not initialised until you attempt to read from the
> table. Or something -- don't pay too much attention to my wild guesses.
>
> The bottom line is, is there some reason *aside from performance* to
> avoid this? Because if the performance is worse, I'm sure Serhiy will be
> the first to dump this idea.

Obviously it could be turned into just a performance question, but in
that case everything has to be preloaded, and I doubt there's going to
be any advantage. To be absolutely certain of retaining the existing
semantics, there'd need to be some sort of anchoring to ensure that
*this* .pyc file goes with *that* .pyc_docstrings file. Looking them
up anew will mean that there's every possibility that you get the
wrong file back.
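
A rough sketch of what such anchoring could look like, purely as an illustration. Nothing here reflects an actual CPython design: the `LazyDocstring` class, the `foo.docstrings` file name, and the use of a SHA-256 digest as the anchor are all assumptions made up for this example.

```python
import hashlib
import os
import tempfile

_UNSET = object()


class LazyDocstring:
    """Hypothetical lazy loader for a docstring stored in a separate file."""

    def __init__(self, sidecar_path, expected_digest):
        self.sidecar_path = sidecar_path        # fixed at import, no sys.path search
        self.expected_digest = expected_digest  # ties the side file to the main file
        self.file_opens = 0
        self._value = _UNSET

    def load(self):
        """Read the side file on first use and cache the result."""
        if self._value is _UNSET:
            self._value = self._read()
        return self._value

    def _read(self):
        self.file_opens += 1
        try:
            with open(self.sidecar_path, "rb") as fh:
                payload = fh.read()
        except FileNotFoundError:
            return None  # missing side file: behave as if there is no docstring
        if hashlib.sha256(payload).hexdigest() != self.expected_digest:
            return None  # side file changed since import: do not trust it
        return payload.decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    sidecar = os.path.join(tmp, "foo.docstrings")
    text = b"Return the frobnicated value."
    with open(sidecar, "wb") as fh:
        fh.write(text)
    digest = hashlib.sha256(text).hexdigest()

    doc = LazyDocstring(sidecar, digest)
    opens_before = doc.file_opens
    first = doc.load()
    second = doc.load()
    opens_after = doc.file_opens

    # Simulate an upgrade replacing the side file after "import".
    stale = LazyDocstring(sidecar, digest)
    with open(sidecar, "wb") as fh:
        fh.write(b"Text from a newer release.")
    stale_result = stale.load()
```

The digest check is what stops a replaced side file from being silently paired with an older main file; without it, `stale.load()` would return the newer text.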

As a simple example, upgrading your Python installation while you have
a Python script running can give you this effect already. Just import
a few modules, then change everything on disk. If you now import a
module that was already imported, you get it from cache (and the
unmodified version); import something that wasn't imported already,
and it goes to the disk. At the granularity of modules, this is seldom
a problem (I can imagine some package modules getting confused by
this, but otherwise not usually), but if docstrings are looked up
separately - and especially if lnotab is too - you could happily
import and use something (say, in a web server), then run updates, and
then an exception requires you to look up a line number. Oops, a few
lines got inserted into that file, and now all the line numbers are
straight-up wrong. That's a definite behavioural change. Maybe it's
one that's considered acceptable, but it definitely is a change. And
if mutations to sys.path can do this, it's definitely a semantic
change in Python.
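
The module-level half of this scenario can be reproduced today. The sketch below uses a throwaway module called `upgrade_demo` (a name invented for this example) in a temporary directory: once imported, it keeps coming from `sys.modules` even after the file on disk changes.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "upgrade_demo.py")
    with open(module_path, "w") as fh:
        fh.write("VERSION = 1\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import upgrade_demo

        # "Upgrade" the installation while the program is still running.
        with open(module_path, "w") as fh:
            fh.write("VERSION = 2\n")

        # A second import is served from sys.modules, not from disk.
        import upgrade_demo as again
        same_object = again is upgrade_demo
        cached_version = again.VERSION
        with open(module_path) as fh:
            on_disk = fh.read()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("upgrade_demo", None)
```

Anything read lazily from disk after the rewrite would see `VERSION = 2` while the running module still says `1`, which is the mismatch described above.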

ChrisA


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 11:37 PM, Paul Moore  wrote:
> On 11 April 2018 at 14:25, Chris Angelico  wrote:
>> On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner wrote:
>>>> Differences from regular assignment statements
>>>> ----------------------------------------------
>>>>
>>>> An assignment statement can assign to multiple targets::
>>>>
>>>>     x = y = z = 0
>>>>
>>>> To do the same with assignment expressions, they must be parenthesized::
>>>>
>>>>     assert 0 == (x := (y := (z := 0)))
>>>
>>> There's no rationale given for why this must be parenthesized.
>>> If := were right-associative,
>>>
>>> assert 0 == (x := y := z := 0)
>>>
>>> would work fine. (With high enough precedence, the remaining parentheses
>>> could be dropped, but one would probably keep them for clarity.)
>>> I think you need to spell out its associativity and precedence in more detail,
>>> and explain the rationale for the choice made.
>>
>> It's partly because of other confusing possibilities, such as its use
>> inside, or capturing, a lambda function. I'm okay with certain forms
>> requiring parens.
>
> The only possible reading of
>
> x := y := z := 0
>
> is as
>
> x := (y := (z := 0))
>
> because an assignment expression isn't allowed on the LHS of :=. So
> requiring parentheses is unnecessary. In the case of an assignment
> statement, "assignment to multiple targets" is a special case, because
> assignment is a statement not an expression. But with assignment
> *expressions*, a := b := 0 is simply assigning the result of the
> expression b := 0 (which is 0) to a. No need for a special case - so
> enforced parentheses would *be* the special case.

Sure, if you're just assigning zero to everything. But you could do
that with a statement. What about this:

    q = {
        lambda: x := lambda y: z := a := 0,
    }

Yes, it's an extreme example, but look at all those colons and tell me
if you can figure out what each one is doing.

ChrisA


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 11:03 PM, Kirill Balunov
 wrote:
> Great work Chris! Thank you!
>
> I do not know whether this is good or bad, but this PEP considers so many
> different topics, although closely interrelated with each other.
>
> 2018-04-11 8:32 GMT+03:00 Chris Angelico :
>>
>>
>> Alterations to comprehensions
>> -----------------------------
>>
>> The current behaviour of list/set/dict comprehensions and generator
>> expressions has some edge cases that would behave strangely if an assignment
>> expression were to be used. Therefore the proposed semantics are changed,
>> removing the current edge cases, and instead altering their behaviour *only*
>> in a class scope.
>>
>> As of Python 3.7, the outermost iterable of any comprehension is evaluated
>> in the surrounding context, and then passed as an argument to the implicit
>> function that evaluates the comprehension.
>>
>> Under this proposal, the entire body of the comprehension is evaluated in
>> its implicit function. Names not assigned to within the comprehension are
>> located in the surrounding scopes, as with normal lookups. As one special
>> case, a comprehension at class scope will **eagerly bind** any name which
>> is already defined in the class scope.
>>
>
> I think this change is an important one no matter what the future of
> the current PEP will be. And since it breaks backward compatibility, it
> deserves a separate PEP.

Well, it was Guido himself who started the sub-thread about classes
and comprehensions :)

To be honest, the changes to comprehensions are mostly going to be
under-the-hood tweaks. The only way you'll ever actually witness the
changes is if you:

1) Use assignment expressions inside comprehensions (ie using both
halves of this PEP); or

2) Put comprehensions at class scope (not inside methods, but actually
at class scope), referring to other names from class scope, in places
other than in the outermost iterable; or

3) Use 'yield' expressions in the outermost iterable of a list
comprehension inside a generator function; or

4) Create a generator expression that refers to an external name, then
change what that name is bound to before pumping the generator;
depending on the one open question, this may occur ONLY if this
external name is located at class scope; or

5) Use generator expressions without iterating over them, in
situations where iterating might fail (again, depends on the one open
question).

Aside from the first possibility, these are extremely narrow edge and
corner cases, and the new behaviour is generally the more intuitive
anyway. Class scope stops being so incredibly magical that it's
completely ignored, and now becomes mildly magical such that name
lookups are resolved eagerly instead of lazily; and the outermost
iterable stops being magical in that it defies the weirdness of class
scope and the precise definitions of generator functions. Special
cases are being removed, not added.
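
For readers who have not run into these corners, here is a sketch of cases 2 and 4 as they behave under the existing semantics (the status quo this section proposes to change, not the proposed behaviour). The class and variable names are only illustrative.

```python
# Case 2: a class-scope comprehension that uses a class-level name
# somewhere other than its outermost iterable.
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    try:
        # `names` is the outermost iterable, so it is evaluated in class
        # scope; `prefix` is looked up from inside the comprehension, which
        # skips the class namespace and finds no such global.
        prefixed_names = [prefix + name for name in names]
    except NameError as exc:
        prefixed_names = None
        lookup_error = str(exc)


# Case 4: a generator expression sees whatever the external name is bound
# to when the generator is pumped, not when it was created.
factor = 2
scaled = (n * factor for n in [1, 2, 3])
factor = 10
scaled_values = list(scaled)
```

Under the eager-binding proposal quoted earlier, the class-scope comprehension would instead capture `prefix` from the class body and succeed.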

>> Open questions
>> ==============
>>
>> Can the outermost iterable still be evaluated early?
>> ----------------------------------------------------
>>
>
> Previously, there was an alternative _operator form_  `->`  proposed by
> Steven D'Aprano. Is this option no longer considered? I see several
> advantages with this variant:
> 1. It does not use the `:` symbol, which is very visually overloaded in Python.
> 2. It is clearly distinguishable from the usual assignment statement and
> its `+=` friends.
> There are others but they are minor.

I'm not sure why you posted this in response to the open question, but
whatever. The arrow operator is already a token in Python (due to its
use in 'def' statements) and should not conflict with anything;
however, apart from "it looks different", it doesn't have much to
speak for it. The arrow faces the other way in languages like Haskell,
but we can't use "<-" in Python due to conflicts with "<" and "-" as
independent operators.

>> This could be used to create ugly code!
>> ---------------------------------------
>>
>> So can anything else.  This is a tool, and it is up to the programmer to
>> use it where it makes sense, and not use it where superior constructs can
>> be used.
>>
>
> But the ugly code matters, especially when it comes to Python. For me, the
> ideal option would be the combination of two rejected parts:
>
> (+ in `while`) combined with this part:
>
>
>> 3. ``with EXPR as NAME``::
>>
>>        stuff = [(y, x/y) with f(x) as y for x in range(5)]
>>
>>    As per option 2, but using ``as`` rather than an equals sign. Aligns
>>    syntactically with other uses of ``as`` for name binding, but a simple
>>    transformation to for-loop longhand would create drastically different
>>    semantics; the meaning of ``with`` inside a comprehension would be
>>    completely different from the meaning as a stand-alone statement, while
>>    retaining identical syntax.
>
>
> I see no benefit to have the assignment expression in other places. 

Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Erik Bray
On Tue, Apr 10, 2018 at 9:50 PM, Eric V. Smith  wrote:
>
>>> 3. Annotations. They are used mainly by third party tools that
>>> statically analyze sources. They are rarely used at runtime.
>>
>> Even less used than docstrings probably.
>
> typing.NamedTuple and dataclasses use annotations at runtime.

Astropy uses annotations at runtime for optional unit checking on
arguments that take dimensionful quantities:
http://docs.astropy.org/en/stable/api/astropy.units.quantity_input.html#astropy.units.quantity_input
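
A quick standard-library illustration of the same point, using the two cases Eric mentions. The class names are invented for this sketch; the only claim is that both tools derive their fields from annotations while the program runs.

```python
import typing
from dataclasses import dataclass, fields


@dataclass
class Measurement:
    value: float
    unit: str = "m"


@dataclass
class Unannotated:
    value = 1.5  # no annotation, so dataclass does not treat it as a field


class Point(typing.NamedTuple):
    x: int
    y: int = 0


# Both are computed at runtime from the class annotations.
dataclass_fields = [(f.name, f.type) for f in fields(Measurement)]
unannotated_fields = fields(Unannotated)
namedtuple_fields = Point._fields
```

If annotations were moved to a separate lazily loaded file, merely defining classes like these at import time would force that file to be read.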


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 14:25, Chris Angelico  wrote:
> On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner  wrote:
>>> Differences from regular assignment statements
>>> ----------------------------------------------
>>>
>>> An assignment statement can assign to multiple targets::
>>>
>>>     x = y = z = 0
>>>
>>> To do the same with assignment expressions, they must be parenthesized::
>>>
>>>     assert 0 == (x := (y := (z := 0)))
>>
>> There's no rationale given for why this must be parenthesized.
>> If := were right-associative,
>>
>> assert 0 == (x := y := z := 0)
>>
>> would work fine. (With high enough precedence, the remaining parentheses
>> could be dropped, but one would probably keep them for clarity.)
>> I think you need to spell out its associativity and precedence in more detail,
>> and explain the rationale for the choice made.
>
> It's partly because of other confusing possibilities, such as its use
> inside, or capturing, a lambda function. I'm okay with certain forms
> requiring parens.

The only possible reading of

x := y := z := 0

is as

x := (y := (z := 0))

because an assignment expression isn't allowed on the LHS of :=. So
requiring parentheses is unnecessary. In the case of an assignment
statement, "assignment to multiple targets" is a special case, because
assignment is a statement not an expression. But with assignment
*expressions*, a := b := 0 is simply assigning the result of the
expression b := 0 (which is 0) to a. No need for a special case - so
enforced parentheses would *be* the special case.

And you can't really argue that they are needed "for clarity" at the
same time as having your comments about how "being able to write ugly
code" isn't a valid objection :-)
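
This thread predates the final grammar, so as a point of reference here is a sketch that asks the parser which of the two spellings it accepts. It assumes Python 3.8 or later, where `:=` exists; the `compiles` helper is invented for this example.

```python
def compiles(source):
    """Return True if `source` is accepted by this interpreter's parser."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The unparenthesized chain versus the fully parenthesized one.
chained_bare = compiles("(x := y := z := 0)")
chained_parenthesized = compiles("(x := (y := (z := 0)))")

# The parenthesized chain binds all three names and evaluates to 0.
assert 0 == (x := (y := (z := 0)))
bound = (x, y, z)
```

In the grammar that shipped, the right-hand side of `:=` cannot itself be an unparenthesized named expression, so the bare chain is rejected even though, as argued above, only one reading of it is possible.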

Paul


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 10:23 PM, Clint Hepner  wrote:
>> On 2018 Apr 11 , at 1:32 a, Chris Angelico  wrote:
>>     # Similar to the boolean 'or' but checking for None specifically
>>     x = "default" if (eggs := spam().ham) is None else eggs
>>
>>     # Even complex expressions can be built up piece by piece
>>     y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
>
> These would be clearer if you could remove the assignment from the
> expression itself.
> Assuming "let" were available as a keyword,
>
> x = (let eggs = spam().ham
>  in
>  "default" if eggs is None else eggs)
> y = (let eggs = spam(),
>  cheese = eggs.method()
>  in
>  (eggs, cheese, cheese[eggs]))
>
> Allowing for differences in how best to format such an expression, the final
> expression is clearly separate from its component assignment. (More on this
> in the Alternative Spellings section below.)

I have no idea what the "in" keyword is doing here, but somehow it
isn't being used for the meaning it currently has in Python. Does your
alternative require not one but *two* new keywords?

>> Differences from regular assignment statements
>> ----------------------------------------------
>>
>> An assignment statement can assign to multiple targets::
>>
>>     x = y = z = 0
>>
>> To do the same with assignment expressions, they must be parenthesized::
>>
>>     assert 0 == (x := (y := (z := 0)))
>
> There's no rationale given for why this must be parenthesized.
> If := were right-associative,
>
> assert 0 == (x := y := z := 0)
>
> would work fine. (With high enough precedence, the remaining parentheses
> could be dropped, but one would probably keep them for clarity.)
> I think you need to spell out its associativity and precedence in more detail,
> and explain the rationale for the choice made.

It's partly because of other confusing possibilities, such as its use
inside, or capturing, a lambda function. I'm okay with certain forms
requiring parens.

>> Augmented assignment is not supported in expression form::
>>
>>     >>> x +:= 1
>>       File "<stdin>", line 1
>>         x +:= 1
>>           ^
>>     SyntaxError: invalid syntax
>
> There's no reason given for why this is invalid. I assume it's a combination
> of 1) Having both += and +:=/:+= would be redundant and 2) not wanting
> to add 11+ new operators to the language.

And 3) there's no point. Can you give an example of where you would
want an expression form of augmented assignment?

> 4. Adding a ``let`` expression to create local bindings
>
> value = let x = spam(1, 4, 7, q) in x**2 + 2*x
>
> 5. Adding a ``where`` expression to create local bindings:
>
> value = x**2 + 2*x where x = spam(1, 4, 7, q)
>
> Both have the extra-keyword problem. Multiple bindings are a little harder
> to add than they would be with the ``where:`` modifier, although
> a few extra parentheses and judicious line breaks make it not so bad to
> allow a comma-separated list, as shown in my first example at the top of
> this reply.

Both also have the problem of "exactly how local ARE these bindings?",
and the 'let' example either requires two new keywords, or requires
repurposing 'in' to mean something completely different from its usual
'item in collection' boolean check. The 'where' example is broadly
similar to rejected alternative 3, except that you're removing the
colon and the suite, which means you can't create more than one
variable without figuring some way to parenthesize. Let's suppose this
were defined as:

EXPR where NAME = EXPR

as a five-component sequence. If you were to write this twice

EXPR where NAME = EXPR where OTHERNAME = EXPR

then it could just as logically be defined as "EXPR where NAME = (EXPR
where OTHERNAME = EXPR)" as the other way. And even if it were to work
as "(EXPR where NAME = EXPR) where OTHERNAME = EXPR", that still has
the highly confusing semantics of being evaluated right-to-left.

(Before you ask: no, you can't define it as "EXPR where NAME = EXPR ,
NAME = EXPR", because that would require looking a long way forward.)

>> Special-casing conditional statements
>> -------------------------------------
>>
> 4. ``let NAME = EXPR1 in EXPR2``::
>
> stuff = [let y = f(x) in (y, x/y) for x in range(5)]
>
> I don't have anything new to say about this. It has the same keyword
> objections as similar proposals, and I think I've addressed the use case
> elsewhere.

This section is specifically about proposals that ONLY solve this
problem within list comprehensions. I don't think there's any point
mentioning your proposal there, as the "let NAME = EXPR in EXPR"
notation has nothing particularly to do with comprehensions.

>> With assignment expressions, why bother with assignment statements?
>> ---
>>
>> The two forms have different flexibilities.  The ``:=`` operator can be used
>> inside a larger 

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 13:23, Clint Hepner  wrote:
>># Even complex expressions can be built up piece by piece
>>y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

> I find the assignments make it difficult to pick out what the final 
> expression looks like.
> The first isn't too bad, but it took me a moment to figure out what y was. 
> Quick: is it
>
>   * (a, b, c)
>   * (a, (b, c))
>   * ((a, b), c)
>   * something else
>
> First I thought it was (a, b, c), then I thought it was actually ((a, b), c),
> before carefully counting the parentheses showed that I was right the first time.

This is a reasonable concern, IMO. But it comes solidly under the
frequently raised objection "This could be used to create ugly code!".
Writing it as

y = (
(eggs := spam()),
(cheese := eggs.method()),
cheese[eggs]
)

makes it obvious what the structure is.
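
For anyone who would rather check the structure than count parentheses, here is a minimal sketch using the ``:=`` syntax as it eventually shipped in Python 3.8. The `Eggs` class and the body of `spam()` are made up purely for illustration; only the shape of `y` matters.

```python
class Eggs:
    """Hypothetical object returned by spam(); hashable by default."""

    def method(self):
        # Return a mapping that can be indexed by the Eggs instance itself.
        return {self: "found"}


def spam():
    # Hypothetical stand-in for whatever spam() really does.
    return Eggs()


y = (
    (eggs := spam()),
    (cheese := eggs.method()),
    cheese[eggs],
)
```

So `y` is a flat three-element tuple: the two named values followed by the lookup.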

Paul
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Kirill Balunov
Great work Chris! Thank you!

I do not know whether this is good or bad, but this PEP covers many
different topics, albeit closely interrelated ones.

2018-04-11 8:32 GMT+03:00 Chris Angelico :

>
> Alterations to comprehensions
> -
>
> The current behaviour of list/set/dict comprehensions and generator
> expressions has some edge cases that would behave strangely if an
> assignment
> expression were to be used. Therefore the proposed semantics are changed,
> removing the current edge cases, and instead altering their behaviour
> *only*
> in a class scope.
>
> As of Python 3.7, the outermost iterable of any comprehension is evaluated
> in the surrounding context, and then passed as an argument to the implicit
> function that evaluates the comprehension.
>
> Under this proposal, the entire body of the comprehension is evaluated in
> its implicit function. Names not assigned to within the comprehension are
> located in the surrounding scopes, as with normal lookups. As one special
> case, a comprehension at class scope will **eagerly bind** any name which
> is already defined in the class scope.
>
>
I think this change is an important one no matter what will be the future of
the current PEP. And since it breaks backward compatibility it deserves a
separate PEP.


> Open questions
> ==
>
> Can the outermost iterable still be evaluated early?
> 
>
> As of Python 3.7, the outermost iterable in a genexp is evaluated early,
> and
> the result passed to the implicit function as an argument.  With PEP 572,
> this
> would no longer be the case. Can we still, somehow, evaluate it before
> moving
> on? One possible implementation would be::
>
>     gen = (x for x in range(10))
>     # translates to
>     def <genexp>():
>         iterable = iter(range(10))
>         yield None
>         for x in iterable:
>             yield x
>     gen = <genexp>()
>     next(gen)
>
> This would pump the iterable up to just before the loop starts, evaluating
> exactly as much as is evaluated outside the generator function in Py3.7.
> This would result in it being possible to call ``gen.send()`` immediately,
> unlike with most generators, and may incur unnecessary overhead in the
> common case where the iterable is pumped immediately (perhaps as part of a
> larger expression).
>
>
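
The early evaluation described in the quoted section can be observed directly in current Python. A small sketch, where `noisy()` is a hypothetical helper that records when it is called:

```python
def noisy(label, log):
    """Record that this iterable was requested, then return a small range."""
    log.append(label)
    return range(2)


log = []
gen = (x + y for x in noisy("outer", log) for y in noisy("inner", log))

# Only the outermost iterable has been evaluated at creation time.
calls_at_creation = list(log)

# The inner iterable is evaluated once the generator actually starts running.
first = next(gen)
calls_after_first = list(log)
```

Only the first `for` clause is evaluated in the surrounding context; everything else waits until the generator is pumped.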
Previously, there was an alternative _operator form_ `->` proposed by
Steven D'Aprano. Is this option no longer considered? I see several
advantages to this variant:
1. It does not use the `:` symbol, which is visually very overloaded in Python.
2. It is clearly distinguishable from the usual assignment statement and
its `+=` friends.
There are others, but they are minor.


> Frequently Raised Objections
> 
>
> Why not just turn existing assignment into an expression?
> -
>
> C and its derivatives define the ``=`` operator as an expression, rather
> than
> a statement as is Python's way.  This allows assignments in more contexts,
> including contexts where comparisons are more common.  The syntactic
> similarity
> between ``if (x == y)`` and ``if (x = y)`` belies their drastically
> different
> semantics.  Thus this proposal uses ``:=`` to clarify the distinction.
>
>
> This could be used to create ugly code!
> ---
>
> So can anything else.  This is a tool, and it is up to the programmer to
> use it
> where it makes sense, and not use it where superior constructs can be used.
>
>
But the ugly code matters, especially when it comes to Python. For me, the
ideal option would be the combination of two rejected parts:

Special-casing conditional statements
> -
>
> One of the most popular use-cases is ``if`` and ``while`` statements.
> Instead
> of a more general solution, this proposal enhances the syntax of these two
> statements to add a means of capturing the compared value::
>
>     if re.search(pat, text) as match:
>         print("Found:", match.group(0))
>
> This works beautifully if and ONLY if the desired condition is based on the
> truthiness of the captured value.  It is thus effective for specific
> use-cases (regex matches, socket reads that return `''` when done), and
> completely useless in more complicated cases (eg where the condition is
> ``f(x) < 0`` and you want to capture the value of ``f(x)``).  It also has
> no benefit to list comprehensions.
>
> Advantages: No syntactic ambiguities. Disadvantages: Answers only a
> fraction
> of possible use-cases, even in ``if``/``while`` statements.
>


(+ in `while`) combined with this part:


3. ``with EXPR as NAME``::
>
>stuff = [(y, x/y) with f(x) as y for x in range(5)]
>
>As per option 2, but using ``as`` rather than an equals sign. Aligns
>syntactically with other uses of ``as`` for name binding, but a simple
>

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Clint Hepner

> On 2018 Apr 11 , at 1:32 a, Chris Angelico  wrote:
> 
> Wholesale changes since the previous version. Statement-local name
> bindings have been dropped (I'm still keeping the idea in the back of
> my head; this PEP wasn't the first time I'd raised the concept), and
> we're now focusing primarily on assignment expressions, but also with
> consequent changes to comprehensions.

Overall, I'm slightly negative on this. I think named expressions will
be a good thing to have, but not in this form. I'll say up front that,
being fully aware of the issues surrounding the introduction of a new
keyword, something like a let expression in Haskell would be more readable
than embedded assignments in most cases.

In the end, I suspect my `let` proposal is a nonstarter and just useful
to list with the rest of the rejected alternatives, but I wanted.

   
> 
> Abstract
> 
> 

[...]

> 
> 
> Rationale
> =
> 

[...]

> 
> Syntax and semantics
> 
> 
> In any context where arbitrary Python expressions can be used, a **named
> expression** can appear. This can be parenthesized for clarity, and is of
> the form ``(target := expr)`` where ``expr`` is any valid Python expression,
> and ``target`` is any valid assignment target.
> 
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that the target is assigned
> that value.
> 
># Similar to the boolean 'or' but checking for None specifically
>x = "default" if (eggs := spam().ham) is None else eggs

> 
># Even complex expressions can be built up piece by piece
>y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])

I find the assignments make it difficult to pick out what the final expression 
looks like.
The first isn't too bad, but it took me a moment to figure out what y was. 
Quick: is it

  * (a, b, c)
  * (a, (b, c))
  * ((a, b), c)
  * something else

First I thought it was (a, b, c), then I thought it was actually ((a, b), c),
before carefully counting the parentheses showed that I was right the first time.

These would be clearer if you could remove the assignment from the expression 
itself.
Assuming "let" were available as a keyword,

x = (let eggs = spam().ham
 in
 "default" if eggs is None else eggs)
y = (let eggs = spam(),
 cheese = eggs.method()
 in
 (eggs, cheese, cheese[eggs]))

Allowing for differences in how best to format such an expression, the final
expression is clearly separate from its component assignment. (More on this
in the Alternative Spellings section below.)

> 
> Differences from regular assignment statements
> --
> 
> An assignment statement can assign to multiple targets::
> 
>x = y = z = 0
> 
> To do the same with assignment expressions, they must be parenthesized::
> 
>assert 0 == (x := (y := (z := 0)))

There's no rationale given for why this must be parenthesized. 
If := were right-associative,

assert 0 == (x := y := z := 0)

would work fine. (With high enough precedence, the remaining parentheses
could be dropped, but one would probably keep them for clarity.)
I think you need to spell out its associativity and precedence in more detail,
and explain the rationale for the choice made.
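
For reference, a sketch that asks the compiler directly. This assumes Python 3.8 or later, where `:=` was eventually added; on those versions, as far as I know, the unparenthesized chain is rejected, so the operator does not chain the way `=` does. The `compiles()` helper is just an illustration.

```python
def compiles(source):
    """Return True if `source` is a valid Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


nested = compiles("(x := (y := (z := 0)))")   # each step parenthesized
chained = compiles("(x := y := z := 0)")      # relies on right-associativity
```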

> 
> Augmented assignment is not supported in expression form::
> 
>     >>> x +:= 1
>       File "<stdin>", line 1
>         x +:= 1
>           ^
>     SyntaxError: invalid syntax

There's no reason given for why this is invalid. I assume it's a combination
of 1) Having both += and +:=/:+= would be redundant and 2) not wanting
to add 11+ new operators to the language.

> 
> Otherwise, the semantics of assignment are unchanged by this proposal.
> 

[List comprehensions deleted]

> 
> 
> Recommended use-cases
> =
> 
> Simplifying list comprehensions
> ---
> 
> These list comprehensions are all approximately equivalent::

[existing alternatives redacted]

># Using a temporary name
>stuff = [[y := f(x), x/y] for x in range(5)]

Again, this would be clearer if the assignment were separated from the
expression where it would be used.

stuff = [let y = f(x) in [y, x/y] for x in range(5)]
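
Since `let` is not valid Python, here is a runnable sketch of two spellings that do work, for comparison. `f` is a hypothetical stand-in for an expensive function, chosen so it never returns zero. The first form needs Python 3.8 or later; the second is the older idiom of iterating over a one-element list.

```python
def f(x):
    # Hypothetical stand-in for an expensive function; never returns zero.
    return x + 1


# Assignment expression (valid from Python 3.8 on).
with_walrus = [[y := f(x), x / y] for x in range(5)]

# Older idiom: bind y by iterating over a one-element list.
with_inner_loop = [[y, x / y] for x in range(5) for y in [f(x)]]
```

Both produce the same list; the second keeps the binding visually apart from the result expression, which is roughly what the `let` form is after.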


> 
> Capturing condition values
> --
> 
> Assignment expressions can be used to good effect in the header of
> an ``if`` or ``while`` statement::
> 
>     # Current Python, not caring about function return value
>     while input("> ") != "quit":
>         print("You entered a command.")
> 
>     # Current Python, capturing return value - four-line loop header
>     while True:
>         command = input("> ");
>         if command == "quit":
>             break
>         print("You entered:", command)
> 
>     # Proposed alternative to the above
>     while (command := input("> ")) != "quit":
>         print("You entered:", command)
> 
># Capturing regular 

Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Paul Moore
On 11 April 2018 at 10:30, Chris Angelico  wrote:
> The PEP has kinda pivoted a bit since its inception, so I'm honestly
> not sure what "original motivating use case" matters. :D I'm just
> lumping all the use-cases together at the same priority now.

Fair point, and reading this PEP in isolation the comprehension use
case really isn't that prominent. So yes, I guess you're right.

Paul


Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 6:55 PM, Paul Moore  wrote:
> On 11 April 2018 at 06:32, Chris Angelico  wrote:
>> The name ``prefix`` is thus searched for at global scope, ignoring the class
>> name. Under the proposed semantics, this name will be eagerly bound, being
>> approximately equivalent to::
>>
>>     class X:
>>         names = ["Fred", "Barney", "Joe"]
>>         prefix = "> "
>>         def <listcomp>(prefix=prefix):
>>             result = []
>>             for name in names:
>>                 result.append(prefix + name)
>>             return result
>>         prefixed_names = <listcomp>()
>
> Surely "names" would also be eagerly bound, for use in the "for" loop?

Yep, exactly. Have corrected the example, thanks.
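
To make the class-scope issue concrete, here is a sketch of the current behaviour next to a hand-written version of the eager binding. The class names `Current` and `Workaround` and the `_build` helper are made up for the example; the default-argument trick only approximates the proposed semantics.

```python
class Current:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    try:
        prefixed_names = [prefix + name for name in names]
    except NameError as exc:
        # 'names' is found (it is the outermost iterable), 'prefix' is not.
        prefixed_names = None
        error = str(exc)


class Workaround:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "

    def _build(names=names, prefix=prefix):
        # Defaults are evaluated in the class body, so both names bind eagerly.
        return [prefix + name for name in names]

    prefixed_names = _build()
    del _build
```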

>> This could be used to create ugly code!
>> ---
>>
>> So can anything else.  This is a tool, and it is up to the programmer to use 
>> it
>> where it makes sense, and not use it where superior constructs can be used.
>
> Related objection - when used to name subexpressions in a
> comprehension (one of the original motivating use cases for this
> proposal), this introduces an asymmetry which actually makes the
> comprehension harder to read. As a result, it's quite possible that
> people won't want to use assignment expressions in this case, and the
> use case of precalculating expensive but multiply used results in
> comprehensions will remain unanswered.
>
> I think the response here is basically the same as the above - if you
> don't like them, don't use them. But I do think the additional nuance
> of "we might not have solved the original motivating use case" is
> worth a specific response.

The PEP has kinda pivoted a bit since its inception, so I'm honestly
not sure what "original motivating use case" matters. :D I'm just
lumping all the use-cases together at the same priority now.

> Overall, I like this much better than the previous proposal. I'm now
> +1 on the semantic changes to comprehensions, and barely +0 on the
> assignment expression itself (I still don't think assignment
> expressions are worth it, and I worry about the confusion they may
> cause for beginners in particular).

Now that they have the same semantics as any other form of assignment,
they're a bit less useful in some cases, a bit more useful in others,
and a lot easier to explain. The most confusing part, honestly, is
"why do we have two ways to do assignment", which is why that is
specifically answered in the PEP.

ChrisA


Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Petr Viktorin



On 04/11/18 08:06, Steven D'Aprano wrote:

On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote:

[...]

Yes, it will double the number of files. Actually quadruple it, if the
annotations and line numbers are in separate files too. But if most of
those extra files never need to be opened, then there's no cost to them.
And whatever extra cost there is, is amortized over the lifetime of the
interpreter.


Yes, if they are actually not needed. My question was about whether
that is truly valid.


We're never really going to know the effect on performance without
implementing and benchmarking the code. It might turn out that, to our
surprise, three quarters of the std lib relies on loading docstrings
during startup. But I doubt it.



Consider a very common use-case: an OS-provided
Python interpreter whose files are all owned by 'root'. Those will be
distributed with .pyc files for performance, but you don't want to
deprive the users of help() and anything else that needs docstrings
etc. So... are the docstrings lazily loaded or eagerly loaded?


What relevance is that they're owned by root?



If eagerly, you've doubled the number of file-open calls to initialize
the interpreter.


I do not understand why you think this is even an option. Has Serhiy
said something that I missed that makes this seem to be on the table?
That's not a rhetorical question -- I may have missed something. But I'm
sure he understands that doubling or quadrupling the number of file
operations during startup is not an optimization.



(Or quadrupled, if you need annotations and line
numbers and they're all separate.) If lazily, things are a lot more
complicated than the original description suggested, and there'd need
to be some semantic changes here.


What semantic change do you expect?

There's an implementation change, of course, but that's Serhiy's problem
to deal with and I'm sure that he has considered that. There should be
no semantic change. When you access obj.__doc__, then and only then are
the compiled docstrings for that module read from the disk.

I don't know the current implementation of .pyc files, but I like
Antoine's suggestion of laying it out in four separate areas (plus
header), each one marshalled:

 code
 docstrings
 annotations
 line numbers

Aside from code, which is mandatory, the three other sections could be
None to represent "not available", as is the case when you pass -OO to
the interpreter, or they could be some other sentinel that means "load
lazily from the appropriate file", or they could be the marshalled data
directly in place to support byte-code only libraries.

As for the in-memory data structures of objects themselves, I imagine
something like the __doc__ and __annotations__ slots pointing to a table
of strings, which is not initialised until you attempt to read from the
table. Or something -- don't pay too much attention to my wild guesses.


A __doc__ sentinel could even say something like "bytes 350--420 in the 
original .py file, as UTF-8".
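
A rough sketch of what such a sentinel might look like, purely as an illustration. `LazyText` and its attributes are hypothetical names, and nothing here reflects how CPython actually stores `__doc__`; it only shows "byte range in a file, decoded as UTF-8 on first access".

```python
import os
import tempfile


class LazyText:
    """Hypothetical sentinel: text stored as a byte range of a file."""

    def __init__(self, path, start, stop):
        self.path = path
        self.start = start
        self.stop = stop
        self.reads = 0          # how many times the file was opened
        self._value = None

    def resolve(self):
        """Read and decode the byte range on first use, then cache it."""
        if self._value is None:
            with open(self.path, "rb") as fh:
                fh.seek(self.start)
                raw = fh.read(self.stop - self.start)
            self._value = raw.decode("utf-8")
            self.reads += 1
        return self._value


# Demo with a throwaway source file.
source = b'def greet():\n    """Say hello."""\n'
start = source.index(b"Say")
stop = start + len(b"Say hello.")

with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
    tmp.write(source)

doc = LazyText(tmp.name, start, stop)
text_first = doc.resolve()
text_again = doc.resolve()   # served from the cache, no second file read
os.unlink(tmp.name)
```

The file is only opened when the text is first requested, which is the property the lazy-loading discussion depends on.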




The bottom line is, is there some reason *aside from performance* to
avoid this? Because if the performance is worse, I'm sure Serhiy will be
the first to dump this idea.





Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Petr Viktorin

On 04/11/18 06:21, Chris Angelico wrote:

On Wed, Apr 11, 2018 at 1:02 PM, Steven D'Aprano  wrote:

On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote:


File system limits aren't usually an issue; as you say, even FAT32 can
store a metric ton of files in a single directory. I'm more interested
in how long it takes to open a file, and whether doubling that time
will have a measurable impact on Python startup time. Part of that
cost can be reduced by using openat(), on platforms that support it,
but even with a directory handle, there's still a definite non-zero
cost to opening and reading an additional file.


Yes, it will double the number of files. Actually quadruple it, if the
annotations and line numbers are in separate files too. But if most of
those extra files never need to be opened, then there's no cost to them.
And whatever extra cost there is, is amortized over the lifetime of the
interpreter.


Yes, if they are actually not needed. My question was about whether
that is truly valid. Consider a very common use-case: an OS-provided
Python interpreter whose files are all owned by 'root'. Those will be
distributed with .pyc files for performance, but you don't want to
deprive the users of help() and anything else that needs docstrings
etc.


Currently in Fedora, we ship *both* optimized and non-optimized pycs to 
make sure both -O and non--O will work nicely without root privileges. 
So splitting the docstrings into a separate file would be, for us, a 
benefit in terms of file size.
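
For context, the separate cache files for each optimization level can be seen with `importlib.util.cache_from_source`. The source path below is hypothetical and is never opened; the function only computes names.

```python
import importlib.util

source = "/usr/lib/python3/foo.py"   # hypothetical path, never opened

plain = importlib.util.cache_from_source(source, optimization="")
opt1 = importlib.util.cache_from_source(source, optimization=1)   # python -O
opt2 = importlib.util.cache_from_source(source, optimization=2)   # python -OO
```

Each level gets its own file under `__pycache__`, so shipping several levels means several near-identical copies of the bytecode.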




So... are the docstrings lazily loaded or eagerly loaded? If
eagerly, you've doubled the number of file-open calls to initialize
the interpreter. (Or quadrupled, if you need annotations and line
numbers and they're all separate.) If lazily, things are a lot more
complicated than the original description suggested, and there'd need
to be some semantic changes here.


Serhiy is experienced enough that I think we should assume he's not
going to push this optimization into production unless it actually does
reduce startup time. He has proven himself enough that we should assume
competence rather than incompetence :-)


Oh, I'm definitely assuming that he knows what he's doing :-) Doesn't
mean I can't ask the question though.

ChrisA




Re: [Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin

2018-04-11 Thread Paul Moore
On 11 April 2018 at 04:41, Steven D'Aprano  wrote:
>> > But in a way that more intuitively expresses the intent of the code, it
>> > would be great to have more options on the market.
>>
>> It's worth adding a reminder here that "having more options on the
>> market" is pretty directly in contradiction to the Zen of Python -
>> "There should be one-- and preferably only one --obvious way to do
>> it".
>
> I'm afraid I'm going to (mildly) object here. At least you didn't
> misquote the Zen as "Only One Way To Do It" :-)
>
> The Zen here is not a prohibition against there being multiple ways to
> do something -- how could it, given that Python is a general purpose
> programming language there is always going to be multiple ways to write
> any piece of code? Rather, it exhorts us to make sure that there are one
> or more ways to "do it", at least one of which is obvious.

I apologise if I came across as implying that I thought the Zen said
that having multiple ways was prohibited. I don't (and certainly the
Zen doesn't mean that). Rather, I was saying that using "it gives us
an additional way to do something" is a bad argument in favour of a
proposal for Python. At a minimum, the proposal needs to argue why the
new feature is "more obvious" than the existing ways (bonus points if
the proposer is Dutch - see the following Zen item ;-)), or why it
offers a capability that isn't possible with the existing language.
And I'm not even saying that the OP hasn't attempted to make such
arguments (even if I disagree with them). All I was pointing out was
that the comment "it would be great to have more options on the
market" implies a misunderstanding of the design goals of Python
(hence my "reminder" of the principle I think is relevant here).

Sorry again if that's not what it sounded like.
Paul


Re: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts

2018-04-11 Thread Mike Miller
Ok, we can haggle over the finer details, and I admit that once you learn the
syntax it isn't substantially harder.  Simply put, I've found dict() a bit
easier to mentally parse at a glance.  Also, I've always expected multiple args
to work with it, and am always surprised when they don't.


I would never have thought of this unpacking syntax if I didn't know that's the
way it's done now, and I often have to think about it for a second or two.
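
To illustrate the comparison, a small sketch. `merge_dicts` is a hypothetical helper showing the behaviour one might expect from a multi-argument `dict()`; it is not an existing builtin, and the sample data is invented.

```python
def merge_dicts(*mappings):
    """Hypothetical helper: merge mappings left to right, later keys win."""
    result = {}
    for mapping in mappings:
        result.update(mapping)
    return result


defaults = {"colour": "red", "size": 1}
overrides = {"size": 2}

via_unpacking = {**defaults, **overrides}
via_helper = merge_dicts(defaults, overrides)

# dict() itself accepts at most one positional argument today.
try:
    dict(defaults, overrides)
    two_positional_ok = True
except TypeError:
    two_positional_ok = False
```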



On 2018-04-10 22:22, Chris Angelico wrote:

On Wed, Apr 11, 2018 at 2:44 PM, Steven D'Aprano  wrote:

On Wed, Apr 11, 2018 at 02:22:08PM +1000, Chris Angelico wrote:




Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Steve Barnes


On 10/04/2018 18:54, Zachary Ware wrote:
> On Tue, Apr 10, 2018 at 12:38 PM, Chris Angelico  wrote:
>> A deployed Python distribution generally has .pyc files for all of the
>> standard library. I don't think people want to lose the ability to
>> call help(), and unless I'm misunderstanding, that requires
>> docstrings. So this will mean twice as many files and twice as many
>> file-open calls to import from the standard library. What will be the
>> impact on startup time?
> 
> What about instead of separate files turning the single file into a
> pseudo-zip file containing all of the proposed files, and provide a
> simple tool for removing whatever parts you don't want?
> 

Personally I quite like the idea of having the doc strings, and possibly 
other optional components, in a zipped section after a marker for the 
end of the operational code. Possibly the loader could stop reading at 
that point, (reducing load time and memory impact), and only load and 
unzip on demand.
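
A toy sketch of that layout, to show the idea rather than any real .pyc format. It assumes a length header in place of an end-of-code marker, and the names `pack`, `load_code` and `load_docs` are made up for the example.

```python
import io
import struct
import zlib

HEADER = struct.Struct("<I")   # length of the code section, little-endian


def pack(code_bytes, docs_text):
    """Code section first, compressed docstrings after it."""
    compressed = zlib.compress(docs_text.encode("utf-8"))
    return HEADER.pack(len(code_bytes)) + code_bytes + compressed


def load_code(fh):
    """Read only the header and the code section."""
    fh.seek(0)
    (size,) = HEADER.unpack(fh.read(HEADER.size))
    return fh.read(size)


def load_docs(fh):
    """Skip past the code section and decompress the rest on demand."""
    fh.seek(0)
    (size,) = HEADER.unpack(fh.read(HEADER.size))
    fh.seek(HEADER.size + size)
    return zlib.decompress(fh.read()).decode("utf-8")


blob = io.BytesIO(pack(b"<bytecode>", "Say hello."))
code = load_code(blob)
position_after_code = blob.tell()   # the loader can stop reading here
docs = load_docs(blob)
```

The normal import path would only ever call `load_code`; the compressed tail is touched only when something asks for the docstrings.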

Zipping the doc strings should give a significant reduction in file 
sizes, but it is worth remembering a couple of things:

  - Python is already one of the most compact languages for what it can 
do - I have had experts demanding to know where the rest of the program 
is hidden and how it is being downloaded when they noticed the size of 
the installed code versus the functionality provided.
  - File size <> disk space consumed - on most file systems each file 
typically occupies 1 + (file_size // allocation_size) clusters of the 
drive, and with increasing disk sizes the allocation_size is generally 
increasing. Both of my NTFS drives currently have 4096 byte allocation 
sizes, but I am offered up to 2 MB allocation sizes. Splitting a 
10,052 byte .pyc file (picking a random example from my drive) into 
5,052 and 5,000 byte files will change the disk space occupied from 
3*4,096 to 4*4,096 bytes, plus the extra directory entry.
  - Where absolute file size is critical (such as on embedded 
systems), you can always use the -O & -OO flags.
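
The arithmetic in the second point can be checked with a few lines. This sketch uses ceiling division, which agrees with the formula above except when the size is an exact multiple of the allocation size; the `clusters()` helper name is made up, and file-system details such as very small files being stored inline are ignored.

```python
def clusters(file_size, allocation_size=4096):
    """Clusters needed for a file, using ceiling division."""
    return -(-file_size // allocation_size)


single = clusters(10_052)                    # the original .pyc
split = clusters(5_052) + clusters(5_000)    # the two halves

bytes_single = single * 4096
bytes_split = split * 4096
```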
-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.




Re: [Python-ideas] PEP 572: Assignment Expressions (post #4)

2018-04-11 Thread Chris Angelico
On Wed, Apr 11, 2018 at 3:54 PM, Ethan Furman  wrote:
> On 04/10/2018 10:32 PM, Chris Angelico wrote:
>
>> Migration path
>> ==
>>
>> The semantic changes to list/set/dict comprehensions, and more so to
>> generator
>> expressions, may potentially require migration of code. In many cases, the
>> changes simply make legal what used to raise an exception, but there are
>> some
>> edge cases that were previously legal and are not, and a few corner cases
>> with
>> altered semantics.
>
>
> s/previously legal and are not/previously legal and now are not/
>

Trivial change, easy fix. Thanks.

ChrisA


Re: [Python-ideas] Python-ideas Digest, Vol 137, Issue 40

2018-04-11 Thread Thautwarm Zhao
I think Guido has given a direct answer as to why dict unpacking is not
supported at the syntax level.
I can accept that, and I think it's better to implement a function for dict
unpacking in the standard library, something like

from dict_unpack import dict_unpack, pattern as pat
some_dict = {'a': {'b': {'c': 1}, 'd':2}, 'e': 3}

extracted = dict_unpack(some_dict, schema={
    'a': {'b': {'c': pat('V1')}, 'd': pat('V2')},
    'e': pat('V3'),
})
# extracts into a flattened dictionary

v1, v2, v3 = (extracted[k] for k in ('V1', 'V2', 'V3'))
assert (v1, v2, v3) == (1, 2, 3)
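
There is no `dict_unpack` module today, so here is a minimal sketch of how such a function could be written in pure Python. The names `dict_unpack` and `pattern` follow the example above, but the implementation is only an assumption about the intended behaviour.

```python
class pattern:
    """Marks a place in the schema whose value should be captured by name."""

    def __init__(self, name):
        self.name = name


def dict_unpack(data, schema):
    """Walk `schema` alongside `data`, collecting captured values in a flat dict."""
    captured = {}
    for key, sub in schema.items():
        value = data[key]            # KeyError if the shape does not match
        if isinstance(sub, pattern):
            captured[sub.name] = value
        else:
            captured.update(dict_unpack(value, sub))
    return captured


some_dict = {'a': {'b': {'c': 1}, 'd': 2}, 'e': 3}
schema = {'a': {'b': {'c': pattern('V1')}, 'd': pattern('V2')}, 'e': pattern('V3')}
extracted = dict_unpack(some_dict, schema)
```

Keys in the data that the schema does not mention are simply ignored, which plays the role of `**_` in the syntax proposal.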


As for Steve's confusion,

> > {key: value_pattern, **_} = {key: value, **_}

> If I saw that, I would have no idea what it could even possibly do.
> Let's pick the simplest concrete example I can think of:
>
> {'A': 1, **{}} = {'A': 0, **{}}
>
> I cannot interpret what that should do. Is it some sort of
> pattern-matching? An update? What is the result? It is obviously some
> sort of binding operation, an assignment, but an assignment to what?

{'A': 1, **{}} = {'A': 0, **{}} should simply be an error, because for any k-v
pair on the LHS, the key should be an expression and the value is the unpacking
target. {'A': [*a, b]} = {'A': [1, 2,  3]} is welcome, but {'A': 1} = {'A': '1'}
is more like pattern matching, which is outside our topic.

Anyway, this feature will not come true, let's forget it...


I think Jacco is totally correct in the following words.

> I think most of these problems could be solved with pop and the
> occasional list comprehension like this:
>
> a, b, c = [{'a':1,'b':2,'c':3}.pop(key) for key in ('a', 'b', 'c')]
>
> or for your example:
>
> c =  {'a': 1, **{'b': 2}}  # I suppose this one would generally
>  # be dynamic, but I need a name here.
> a, b = [c.pop(key) for key in ('a', 'b')]
>
> would extract all the keys you need, and has the advantage that
> you don't need hardcoded dict structure if you expand it to nested
> dicts. It's even less writing, and just as extensible to nested dicts.
> And if you don't actually want to destruct (tuples and lists aren't
> destroyed either), just use __getitem__ access instead of pop.

But pop cannot work for a nested case.

Feel free to end this topic.

thautwarm



2018-04-10 23:20 GMT+08:00 :

> Today's Topics:
>
>1. Re: Is there any idea about dictionary destructing?
>   (Steven D'Aprano)
>2. Re: Is there any idea about dictionary destructing?
>   (Jacco van Dorp)
>3. Re: Start argument for itertools.accumulate() [Was: Proposal:
>   A Reduce-Map Comprehension and a "last" builtin] (Guido van Rossum)
>4. Re: Is there any idea about dictionary destructing?
>   (Guido van Rossum)
>
>
> -- Forwarded message --
> From: "Steven D'Aprano" 
> To: python-ideas@python.org
> Cc:
> Bcc:
> Date: Tue, 10 Apr 2018 19:21:35 +1000
> Subject: Re: [Python-ideas] Is there any idea about dictionary destructing?
> On Tue, Apr 10, 2018 at 03:29:08PM +0800, Thautwarm Zhao wrote:
>
> > I'm focused on the consistency of the language itself.
>
> Consistency is good, but it is not the only factor to consider. We must
> guard against *foolish* consistency: adding features just for the sake
> of matching some other, often barely related, feature. Each feature must
> justify itself, and consistency with something else is merely one
> possible attempt at justification.
>
>
> > {key: value_pattern, **_} = {key: value, **_}
>
> If I saw that, I would have no idea what it could even possibly do.
> Let's pick the simplest concrete example I can think of:
>
> {'A': 1, **{}} = {'A': 0, **{}}
>
> I cannot interpret what that should do. Is it some sort of
> pattern-matching? An update? What is the result? It is obviously some
> sort of binding operation, an assignment, but an assignment to what?
>
> Sequence binding and unpacking was obvious the first time I saw it. I
> had no problem guessing what:
>
> a, b, c = 1, 2, 3
>
> meant, and once I had seen that, it wasn't hard to guess what
>
> a, b, c = *sequence
>
> meant. From there it is easy to predict extended unpacking. But I can't
> say the same for this.
>
> I can almost see the point of:
>
> a, b, c, = **{'a': 1, 'b': 2, 'c': 3}
>
> but I'm having trouble thinking of a situation where I would actually
> use it. But your syntax above just confuses me.
>
>
> > The reason why it's important is that, when 

Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Steven D'Aprano
On Wed, Apr 11, 2018 at 02:21:17PM +1000, Chris Angelico wrote:

[...]
> > Yes, it will double the number of files. Actually quadruple it, if the
> > annotations and line numbers are in separate files too. But if most of
> > those extra files never need to be opened, then there's no cost to them.
> > And whatever extra cost there is, is amortized over the lifetime of the
> > interpreter.
> 
> Yes, if they are actually not needed. My question was about whether
> that is truly valid.

We're never really going to know the effect on performance without 
implementing and benchmarking the code. It might turn out that, to our 
surprise, three quarters of the std lib relies on loading docstrings 
during startup. But I doubt it.


> Consider a very common use-case: an OS-provided
> Python interpreter whose files are all owned by 'root'. Those will be
> distributed with .pyc files for performance, but you don't want to
> deprive the users of help() and anything else that needs docstrings
> etc. So... are the docstrings lazily loaded or eagerly loaded?

What relevance is that they're owned by root?


> If eagerly, you've doubled the number of file-open calls to initialize
> the interpreter.

I do not understand why you think this is even an option. Has Serhiy 
said something that I missed that makes this seem to be on the table? 
That's not a rhetorical question -- I may have missed something. But I'm 
sure he understands that doubling or quadrupling the number of file 
operations during startup is not an optimization.


> (Or quadrupled, if you need annotations and line
> numbers and they're all separate.) If lazily, things are a lot more
> complicated than the original description suggested, and there'd need
> to be some semantic changes here.

What semantic change do you expect?

There's an implementation change, of course, but that's Serhiy's problem 
to deal with and I'm sure that he has considered that. There should be 
no semantic change. When you access obj.__doc__, then and only then are 
the compiled docstrings for that module read from the disk.

I don't know the current implementation of .pyc files, but I like 
Antoine's suggestion of laying it out in four separate areas (plus 
header), each one marshalled:

code
docstrings
annotations
line numbers

Aside from code, which is mandatory, the three other sections could be 
None to represent "not available", as is the case when you pass -OO to 
the interpreter, or they could be some other sentinel that means "load 
lazily from the appropriate file", or they could be the marshalled data 
directly in place to support byte-code only libraries.
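
To make the layout concrete, here is a minimal sketch. It is not the
real .pyc format: the opaque header, the 4-byte length prefix, the use
of Ellipsis as the "load lazily" marker, and the names dump_sections
and load_sections are all assumptions made up for illustration.

```python
import marshal
import struct

# Section order taken from the list above.
SECTIONS = ("code", "docstrings", "annotations", "line_numbers")

# Hypothetical sentinel meaning "load lazily from the appropriate file".
# None keeps its meaning of "not available".
LAZY = Ellipsis


def dump_sections(header, sections):
    """Return header bytes followed by four length-prefixed marshalled sections."""
    parts = [header]
    for name in SECTIONS:
        blob = marshal.dumps(sections[name])
        parts.append(struct.pack("<I", len(blob)))  # assumed 4-byte size prefix
        parts.append(blob)
    return b"".join(parts)


def load_sections(data, header_size):
    """Split data back into the header and a dict of unmarshalled sections."""
    header = data[:header_size]
    offset = header_size
    sections = {}
    for name in SECTIONS:
        (size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        sections[name] = marshal.loads(data[offset:offset + size])
        offset += size
    return header, sections
```

With a size prefix on each section, a reader that only wants the code
could step over the other three without unmarshalling them.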

As for the in-memory data structures of objects themselves, I imagine 
something like the __doc__ and __annotations__ slots pointing to a table 
of strings, which is not initialised until you attempt to read from the 
table. Or something -- don't pay too much attention to my wild guesses.
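
As a pure-Python illustration of that guess (the real thing would live
in C slots, and LazyDocTable and Stub are hypothetical names), the
table could stay empty until the first lookup and fall back to None
when the file is missing:

```python
import marshal


class LazyDocTable:
    """Table of docstrings that is read from disk on the first lookup only."""

    def __init__(self, path):
        self.path = path
        self._table = None  # not initialised until something reads from it
        self.loads = 0      # number of load attempts, for demonstration

    def get(self, qualname):
        if self._table is None:
            self.loads += 1
            try:
                with open(self.path, "rb") as f:
                    self._table = marshal.load(f)
            except FileNotFoundError:
                # No docstring file: every lookup yields None.
                self._table = {}
        return self._table.get(qualname)


class Stub:
    """Stand-in for an object whose docstring lives in a LazyDocTable."""

    def __init__(self, qualname, table):
        self.qualname = qualname
        self._table = table

    @property
    def __doc__(self):
        # The table, and hence the file, is only touched here.
        return self._table.get(self.qualname)
```

Creating any number of Stub objects opens no files; the first read of
__doc__ on any of them loads the table once, and later reads reuse it.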

The bottom line is, is there some reason *aside from performance* to 
avoid this? Because if the performance is worse, I'm sure Serhiy will be 
the first to dump this idea.


-- 
Steve