Re: [Python-Dev] Sets, Dictionaries

2018-03-28 Thread Hasan Diwan
Hi, Julia,

On 28 March 2018 at 21:14, Julia Kim  wrote:
>
> My suggestion is to change the syntax for creating an empty set and an
> empty dictionary as follows.
>

You should craft your suggestion as a PEP and send it to the python-ideas
mailing list. Good luck! -- H

-- 
OpenPGP:
https://sks-keyservers.net/pks/lookup?op=get=0xFEBAD7FFD041BBA1
If you wish to request my time, please do so using
http://bit.ly/hd1ScheduleRequest.

Sent from my mobile device
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Sets, Dictionaries

2018-03-28 Thread Julia Kim
Hi,

My name is Julia Hiyeon Kim.

My suggestion is to change the syntax for creating an empty set and an empty
dictionary as follows.

an_empty_set = {}
an_empty_dictionary = {:}

It would seem to make more sense.
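
For context, here is a minimal sketch of how these two objects are spelled in current Python (the variable names are illustrative only). Today `{}` already means an empty dictionary and an empty set has to be written `set()`, so the proposal would change the meaning of existing code.

```python
# Current behaviour: {} is an empty dict; there is no literal for an empty set.
empty_dict = {}
empty_set = set()

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# Non-empty literals are unambiguous: a colon marks a dict, its absence a set.
print(type({1: "one"}).__name__)  # dict
print(type({1, 2}).__name__)      # set
```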


Warm regards,
Julia Kim


Re: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Terry Reedy

On 3/28/2018 9:15 PM, Nathaniel Smith wrote:


> There's obviously some tension here between pickle's use as a
> persistent storage format, and its use as a transient wire format. For
> the former, you definitely can't store code objects because there's no
> forwards- or backwards-compatibility guarantee for bytecode. But for
> the latter, transmitting bytecode is totally fine, because all you
> care about is whether it can be decoded once, right now, by some peer
> process whose python version you can control -- that's why cloudpickle
> exists.


An interesting observation.  IDLE compiles user code in the user process 
to check for syntax errors.  idlelib.rpc subclasses Pickler to pickle 
the resulting code objects via marshal.dumps so it can send them to the 
user code execution subprocess.
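
A rough sketch of the general idea, not IDLE's actual code: a Pickler subclass can route code objects through marshal by registering a reducer in its dispatch table. The names `CodePickler`, `_reduce_code` and `dumps_with_code` are made up for this illustration.

```python
import copyreg
import io
import marshal
import pickle
import types


def _reduce_code(code):
    # Hypothetical reducer: rebuild the code object with marshal.loads().
    return marshal.loads, (marshal.dumps(code),)


class CodePickler(pickle.Pickler):
    # A per-class dispatch table, as described in the pickle documentation.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[types.CodeType] = _reduce_code


def dumps_with_code(obj):
    buffer = io.BytesIO()
    CodePickler(buffer).dump(obj)
    return buffer.getvalue()


code = compile("x = 6 * 7", "<user code>", "exec")
restored = pickle.loads(dumps_with_code(code))

namespace = {}
exec(restored, namespace)
print(namespace["x"])  # 42
```

As the quoted text points out, this only makes sense when both processes run the same Python version, because the marshal format for bytecode is not stable across versions.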



> Would it make sense to have a special pickle version that the
> transient wire format users could opt into, that only promises
> compatibility within a given 3.X release cycle? Like version=-2 or
> version=pickle.NONPORTABLE or something?
>
> (This is orthogonal to Antoine's PEP.)


--
Terry Jan Reedy



Re: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Robert Collins
One question..

On Thu., 29 Mar. 2018, 07:42 Antoine Pitrou,  wrote:

> ...
>

> Mutability
> ----------
>
> PEP 3118 buffers [#pep-3118]_ can be readonly or writable.  Some objects,
> such as Numpy arrays, need to be backed by a mutable buffer for full
> operation.  Pickle consumers that use the ``buffer_callback`` and
> ``buffers``
> arguments will have to be careful to recreate mutable buffers.  When doing
> I/O, this implies using buffer-passing API variants such as ``readinto``
> (which are also often preferable for performance).
>
> Data sharing
> ------------
>
> If you pickle and then unpickle an object in the same process, passing
> out-of-band buffer views, then the unpickled object may be backed by the
> same buffer as the original pickled object.
>
> For example, it might be reasonable to implement reduction of a Numpy array
> as follows (crucial metadata such as shapes is omitted for simplicity)::
>
>    class ndarray:
>
>       def __reduce_ex__(self, protocol):
>          if protocol == 5:
>             return numpy.frombuffer, (PickleBuffer(self), self.dtype)
>          # Legacy code for earlier protocols omitted
>
> Then simply passing the PickleBuffer around from ``dumps`` to ``loads``
> will produce a new Numpy array sharing the same underlying memory as the
> original Numpy object (and, incidentally, keeping it alive)::


This seems incompatible with v4 semantics. There, a loads plus dumps
combination is approximately a deep copy. This isn't. Sometimes. Sometimes
it is.
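
To make the v4 behaviour concrete, here is a small sketch (the variable names are illustrative) showing that a dumps/loads round trip with protocol 4 yields an independent copy of a mutable buffer:

```python
import pickle

original = {"payload": bytearray(b"abc")}
clone = pickle.loads(pickle.dumps(original, protocol=4))

# Mutating the round-tripped object leaves the original untouched.
clone["payload"][0] = ord("X")

print(original["payload"])  # bytearray(b'abc')
print(clone["payload"])     # bytearray(b'Xbc')
```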

Other than that, I like it.

Rob

>
>


Re: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Nathaniel Smith
On Wed, Mar 28, 2018 at 1:03 PM, Serhiy Storchaka  wrote:
> 28.03.18 21:39, Antoine Pitrou wrote:
>> I'd like to submit this PEP for discussion.  It is quite specialized
>> and the main target audience of the proposed changes is
>> users and authors of applications/libraries transferring large amounts
>> of data (read: the scientific computing & data science ecosystems).
>
> Currently I'm working on porting some features from cloudpickle to the
> stdlib. For those that can't or shouldn't be implemented in a
> general-purpose library (like serializing local functions by serializing
> their code objects, because that is not portable), I want to add hooks that
> would allow cloudpickle to implement them using an official API. This would
> allow cloudpickle to use the C implementation of the pickler and unpickler.

There's obviously some tension here between pickle's use as a
persistent storage format, and its use as a transient wire format. For
the former, you definitely can't store code objects because there's no
forwards- or backwards-compatibility guarantee for bytecode. But for
the latter, transmitting bytecode is totally fine, because all you
care about is whether it can be decoded once, right now, by some peer
process whose python version you can control -- that's why cloudpickle
exists.

Would it make sense to have a special pickle version that the
transient wire format users could opt into, that only promises
compatibility within a given 3.X release cycle? Like version=-2 or
version=pickle.NONPORTABLE or something?

(This is orthogonal to Antoine's PEP.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim Delaney ]
> ...
> If I'm not mistaken, #3 would result in the optimiser changing str.format()
> into an f-string in-place. Is this correct? We're not talking here about
> people manually changing the code from str.format() to f-strings, right?

All correct.  It's a magical transformation from one spelling to another.


> I would argue that any optimisation needs to have the same semantics as the
> original code - in this case, that all arguments are evaluated before the
> string is formatted.

That's why Serhiy is asking about it - there _are_ potentially visible
changes in behavior under all but one of his suggestions.


> I also assumed (not having actually used an f-string) that all its
> formatting arguments were evaluated before formatting.

It's a string - it doesn't have "arguments" as such.  For example:

def f(a, b, n):
    return f"{a+b:0{n}b}"  # the leading "f" makes it an f-string

Then

>>> f(2, 3, 12)
'000000000101'

The generated code currently interleaves evaluating expressions with
formatting the results in a more-than-less obvious way, waiting until
the end to paste all the formatted fragments together.  As shown in
the example, this can be more than one level deep (the example needs
to paste together "0", str(n), and "b" to _build_ the format code for
`a+b`).
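
As a sketch of what that nesting amounts to, the function below spells out the same steps by hand; `f_spelled_out` is a made-up name, and this illustrates the behaviour rather than the code CPython actually generates.

```python
def f(a, b, n):
    return f"{a+b:0{n}b}"


def f_spelled_out(a, b, n):
    total = a + b                 # evaluate the outer expression
    spec = "0" + format(n) + "b"  # build the format code, e.g. "012b"
    return format(total, spec)    # format the value with that code


print(f(2, 3, 12))              # 000000000101
print(f_spelled_out(2, 3, 12))  # 000000000101
```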


> So my preference would be (if my understanding in the first line is
> correct):
>
> 1: +0

That's the only suggestion with no potentially visible changes.  I'll
add another:  leave `.format()` alone entirely - there's no _need_ to
"optimize" it, it's just a maybe-nice-to-have.


> 2a: +0.5
> 2b: +1

Those two don't change the behaviors of `.format()`, but _do_ change
some end-case behaviors of f-strings.  If you're overly ;-) concerned
about the former, it would be consistent to be overly concerned about
the latter too.


> 3: -1

And that's the one that proposes to let .format() also interleave
expression evaluation (but still strictly "left to right") with
formatting.

If it were a general code transformation, I'm sure everyone would be
-1.  As is, it's hard to care.  String formatting is a tiny area, and
format methods are generally purely functional (no side effects).  If
anyone has a non-contrived example where the change would make a lick
of real difference, they shouldn't be shy about posting it :-)  I
looked, and can't find any in my code.


[Python-Dev] [RELEASE] Python 3.6.5 is now available

2018-03-28 Thread Ned Deily
Python 3.6.5 is now available. 3.6.5 is the fifth maintenance release of
Python 3.6, which was initially released in 2016-12 to great interest.
Detailed information about the changes made in 3.6.5 can be found in its
change log. You can find Python 3.6.5 and more information here:
https://www.python.org/downloads/release/python-365/

See the "What’s New In Python 3.6" document for more information about
features included in the 3.6 series. Detailed information about the
changes made in 3.6.5 can be found in the change log here:
https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-5-final

Attention macOS users: as of 3.6.5, there is a new additional installer
variant for macOS 10.9+ that includes a built-in version of Tcl/Tk 8.6.
This variant is expected to become the default variant in future
releases. Check it out!

The next maintenance release is expected to follow in about 3 months,
around the end of 2018-06.

Thanks to all of the many volunteers who help make Python Development and
these releases possible! Please consider supporting our efforts by
volunteering yourself or through organization contributions to the Python
Software Foundation:
https://www.python.org/psf/

--
  Ned Deily
  n...@python.org -- []



Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Delaney
On 29 March 2018 at 08:09, Tim Delaney  wrote:

> On 29 March 2018 at 07:39, Eric V. Smith  wrote:
>
>> I’d vote #3 as well.
>>
>> > On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka 
>> wrote:
>> >
>> > There is a subtle semantic difference between str.format() and an
>> > "equivalent" f-string.
>> >
>> >    '{}{}'.format(a, b)
>> >    f'{a}{b}'
>> >
>> > In most cases this doesn't matter, but when implementing the optimization
>> > that transforms the former expression into the latter one ([1], [2]) we
>> > have to make a decision what to do with this difference.
>>
>
Sorry about that - finger slipped and I sent an incomplete email ...

If I'm not mistaken, #3 would result in the optimiser changing str.format()
into an f-string in-place. Is this correct? We're not talking here about
people manually changing the code from str.format() to f-strings, right?

I would argue that any optimisation needs to have the same semantics as the
original code - in this case, that all arguments are evaluated before the
string is formatted.

I also assumed (not having actually used an f-string) that all its
formatting arguments were evaluated before formatting.
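
For anyone who wants to check the ordering directly, here is a small sketch that records when each value is evaluated and when it is formatted. `Traced`, `make` and `events` are illustrative names only.

```python
events = []


class Traced:
    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        events.append(f"format {self.name}")
        return self.name


def make(name):
    events.append(f"evaluate {name}")
    return Traced(name)


events.clear()
_ = "{}{}".format(make("a"), make("b"))
format_order = list(events)

events.clear()
_ = f"{make('a')}{make('b')}"
fstring_order = list(events)

print(format_order)   # ['evaluate a', 'evaluate b', 'format a', 'format b']
print(fstring_order)  # ['evaluate a', 'format a', 'evaluate b', 'format b']
```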

So my preference would be (if my understanding in the first line is
correct):

1: +0
2a: +0.5
2b: +1
3: -1

Tim Delaney


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Delaney
On 29 March 2018 at 07:39, Eric V. Smith  wrote:

> I’d vote #3 as well.
>
> > On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka 
> wrote:
> >
> > There is a subtle semantic difference between str.format() and an
> > "equivalent" f-string.
> >
> >    '{}{}'.format(a, b)
> >    f'{a}{b}'
> >
> > In most cases this doesn't matter, but when implementing the optimization
> > that transforms the former expression into the latter one ([1], [2]) we
> > have to make a decision what to do with this difference.
>


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Eric V. Smith
I’d vote #3 as well. 

--
Eric

> On Mar 28, 2018, at 11:27 AM, Serhiy Storchaka  wrote:
> 
> There is a subtle semantic difference between str.format() and an "equivalent"
> f-string.
>
>    '{}{}'.format(a, b)
>    f'{a}{b}'
>
> In the former case b is evaluated before formatting a. This is equivalent to
>
>    t1 = a
>    t2 = b
>    t3 = format(t1)
>    t4 = format(t2)
>    r = t3 + t4
>
> In the latter case a is formatted before evaluating b. This is equivalent to
>
>    t1 = a
>    t2 = format(t1)
>    t3 = b
>    t4 = format(t3)
>    r = t2 + t4
>
> In most cases this doesn't matter, but when implementing the optimization that
> transforms the former expression into the latter one ([1], [2]) we have to
> make a decision what to do with this difference.
>
> 1. Keep the exact semantics of str.format() when optimizing it. This means that
> it should be transformed into an AST node different from the AST node used for
> f-strings. Either introduce a new AST node type, or add a boolean flag to
> JoinedStr.
>
> 2. Change the semantics of f-strings. Make them closer to the semantics of
> str.format(): evaluate all subexpressions first, then format them. This can be
> implemented in two ways:
>
> 2a) Add additional instructions for stack manipulations. This will slow down
> f-strings.
>
> 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and
> BUILD_STRING. This will speed up f-strings.
>
> 3. Transform str.format() into an f-string, changing the semantics, and ignore
> this change. This is not new. The optimizer already changes semantics.
> Non-optimized "if a and True:" would call bool(a) twice, but optimized code
> calls it only once.
> 
> [1] https://bugs.python.org/issue28307
> [2] https://bugs.python.org/issue28308
> 



Re: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Antoine Pitrou
On Wed, 28 Mar 2018 23:03:08 +0300
Serhiy Storchaka  wrote:
> 28.03.18 21:39, Antoine Pitrou wrote:
>  > I'd like to submit this PEP for discussion.  It is quite specialized
>  > and the main target audience of the proposed changes is
>  > users and authors of applications/libraries transferring large amounts
>  > of data (read: the scientific computing & data science ecosystems).
>
> Currently I'm working on porting some features from cloudpickle to the
> stdlib. For those that can't or shouldn't be implemented in a
> general-purpose library (like serializing local functions by serializing
> their code objects, because that is not portable), I want to add hooks that
> would allow cloudpickle to implement them using an official API. This
> would allow cloudpickle to use the C implementation of the pickler and
> unpickler.

Yes, that's something that would benefit a lot of people.
For the record, here are my notes on the topic:
https://github.com/cloudpipe/cloudpickle/issues/58#issuecomment-339751408

> It is well known that pickle is unsafe. Unpickling untrusted data can
> cause arbitrary code to be executed. It is less well known that unpickling
> can be made safe by controlling the resolution of global names in a custom
> Unpickler.find_class(). I want to provide helpers which would help
> implement safe unpickling by specifying just whitelists of globals
> and attributes.

I'm not sure how safe that would be, because 1) there may be other
attack vectors, and 2) it's difficult to predict which functions are
entirely safe for calling.  I think the best way to make pickles safe
is to cryptographically sign them so that they cannot be forged by an
attacker.
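
A minimal sketch of that signing idea using the standard `hmac` module. The helper names `signed_dumps` and `verified_loads`, the tag-then-payload layout and the key are assumptions made for illustration, not an established API.

```python
import hashlib
import hmac
import pickle

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes


def signed_dumps(obj, key):
    # Prepend an HMAC-SHA256 tag computed over the pickle payload.
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verified_loads(blob, key):
    # Verify the tag before handing any bytes to pickle.loads().
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


key = b"shared-secret-for-illustration-only"
blob = signed_dumps({"answer": 42}, key)
print(verified_loads(blob, key))  # {'answer': 42}

tampered = blob[:-1] + bytes([blob[-1] ^ 1])
try:
    verified_loads(tampered, key)
except ValueError as exc:
    print("rejected:", exc)
```

This only protects against forgery by parties that do not hold the key; a peer that does hold the key can still send a malicious pickle.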

> This work is not finished yet, but I think it is worth including
> in protocol 5 if some features will need a protocol version bump.

Agreed.  Do you know by which timeframe you'll know which opcodes you
want to add?

Regards

Antoine.




Re: [Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Serhiy Storchaka

28.03.18 21:39, Antoine Pitrou wrote:
> I'd like to submit this PEP for discussion.  It is quite specialized
> and the main target audience of the proposed changes is
> users and authors of applications/libraries transferring large amounts
> of data (read: the scientific computing & data science ecosystems).

Currently I'm working on porting some features from cloudpickle to the
stdlib. For those that can't or shouldn't be implemented in a
general-purpose library (like serializing local functions by serializing
their code objects, because that is not portable), I want to add hooks that
would allow cloudpickle to implement them using an official API. This
would allow cloudpickle to use the C implementation of the pickler and
unpickler.


There is a private module, _compat_pickle, for supporting compatibility of
moved stdlib classes with Python 2. I'm going to provide a public API that
would allow third-party libraries to support compatibility for moved
classes and functions. This could also help to support classes and
functions moved in the stdlib after 3.0.


It is well known that pickle is unsafe. Unpickling untrusted data can
cause arbitrary code to be executed. It is less well known that unpickling
can be made safe by controlling the resolution of global names in a custom
Unpickler.find_class(). I want to provide helpers which would help
implement safe unpickling by specifying just whitelists of globals
and attributes.
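
A sketch of the find_class() technique as it can be written today, modelled on the pattern in the pickle documentation. The whitelist contents and the names `ALLOWED_GLOBALS`, `RestrictedUnpickler` and `restricted_loads` are illustrative; the proposed helpers themselves are not shown here.

```python
import builtins
import io
import pickle

# Illustrative whitelist of (module, name) pairs that may be resolved.
ALLOWED_GLOBALS = {
    ("builtins", "range"),
    ("builtins", "complex"),
    ("builtins", "set"),
    ("builtins", "frozenset"),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only whitelisted globals; refuse everything else.
        if (module, name) in ALLOWED_GLOBALS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(3)])))  # [1, 2, range(0, 3)]

try:
    restricted_loads(pickle.dumps(eval))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```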


This work is not finished yet, but I think it is worth including
in protocol 5 if some features will need a protocol version bump.




Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Paul Moore
On 28 March 2018 at 20:12, Serhiy Storchaka  wrote:
> 28.03.18 22:05, Paul Moore wrote:
>
>> I can't imagine (non-contrived) code where the fact that a is
>> formatted before b is evaluated would matter, so I'm fine with option
>> 3.
>
> If formatting a and evaluating b both raise exceptions, the resulting
> exception depends on the order.
>
> $ ./python -bb
> >>> a = b'bytes'
> >>> '{}{}'.format(a, b)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'b' is not defined
> >>> f'{a}{b}'
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> BytesWarning: str() on a bytes instance

Thanks, I hadn't thought of that. But I still say that code that
depends on which exception was raised is "contrived".

Anyway, Guido said #3, so no reason to debate it any further :-)

Paul


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Stefan Behnel
Serhiy Storchaka wrote on 28.03.2018 at 17:27:
> There is a subtle semantic difference between str.format() and "equivalent"
> f-string.
> 
>     '{}{}'.format(a, b)
>     f'{a}{b}'
> 
> In the former case b is evaluated before formatting a. This is equivalent to
> 
>     t1 = a
>     t2 = b
>     t3 = format(t1)
>     t4 = format(t2)
>     r = t3 + t4
> 
> In the latter case a is formatted before evaluating b. This is equivalent to
> 
>     t1 = a
>     t2 = format(t1)
>     t3 = b
>     t4 = format(t3)
>     r = t2 + t4
> 
> In most cases this doesn't matter, but when implementing the optimization that
> transforms the former expression into the latter one ([1], [2]) we have
> to make a decision what to do with this difference.

I agree that it's not normally a problem, but if the formatting of 'a'
fails and raises an exception, then 'b' will not get evaluated at all in
the second case. Whether this difference is subtle or not seems to
depend largely on the code at hand.
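
A small sketch of that failure case, with made-up names (`Broken`, `b`, `evaluated`): when formatting the first value raises, str.format() has already evaluated the second argument, while the f-string never gets to it.

```python
evaluated = []


class Broken:
    def __format__(self, spec):
        raise ValueError("cannot format")


def b():
    evaluated.append("b")
    return "b"


try:
    "{}{}".format(Broken(), b())
except ValueError:
    pass
after_format = list(evaluated)

evaluated.clear()
try:
    f"{Broken()}{b()}"
except ValueError:
    pass
after_fstring = list(evaluated)

print(after_format)   # ['b']
print(after_fstring)  # []
```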

Stefan



Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Serhiy Storchaka

28.03.18 22:04, Guido van Rossum wrote:

> Yes, #3, and what Tim says.

Thank you. This helps a lot.


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Serhiy Storchaka

28.03.18 22:05, Paul Moore wrote:

> I can't imagine (non-contrived) code where the fact that a is
> formatted before b is evaluated would matter, so I'm fine with option
> 3.

If formatting a and evaluating b both raise exceptions, the resulting
exception depends on the order.

    $ ./python -bb
    >>> a = b'bytes'
    >>> '{}{}'.format(a, b)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'b' is not defined
    >>> f'{a}{b}'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BytesWarning: str() on a bytes instance




Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Paul Moore
On 28 March 2018 at 19:44, Serhiy Storchaka  wrote:
> 28.03.18 19:20, Guido van Rossum wrote:
>
>> Hm, without thinking too much about it I'd say it's okay to change the
>> evaluation order.
>
> You mean option 3, right? This is the simplest option. I have already
> written a PR for optimizing old-style formatting [1], but have not merged it
> yet due to this change of semantics.

I can't imagine (non-contrived) code where the fact that a is
formatted before b is evaluated would matter, so I'm fine with option
3.

Paul


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Guido van Rossum
Yes, #3, and what Tim says.

On Wed, Mar 28, 2018, 11:44 Serhiy Storchaka  wrote:

> 28.03.18 19:20, Guido van Rossum wrote:
>
> > Hm, without thinking too much about it I'd say it's okay to change the
> > evaluation order.
>
> You mean option 3, right? This is the simplest option. I have
> already written a PR for optimizing old-style formatting [1], but have not
> merged it yet due to this change of semantics.
>
> > Can these optimizations be disabled with something like -O0?
>
> Currently there is no way to disable optimizations. There is an open
> issue with a request for this. [2]
>
> [1] https://github.com/python/cpython/pull/5012
> [2] https://bugs.python.org/issue2506
>
>


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Serhiy Storchaka

28.03.18 21:30, Tim Peters wrote:

> [Tim]
>> I have a hard time imagining how that could have come to be, but if it's
>> true I'd say the unoptimized code was plain wrong.  The dumbest
>> possible way to implement `f() and g()` is also the correct ;-) way:
>>
>> result = f()
>> if not bool(result):
>>     result = g()
>
> Heh - that's entirely wrong, isn't it?  That's how `or` is implemented ;-)
>
> Same top-level point, though:
>
> result = f()
> if bool(result):
>     result = g()

Optimized

    if f() and g():
        spam()

is equivalent to

    result = f()
    if bool(result):
        result = g()
        if bool(result):
            spam()

Without optimization it would be equivalent to

    result = f()
    if bool(result):
        result = g()
    if bool(result):
        spam()

It calls bool() for the result of f() twice if it is false.

Thus there is a small difference between

    if f() and g():
        spam()

and

    tmp = f() and g()
    if tmp:
        spam()
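
The difference can be observed with a small sketch that counts `__bool__` calls. `CountingBool`, `direct` and `via_temporary` are illustrative names, and the counts in the comments describe CPython's behaviour rather than anything the language guarantees.

```python
class CountingBool:
    def __init__(self, truth):
        self.truth = truth
        self.calls = 0

    def __bool__(self):
        self.calls += 1
        return self.truth


def direct(x, y):
    if x and y:
        return "taken"
    return "skipped"


def via_temporary(x, y):
    tmp = x and y
    if tmp:
        return "taken"
    return "skipped"


first = CountingBool(False)
direct(first, True)

second = CountingBool(False)
via_temporary(second, True)

print(first.calls)   # 1 on CPython: the falsy operand is tested once
print(second.calls)  # 2 on CPython: once for `and`, once more for `if tmp`
```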



Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim]
> Same top-level point, though: [for evaluating `f() and g()`]:
>
> result = f()
> if bool(result):
>     result = g()

Ah, I think I see your point now.  In the _context_ of `if f() and
g()`, the dumbest possible code generation would do the above, and
then go on to do

if bool(result):


If in fact `f()` returned a false-like value, an optimizer could note
that `bool(result)` had already been evaluated and skip the redundant
evaluation.  I think that's fine either way:  what the language
guarantees is that `f()` will be evaluated exactly once, and `g()` no
more than once, and that's all so regardless.
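
That guarantee is easy to check with a sketch that counts calls; `f`, `g`, `calls` and `run` here are illustrative stand-ins.

```python
calls = {"f": 0, "g": 0}


def f(value):
    calls["f"] += 1
    return value


def g(value):
    calls["g"] += 1
    return value


def run(f_value, g_value):
    # Reset the counters, evaluate `f() and g()` in an `if`, report the counts.
    calls["f"] = calls["g"] = 0
    if f(f_value) and g(g_value):
        pass
    return dict(calls)


print(run(0, 1))  # {'f': 1, 'g': 0}: g() is skipped when f() is falsy
print(run(1, 0))  # {'f': 1, 'g': 1}
```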


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Serhiy Storchaka

28.03.18 19:20, Guido van Rossum wrote:

> Hm, without thinking too much about it I'd say it's okay to change the
> evaluation order.

You mean option 3, right? This is the simplest option. I have
already written a PR for optimizing old-style formatting [1], but have not
merged it yet due to this change of semantics.

> Can these optimizations be disabled with something like -O0?

Currently there is no way to disable optimizations. There is an open
issue with a request for this. [2]


[1] https://github.com/python/cpython/pull/5012
[2] https://bugs.python.org/issue2506



[Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

2018-03-28 Thread Antoine Pitrou

Hi,

I'd like to submit this PEP for discussion.  It is quite specialized
and the main target audience of the proposed changes is
users and authors of applications/libraries transferring large amounts
of data (read: the scientific computing & data science ecosystems).

https://www.python.org/dev/peps/pep-0574/

The PEP text is also inlined below.

Regards

Antoine.



PEP: 574
Title: Pickle protocol 5 with out-of-band data
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Mar-2018
Post-History:
Resolution:


Abstract
========

This PEP proposes to standardize a new pickle protocol version, and
accompanying APIs to take full advantage of it:

1. A new pickle protocol version (5) to cover the extra metadata needed
   for out-of-band data buffers.
2. A new ``PickleBuffer`` type for ``__reduce_ex__`` implementations
   to return out-of-band data buffers.
3. A new ``buffer_callback`` parameter when pickling, to handle out-of-band
   data buffers.
4. A new ``buffers`` parameter when unpickling to provide out-of-band data
   buffers.

The PEP guarantees unchanged behaviour for anyone not using the new APIs.
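
As a rough sketch of how the four pieces fit together, the example below uses the API as it later shipped in Python 3.8 (`pickle.PickleBuffer`, `buffer_callback`, `buffers`), which is an assumption relative to this draft. It passes a bytearray out-of-band and shows that the unpickled object shares memory with the original.

```python
import pickle

payload = bytearray(b"large binary payload")
buffers = []

# The callback collects out-of-band buffers instead of copying them
# into the pickle stream (list.append returns None, meaning out-of-band).
data = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)
print(b"large binary payload" in data)  # False: the payload is not in the stream

# The consumer must supply the same buffers, in order, when unpickling.
restored = pickle.loads(data, buffers=buffers)

with memoryview(restored) as view:
    view[0:5] = b"LARGE"  # write through the unpickled object

print(payload)  # bytearray(b'LARGE binary payload')
```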


Rationale
=========

The pickle protocol was originally designed in 1995 for on-disk persistency
of arbitrary Python objects.  The performance of a 1995-era storage medium
probably made it irrelevant to focus on performance metrics such as
use of RAM bandwidth when copying temporary data before writing it to disk.

Nowadays the pickle protocol sees a growing use in applications where most
of the data isn't ever persisted to disk (or, when it is, it uses a portable
format instead of Python-specific).  Instead, pickle is being used to transmit
data and commands from one process to another, either on the same machine
or on multiple machines.  Those applications will sometimes deal with very
large data (such as Numpy arrays or Pandas dataframes) that need to be
transferred around.  For those applications, pickle is currently
wasteful as it imposes spurious memory copies of the data being serialized.

As a matter of fact, the standard ``multiprocessing`` module uses pickle
for serialization, and therefore also suffers from this problem when
sending large data to another process.

Third-party Python libraries, such as Dask [#dask]_, PyArrow [#pyarrow]_
and IPyParallel [#ipyparallel]_, have started implementing alternative
serialization schemes with the explicit goal of avoiding copies on large
data.  Implementing a new serialization scheme is difficult and often
leads to reduced generality (since many Python objects support pickle
but not the new serialization scheme).  Falling back on pickle for
unsupported types is an option, but then you get back the spurious
memory copies you wanted to avoid in the first place.  For example,
``dask`` is able to avoid memory copies for Numpy arrays and
built-in containers thereof (such as lists or dicts containing Numpy
arrays), but if a large Numpy array is an attribute of a user-defined
object, ``dask`` will serialize the user-defined object as a pickle
stream, leading to memory copies.

The common theme of these third-party serialization efforts is to generate
a stream of object metadata (which contains pickle-like information about
the objects being serialized) and a separate stream of zero-copy buffer
objects for the payloads of large objects.  Note that, in this scheme,
small objects such as ints, etc. can be dumped together with the metadata
stream.  Refinements can include opportunistic compression of large data
depending on its type and layout, like ``dask`` does.

This PEP aims to make ``pickle`` usable in a way where large data is handled
as a separate stream of zero-copy buffers, letting the application handle
those buffers optimally.


Example
=======

To keep the example simple and avoid requiring knowledge of third-party
libraries, we will focus here on a bytearray object (but the issue is
conceptually the same with more sophisticated objects such as Numpy arrays).
Like most objects, the bytearray object isn't immediately understood by
the pickle module and must therefore specify its decomposition scheme.
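
To make "decomposition scheme" concrete, here is a minimal sketch using a hypothetical `Blob` class (not part of the PEP). It shows the shape of what `__reduce_ex__` returns, a callable plus its arguments, and what the unpickler morally does with it.

```python
class Blob:
    """Hypothetical wrapper class, used only for illustration."""

    def __init__(self, data):
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        # Return (callable, args): the unpickler will call callable(*args).
        # Note that bytes(self.data) makes a copy of the payload.
        return type(self), (bytes(self.data),)


blob = Blob(b"abc")

# What pickle stores, conceptually: a reference to the callable and its args.
func, args = blob.__reduce_ex__(4)

# What unpickling does, conceptually: call the callable on the args.
clone = func(*args)
```

The copy made by `bytes(self.data)` is exactly the kind of spurious memory copy the PEP is concerned with.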

Here is how a bytearray object currently decomposes for pickling::

   >>> b = bytearray(b'abc')
   >>> b.__reduce_ex__(4)
   (<class 'bytearray'>, (b'abc',), None)

This is because the ``bytearray.__reduce_ex__`` implementation reads
morally as follows::

   class bytearray:

      def __reduce_ex__(self, protocol):
         if protocol == 4:
            # (callable, args tuple, state)
            return type(self), (bytes(self),), None
         # Legacy code for earlier protocols omitted

In turn it produces the following pickle code::

   >>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4)))
       0: \x80 PROTO      4
       2: \x95 FRAME      30
      11: \x8c SHORT_BINUNICODE 'builtins'
      21: \x8c SHORT_BINUNICODE 'bytearray'
      32: \x93 STACK_GLOBAL
      33: C    SHORT_BINBYTES b'abc'
      38: \x85 TUPLE1
  

Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim]
> I have a hard time imagining how that could have come to be, but if it's
> true I'd say the unoptimized code was plain wrong.  The dumbest
> possible way to implement `f() and g()` is also the correct ;-) way:
>
>     result = f()
>     if not bool(result):
>         result = g()

Heh - that's entirely wrong, isn't it?  That's how `or` is implemented ;-)

Same top-level point, though:

    result = f()
    if bool(result):
        result = g()
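
For anyone who wants to check the two patterns side by side, here is a small runnable sketch. The helper names `and_` and `or_` are made up for illustration; they take zero-argument callables so that the second operand is only evaluated when needed.

```python
def and_(f, g):
    """Mimic `f() and g()`: evaluate g only if f's result is truthy."""
    result = f()
    if bool(result):
        result = g()
    return result


def or_(f, g):
    """Mimic `f() or g()`: evaluate g only if f's result is falsy."""
    result = f()
    if not bool(result):
        result = g()
    return result


# Short-circuit check: the second callable must not run here.
calls = []
first = and_(lambda: 0, lambda: calls.append("g ran"))
second = or_(lambda: "x", lambda: calls.append("g ran"))
```

After running this, `first` is `0`, `second` is `"x"`, and `calls` is still empty, matching the behaviour of the built-in operators.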


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Serhiy Storchaka ]
> ...
> This is not new. The optimizer already changes semantics.
> Non-optimized "if a and True:" would call bool(a) twice, but optimized code
> calls it only once.

I have a hard time imagining how that could have come to be, but if it's
true I'd say the unoptimized code was plain wrong.  The dumbest
possible way to implement `f() and g()` is also the correct ;-) way:

    result = f()
    if not bool(result):
        result = g()

For the thing you really care about here, the language guarantees `a`
will be evaluated before `b` in:

'{}{}'.format(a, b)

but I'm not sure it says anything about how the format operations are
interleaved.  So your proposed transformation is fine by me (your #3:
still evaluate `a` before `b` but ignore that the format operations
may occur in a different order with respect to those).


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Guido van Rossum
Hm, without thinking too much about it I'd say it's okay to change the
evaluation order. Can these optimizations be disabled with something like
-O0?

On Wed, Mar 28, 2018 at 8:27 AM, Serhiy Storchaka 
wrote:

> There is a subtle semantic difference between str.format() and
> "equivalent" f-string.
>
> '{}{}'.format(a, b)
> f'{a}{b}'
>
> In the former case b is evaluated before formatting a. This is equivalent
> to
>
> t1 = a
> t2 = b
> t3 = format(t1)
> t4 = format(t2)
> r = t3 + t4
>
> In the latter case a is formatted before evaluating b. This is equivalent
> to
>
> t1 = a
> t2 = format(t1)
> t3 = b
> t4 = format(t3)
> r = t2 + t4
>
> In most cases this doesn't matter, but when implementing the optimization
> that transforms the former expression into the latter one ([1], [2]) we
> have to decide what to do about this difference.
>
> 1. Keep the exact semantics of str.format() when optimizing it. This means
> that it should be transformed into an AST node different from the AST node
> used for f-strings. Either introduce a new AST node type, or add a boolean
> flag to JoinedStr.
>
> 2. Change the semantics of f-strings. Make them closer to the semantics of
> str.format(): evaluate all subexpressions first, then format them. This can
> be implemented in two ways:
>
> 2a) Add additional instructions for stack manipulations. This will slow
> down f-strings.
>
> 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and
> BUILD_STRING. This will speed up f-strings.
>
> 3. Transform str.format() into an f-string, changing the semantics, and
> ignore this change. This is not new. The optimizer already changes
> semantics. Non-optimized "if a and True:" would call bool(a) twice, but
> optimized code calls it only once.
>
> [1] https://bugs.python.org/issue28307
> [2] https://bugs.python.org/issue28308
>



-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Serhiy Storchaka
There is a subtle semantic difference between str.format() and 
"equivalent" f-string.


'{}{}'.format(a, b)
f'{a}{b}'

In the former case b is evaluated before formatting a. This is equivalent to

t1 = a
t2 = b
t3 = format(t1)
t4 = format(t2)
r = t3 + t4

In the latter case a is formatted before evaluating b. This is equivalent to

t1 = a
t2 = format(t1)
t3 = b
t4 = format(t3)
r = t2 + t4

In most cases this doesn't matter, but when implementing the optimization
that transforms the former expression into the latter one ([1], [2])
we have to decide what to do about this difference.
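
The ordering difference described above can be observed directly. The following is a minimal sketch, where `Traced` and `ev` are hypothetical helpers introduced only to record when a subexpression is evaluated and when its value is formatted.

```python
log = []


class Traced:
    """Hypothetical helper: records when it is formatted."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        log.append("format " + self.name)
        return self.name


def ev(name):
    """Stand-in for evaluating a subexpression; records the evaluation."""
    log.append("eval " + name)
    return Traced(name)


def trace_format():
    log.clear()
    text = '{}{}'.format(ev('a'), ev('b'))
    return text, list(log)


def trace_fstring():
    log.clear()
    text = f"{ev('a')}{ev('b')}"
    return text, list(log)


format_text, format_order = trace_format()
fstring_text, fstring_order = trace_fstring()
# format_order:  ['eval a', 'eval b', 'format a', 'format b']
# fstring_order: ['eval a', 'format a', 'eval b', 'format b']
```

Both calls produce the string `'ab'`; only the interleaving of evaluation and formatting differs, which matters when `__format__` or the subexpressions have side effects.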


1. Keep the exact semantics of str.format() when optimizing it. This means
that it should be transformed into an AST node different from the AST node
used for f-strings. Either introduce a new AST node type, or add a
boolean flag to JoinedStr.


2. Change the semantics of f-strings. Make them closer to the semantics of
str.format(): evaluate all subexpressions first, then format them. This
can be implemented in two ways:


2a) Add additional instructions for stack manipulations. This will slow 
down f-strings.


2b) Introduce a new complex opcode that will replace FORMAT_VALUE and 
BUILD_STRING. This will speed up f-strings.


3. Transform str.format() into an f-string, changing the semantics, and
ignore this change. This is not new. The optimizer already changes
semantics. Non-optimized "if a and True:" would call bool(a) twice, but
optimized code calls it only once.


[1] https://bugs.python.org/issue28307
[2] https://bugs.python.org/issue28308
