Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-04-12 Thread Augie Fackler

On Mar 29, 2014, at 2:53 PM, Gregory P. Smith g...@krypto.org wrote:

 
 On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Thu, 27 Mar 2014 18:47:59 +
 Brett Cannon bcan...@gmail.com wrote:
  On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:
 
   Much better, but I'm still not happy with including %s at all. Otherwise
   it's accept-worthy. (How's that for pressure. :-)
  
 
  But if we only add %b and leave out %s then how is this going to lead to
  Python 2/3 compatible code since %b is not in Python 2? Or am I
  misunderstanding you?
 
 I think we have reached a point where adding porting-related facilities
 in 3.5 may actually slow down the pace of porting, rather than
 accelerate it (because people will then wait for 3.5 to start porting
 stuff).
 
 I understand that sentiment but that is an unjustified fear. It is not a good 
 reason not to do it. Projects are already trying to port stuff today and 
 running into roadblocks when it comes to ascii-compatible bytes formatting 
 for real world data formats in code needing to be 2.x compatible. I'm pulling 
 out my practicality beats purity card here.
 
 Mercurial is one of the large Python 2.4-2.7 code bases that needs this 
 feature in order to support Python 3 in a sane manner. (+Augie Fackler to 
 look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm 
 usefulness)

That looks sufficient to me - the biggest thing is being able to do 

"abort: %s is broken" % some_filename_that_is_bytes

and have that work sanely, as well as the numerics. This looks like exactly 
what we need, but I'd love to test it soon (I'm happy to build a 3.5 from tip 
for testing) so that if it's not Right[0] changes can be made before it's 
permanent. Feel encouraged to CC me on patches or something for testing (or 
mail me directly when it lands).
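
A minimal sketch of the case described above, assuming Python 3.5 or later (the release PEP 461 targets). The filename value and the `%d` line are hypothetical illustrations, not taken from Mercurial:

```python
# Assumes Python 3.5+; the values below are illustrative only.
# A filename as raw bytes, deliberately not valid UTF-8.
some_filename_that_is_bytes = b"data/\xff\xfename.txt"

# Bytes are interpolated as-is, with no decoding or encoding step.
message = b"abort: %s is broken" % some_filename_that_is_bytes

# "The numerics": integer codes produce ASCII digits.
summary = b"%d files changed" % 3
```

The result stays a bytes object throughout, so an undecodable filename passes through unchanged.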

Thanks!

AF

 
 -gps


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-04-12 Thread Terry Reedy

On 4/12/2014 11:08 AM, Augie Fackler wrote:


On Mar 29, 2014, at 2:53 PM, Gregory P. Smith g...@krypto.org wrote:



On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote:

On Thu, 27 Mar 2014 18:47:59 +
Brett Cannon bcan...@gmail.com wrote:
 On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:

  Much better, but I'm still not happy with including %s at all. Otherwise
  it's accept-worthy. (How's that for pressure. :-)

 But if we only add %b and leave out %s then how is this going to lead to
 Python 2/3 compatible code since %b is not in Python 2? Or am I
 misunderstanding you?

I think we have reached a point where adding porting-related facilities
in 3.5 may actually slow down the pace of porting, rather than
accelerate it (because people will then wait for 3.5 to start porting
stuff).


I understand that sentiment but that is an unjustified fear. It is not
a good reason not to do it. Projects are already trying to port stuff
today and running into roadblocks when it comes to ascii-compatible
bytes formatting for real world data formats in code needing to be 2.x
compatible. I'm pulling out my practicality beats purity card here.

Mercurial is one of the large Python 2.4-2.7 code bases that needs
this feature in order to support Python 3 in a sane manner. (+Augie
Fackler to look at the latest
http://legacy.python.org/dev/peps/pep-0461/ to confirm usefulness)


That looks sufficient to me - the biggest thing is being able to do

"abort: %s is broken" % some_filename_that_is_bytes

and have that work sanely, as well as the numerics. This looks like
exactly what we need, but I'd love to test it soon (I'm happy to build a
3.5 from tip for testing) so that if it's not Right[0] changes can be
made before it's permanent. Feel encouraged to CC me on patches or
something for testing (or mail me directly when it lands).


Add yourself as nosy to http://bugs.python.org/issue20284
patch to implement PEP 461 (%-interpolation for bytes)

Indeed, you could help test the latest version, and others as posted.

--
Terry Jan Reedy



Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Gregory P. Smith
On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote:

 On Thu, 27 Mar 2014 18:47:59 +
 Brett Cannon bcan...@gmail.com wrote:
  On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:
 
   Much better, but I'm still not happy with including %s at all. Otherwise
   it's accept-worthy. (How's that for pressure. :-)
  
 
  But if we only add %b and leave out %s then how is this going to lead to
  Python 2/3 compatible code since %b is not in Python 2? Or am I
  misunderstanding you?

 I think we have reached a point where adding porting-related facilities
 in 3.5 may actually slow down the pace of porting, rather than
 accelerate it (because people will then wait for 3.5 to start porting
 stuff).


I understand that sentiment but that is an unjustified fear. It is not a
good reason not to do it. Projects are already trying to port stuff today
and running into roadblocks when it comes to ascii-compatible bytes
formatting for real world data formats in code needing to be 2.x
compatible. I'm pulling out my practicality beats purity card here.

Mercurial is one of the large Python 2.4-2.7 code bases that needs this
feature in order to support Python 3 in a sane manner. (+Augie Fackler to
look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm
usefulness)

-gps


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Antoine Pitrou
On Sat, 29 Mar 2014 11:53:45 -0700
Gregory P. Smith g...@krypto.org wrote:
 
 I understand that sentiment but that is an unjustified fear. It is not a
 good reason not to do it. Projects are already trying to port stuff today
 and running into roadblocks when it comes to ascii-compatible bytes
 formatting for real world data formats in code needing to be 2.x
 compatible. I'm pulling out my practicality beats purity card here.

"Roadblocks" is an unjustified term here. Important code bases such as
Tornado have already achieved this a long time ago. While lack of bytes
formatting does make porting harder, it is not a roadblock as in
"you can't work around it".

 Mercurial is one of the large Python 2.4-2.7 code bases that needs this
 feature in order to support Python 3 in a sane manner. (+Augie Fackler to
 look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm
 usefulness)

http://www.selenic.com/pipermail/mercurial-devel/2014-March/057474.html

Regards

Antoine.


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Gregory P. Smith
On Thu, Mar 27, 2014 at 3:47 AM, Victor Stinner victor.stin...@gmail.com wrote:

 The PEP 461 looks good to me. It's a nice addition to Python 3.5 and
 the PEP is well defined.


+1


 I can help to implement it. Maybe, it would be nice to provide an
 implementation as a third-party module on PyPI for Python
 2.6-3.4.


That is possible and would enable bytes formatting on earlier 3.x versions.
I'm not sure if there is any value in backporting to 2.x as those already
have such formatting with Python 2's str.__mod__ % operator.

Though I don't know what it'd look like as an API as a module.
Brainstorming: It'd either involve function calls to format instead of % or
a container class to wrap format strings in with a __mod__ method that
calls the bytes formatting code instead of native str % formatting when
needed.

From a 2.x-3.x compatible code standpoint the above could exist but the
container class constructor would be a no-op on Python 2.
  if sys.version_info[0] == 2:
      BytesFormatter = str
  else:
      class BytesFormatter: ... def __mod__ ...
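
One way the Python 3 branch of that brainstorm might look, as a rough sketch only. The class body, the `_SPEC` pattern, and the decision to support just `%%`, `%s`/`%b` (bytes-like arguments) and `%d` (integers) are all assumptions made for illustration; this is not the PEP's implementation and not an existing PyPI module:

```python
import re

# Hypothetical, deliberately tiny subset of format codes: %%, %s, %b, %d.
_SPEC = re.compile(br"%([%sbd])")


class BytesFormatter(object):
    """Illustrative wrapper whose % operator formats bytes templates."""

    def __init__(self, template):
        self.template = template

    def __mod__(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        values = iter(args)

        def substitute(match):
            code = match.group(1)
            if code == b"%":
                return b"%"
            value = next(values)
            if code == b"d":
                if not isinstance(value, int):
                    raise TypeError("%d requires an integer")
                return ("%d" % value).encode("ascii")
            # %s and %b: accept bytes-like input only, never str or numbers.
            if not isinstance(value, (bytes, bytearray)):
                raise TypeError("%s/%b require a bytes-like object")
            return bytes(value)

        return _SPEC.sub(substitute, self.template)


formatted = BytesFormatter(b"abort: %s is broken (%d%%)") % (b"f.txt", 3)
```

A real backport would also need width, precision and flag handling, the remaining numeric codes, and a check for unused arguments, none of which are attempted here.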

-gps


 Note: I fixed a typo in your PEP (reST syntax).

 Victor

 2014-03-26 23:47 GMT+01:00 Ethan Furman et...@stoneleaf.us:
  This one is wrong:
 
  repr(b'abc').encode('ascii', 'backslashreplace')
 
  b"b'abc'"
 
 
  Fixed, thanks.



Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Glenn Linderman

On 3/29/2014 12:01 PM, Gregory P. Smith wrote:
From a 2.x-3.x compatible code standpoint the above could exist but 
the container class constructor would be a no-op on Python 2.

  if sys.version_info[0] == 2:
      BytesFormatter = str
  else:
      class BytesFormatter: ... def __mod__ ...


If done as a container class, the Python 2 version should implement the 
same restrictions on %s for numerics, and implement %b.


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Ethan Furman

On 03/29/2014 11:59 AM, Antoine Pitrou wrote:

On Sat, 29 Mar 2014 11:53:45 -0700 Gregory P. Smith wrote:


I understand that sentiment but that is an unjustified fear. It is not a
good reason not to do it. Projects are already trying to port stuff today
and running into roadblocks when it comes to ascii-compatible bytes
formatting for real world data formats in code needing to be 2.x
compatible. I'm pulling out my practicality beats purity card here.


Roadblocks is an unjustified term here.


It's actually quite appropriate:  to get around a physical roadblock you would have to leave the road, forge through 
lumpy ground and stinging nettles, and then get back on the road.


A very good analogy, actually.  ;)

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Nick Coghlan
On 30 March 2014 07:01, Ethan Furman et...@stoneleaf.us wrote:
 On 03/29/2014 11:59 AM, Antoine Pitrou wrote:

 On Sat, 29 Mar 2014 11:53:45 -0700 Gregory P. Smith wrote:


 I understand that sentiment but that is an unjustified fear. It is not a
 good reason not to do it. Projects are already trying to port stuff today
 and running into roadblocks when it comes to ascii-compatible bytes
 formatting for real world data formats in code needing to be 2.x
 compatible. I'm pulling out my practicality beats purity card here.


 Roadblocks is an unjustified term here.


 It's actually quite appropriate:  to get around a physical roadblock you
 would have to leave the road, forge through lumpy ground and stinging
 nettles, and then get back on the road.

 A very good analogy, actually.  ;)

I tend to call them "barriers to migration". Up to Python 3.4, my
focus has been more on general "barriers to entry" for Python 3 that
applied as much or more to new users as they did to existing ones -
hence working on getting pip incorporated, providing a better path to
mastery for the codec system, helping Larry with Argument Clinic,
helping Eric with the simpler import customisation, trying to help
improve the integration with the POSIX text model, assorted tweaks to
make the type system more accessible etc.

I think Python 3.4 is now in a pretty good place on that front,
particularly with Larry stating up front that he considers the ongoing
rollout of Argument Clinic usage to be in scope for Python 3.4.x
maintenance releases.

So for 3.5, I think it makes sense to focus on those barriers to
migration and other activities that benefit existing Python 2 users
more so than users that are completely new to Python and starting
directly with Python 3. Binary interpolation is a big one (thanks
Ethan!), as is the proposed policy change to allow network security
features to evolve within Python 2.7 maintenance releases.

Our community has done a lot of work to support us in our goal of
modernising and migrating a large fraction of the ecosystem to a new
version of the language, even though the full implications of the
revised models for binary and text data turned out to be more profound
than I think any of us realised back in 2006 when Guido first turned
the previously hypothetical Py3k into a genuine active effort to
create a new revision of the language, better suited to the global
nature of the 21st century.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Terry Reedy



On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote:



I think we have reached a point where adding porting-related facilities


AFAIK, the only porting-specific feature is %s as a synonym for %b. Not 
pretty, but tolerable. Otherwise, I have the impression that the PEP 
pretty much stands on its own.



in 3.5 may actually slow down the pace of porting, rather than
accelerate it (because people will then wait for 3.5 to start porting
stuff).


Or, they should download the source and compile and continue or start 
porting as soon as the bytes % is added. Having earlier Windows and Mac 
preview binaries might help a tiny bit.


If you are saying that Py3 development should not be driven by Py2 
concerns, I agree.


--
Terry Jan Reedy



Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Nick Coghlan
On 30 March 2014 05:01, Gregory P. Smith g...@krypto.org wrote:



 On Thu, Mar 27, 2014 at 3:47 AM, Victor Stinner victor.stin...@gmail.com
 wrote:

 The PEP 461 looks good to me. It's a nice addition to Python 3.5 and
 the PEP is well defined.


 +1


 I can help to implement it. Maybe, it would be nice to provide an
 implementation as a third-party module on PyPI for Python
 2.6-3.4.


 That is possible and would enable bytes formatting on earlier 3.x versions.
 I'm not sure if there is any value in backporting to 2.x as those already
 have such formatting with Python 2's str.__mod__ % operator.

 Though I don't know what it'd look like as an API as a module.
 Brainstorming: It'd either involve function calls to format instead of % or
 a container class to wrap format strings in with a __mod__ method that calls
 the bytes formatting code instead of native str % formatting when needed.

The future project already contains a full backport of a true bytes
type, rather than relying on Python 2 str objects:
http://python-future.org/what_else.html#bytes

It seems to me that the easiest way to make any forthcoming Python 3.5
enhancements (both binary interpolation and the other cleanups we are
discussing over on Python ideas) available to single source 2/3 code
bases is to commit to an API freeze for *those particular builtins*
early, and then update future accordingly.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-29 Thread Terry Reedy

On 3/29/2014 8:28 PM, Nick Coghlan wrote:


The future project already contains a full backport of a true bytes
type, rather than relying on Python 2 str objects:
http://python-future.org/what_else.html#bytes


That project looks really nice!


It seems to me that the easiest way to make any forthcoming Python 3.5
enhancements (both binary interpolation and the other cleanups we are
discussing over on Python ideas) available to single source 2/3 code
bases is to commit to an API freeze for *those particular builtins*
early, and then update future accordingly.


I agree. I think syntax changes should be in by the first alpha, if not 
before.


--
Terry Jan Reedy



Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-28 Thread Ethan Furman

On 03/27/2014 04:26 AM, Nick Coghlan wrote:

On 27 March 2014 20:47, Victor Stinner victor.stin...@gmail.com wrote:

The PEP 461 looks good to me. It's a nice addition to Python 3.5 and
the PEP is well defined.


+1 from me as well. One minor request is that I don't think the
rationale for rejecting numbers from %s is complete [...]


Changed to
-
In particular, ``%s`` will not accept numbers nor ``str``.  ``str`` is rejected
as the string to bytes conversion requires an encoding, and we are refusing to
guess; numbers are rejected because:

  - what makes a number is fuzzy (float? Decimal? Fraction? some user type?)

  - allowing numbers would lead to ambiguity between numbers and textual
representations of numbers (3.14 vs '3.14')

  - given the nature of wire formats, explicit is definitely better than 
implicit
-
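
A short sketch of the rule stated in that paragraph, assuming Python 3.5 or later. The `try_format` helper is hypothetical and exists only to make the accepted and rejected cases easy to compare:

```python
# Assumes Python 3.5+; try_format is an illustrative helper, not part of the PEP.
def try_format(template, value):
    """Return the formatted bytes, or the exception type name if rejected."""
    try:
        return template % value
    except TypeError as exc:
        return type(exc).__name__


accepted = try_format(b"%s", b"3.14")      # bytes pass straight through
rejected_str = try_format(b"%s", "3.14")   # str would need an encoding
rejected_num = try_format(b"%s", 3.14)     # numbers need a numeric code
explicit_num = try_format(b"%.2f", 3.14)   # explicit numeric code works
```

Both the number `3.14` and the text `'3.14'` are refused by `%s`, so the caller has to state which representation is wanted.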


Note: I fixed a typo in your PEP (reST syntax).


I also committed a couple of markup tweaks, since it seemed easier to
just fix them than explain what was broken.


Thanks to both of you for that.


However, there are also
two dead footnotes (4 & 5), which I have left alone - I'm not sure if
the problem is a missing reference, or if the footnote can go away
now.


Fixed.

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Victor Stinner
The PEP 461 looks good to me. It's a nice addition to Python 3.5 and
the PEP is well defined.

I can help to implement it. Maybe, it would be nice to provide an
implementation as a third-party module on PyPI for Python
2.6-3.4.

Note: I fixed a typo in your PEP (reST syntax).

Victor

2014-03-26 23:47 GMT+01:00 Ethan Furman et...@stoneleaf.us:
 This one is wrong:

 repr(b'abc').encode('ascii', 'backslashreplace')

 b"b'abc'"


 Fixed, thanks.


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Antoine Pitrou
On Tue, 25 Mar 2014 15:37:11 -0700
Ethan Furman et...@stoneleaf.us wrote:
 
 ``%a`` will call ``ascii()`` on the interpolated value.  This is intended
 as a debugging aid, rather than something that should be used in production.
 Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
 representation.  Use cases include developing a new protocol and writing
 landmarks into the stream; debugging data going into an existing protocol
 to see if the problem is the protocol itself or bad data; a fall-back for a
 serialization format; or even a rudimentary serialization format when
 defining ``__bytes__`` would not be appropriate [8].

The use cases you are enumerating for %a are chimeric. Did you
*actually* do those things in real life, or are you inventing them for
the PEP?

Regards

Antoine.




Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Nick Coghlan
On 27 March 2014 20:47, Victor Stinner victor.stin...@gmail.com wrote:
 The PEP 461 looks good to me. It's a nice addition to Python 3.5 and
 the PEP is well defined.

+1 from me as well. One minor request is that I don't think the
rationale for rejecting numbers from %s is complete - IIRC, the
problem there is that the normal path for handling those is the
coercion via str() and this proposal deliberately *doesn't* allow that
path. That means supporting numbers would mean writing a lot of
*additional* code, and that isn't needed since 2/3 compatible code can
just be adjusted to use an appropriate numeric code.
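
As a hedged illustration of "use an appropriate numeric code", assuming Python 3.5 or later on the Python 3 side. The header name and the values are made up for the example:

```python
# Assumes Python 3.5+ (on Python 2, a b"..." literal is a plain str and the
# same lines work unchanged). All values are illustrative.
length = 42

header = b"Content-Length: %d\r\n" % length   # decimal integer
chunk_size = b"%x\r\n" % 255                  # hexadecimal integer
ratio = b"%.3f" % 0.5                         # fixed-point float
```

Each numeric code emits ASCII digits directly, so no intermediate `str()` coercion is involved.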

 Note: I fixed a typo in your PEP (reST syntax).

I also committed a couple of markup tweaks, since it seemed easier to
just fix them than explain what was broken. However, there are also
two dead footnotes (4 & 5), which I have left alone - I'm not sure if
the problem is a missing reference, or if the footnote can go away
now.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Nick Coghlan
On 27 March 2014 21:24, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 25 Mar 2014 15:37:11 -0700
 Ethan Furman et...@stoneleaf.us wrote:

 ``%a`` will call ``ascii()`` on the interpolated value.  This is intended
 as a debugging aid, rather than something that should be used in production.
 Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
 representation.  Use cases include developing a new protocol and writing
 landmarks into the stream; debugging data going into an existing protocol
 to see if the problem is the protocol itself or bad data; a fall-back for a
 serialization format; or even a rudimentary serialization format when
 defining ``__bytes__`` would not be appropriate [8].

 The use cases you are enumerating for %a are chimeric. Did you
 *actually* do those things in real life, or are you inventing them for
 the PEP?

I'm the one that raised the "discourage misuse of __bytes__" concern,
so I'd like %a to stay in at least for that reason. %a is a perfectly
well defined format code (albeit one you'd only be likely to use while
messing about with serialisation protocols, as the PEP describes - for
example, if a %b code was ending up producing wrong data, you might
switch to %a temporarily to get a better idea of where the bad data
was coming from), while using __bytes__ to make %s behave the way %a
is defined in the PEP would just be wrong in most cases. I consider %a
the preemptive PEP 308 of binary interpolation format codes - in the
absence of %a, I'm certain that users would end up abusing __bytes__
and %s to get the same effect, just as they used the known bug magnet
that was the and/or hack for a long time in the absence of PEP 308.
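
To make the distinction concrete, here is a small sketch assuming Python 3.5 or later. The `Packet` class and its wire layout are hypothetical:

```python
# Assumes Python 3.5+; Packet is an illustrative type, not from the PEP.
class Packet:
    """A type with a genuine wire representation and a separate debug repr."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        # The real serialisation: this is what belongs on the wire.
        return b"PKT:" + self.payload

    def __repr__(self):
        # The human-readable form: this is what %a falls back on.
        return "Packet(payload=%r)" % (self.payload,)


pkt = Packet(b"\x00\x01")
wire = b"%b" % pkt    # uses __bytes__
debug = b"%a" % pkt   # uses ascii(), i.e. the repr with non-ASCII escaped
```

With `%a` available, `__bytes__` can stay reserved for the wire format and does not have to double as a debugging hook.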

I also seem to recall Guido saying he liked it, which flipped the
discussion from "do we have a good rationale for including it?" to "do
we have a good rationale for the BDFL to ignore his instincts?".
However, it would be up to Guido to confirm that recollection, and if
"Guido likes it" is part of the reason for inclusion of the %a code,
the PEP should mention that explicitly.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread R. David Murray
On Thu, 27 Mar 2014 12:24:49 +0100, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 25 Mar 2014 15:37:11 -0700
 Ethan Furman et...@stoneleaf.us wrote:
  
  ``%a`` will call ``ascii()`` on the interpolated value.  This is intended
  as a debugging aid, rather than something that should be used in production.
  Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
  representation.  Use cases include developing a new protocol and writing
  landmarks into the stream; debugging data going into an existing protocol
  to see if the problem is the protocol itself or bad data; a fall-back for a
  serialization format; or even a rudimentary serialization format when
  defining ``__bytes__`` would not be appropriate [8].
 
 The use cases you are enumerating for %a are chimeric. Did you
 *actually* do those things in real life, or are you inventing them for
 the PEP?

The use cases came from someone else (Jim Jewett?) so you should
be asking him, not Ethan :)

As for the "did you actually do those things in real life", I know I've
done the "dump the repr into the data (protocol) stream to see what
I've really got here" debug trick in the string context, so I have no
doubt that I will want to do it in the bytes context as well.  In fact,
it is probably somewhat more likely in the bytes context, since I know
I've been in situations with data exchange protocols where I couldn't
get console output and setting up logging was much more painful than
just dumping the debug data into the data stream.  Or where doing
so made it much clearer what was going on than separate logging would.
I've done the 'landmark' thing as well, in the string context; that can be
very useful when doing incremental test driven development.  (Granted, you
could do that with __bytes__; you might well be writing a __bytes__
method anyway as the next step, but it *is* more overhead/boilerplate than
just starting with %a...and it gets people used to reaching for __bytes__
for the wrong purpose, which is Nick's concern).  In theory I can see
using %a for serialization in certain limited contexts (I've done that
with string repr in private utility scripts), but in practice I doubt
that would happen in a binary context, since those are much more likely
to be actually going over a wire of some sort (ie: places you really
don't want to use eval even when it would work).
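
A rough sketch of the "landmark" trick in a bytes context, assuming Python 3.5 or later. The framing (`HDR`, `#LANDMARK`, `END`) and the record contents are invented for the example, and `io.BytesIO` stands in for a real connection:

```python
import io

# Assumes Python 3.5+; the stream layout below is made up for illustration.
stream = io.BytesIO()
record = ("caf\u00e9", 7)   # some in-memory value we want to inspect

stream.write(b"HDR\n")
# Dump the repr of the value into the stream itself as a debugging landmark.
# The one-element tuple keeps the tuple-valued record as a single argument.
stream.write(b"#LANDMARK record=%a\n" % (record,))
stream.write(b"END\n")

dumped = stream.getvalue()
```

Because `%a` escapes non-ASCII characters, the landmark stays plain ASCII whatever the value contains.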

So yeah, I think %a has *practical* utility.

--David


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 04:24 AM, Antoine Pitrou wrote:

On Tue, 25 Mar 2014 15:37:11 -0700 Ethan Furman wrote:


``%a`` will call ``ascii()`` on the interpolated value.  This is intended
as a debugging aid, rather than something that should be used in production.
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
representation.  Use cases include developing a new protocol and writing
landmarks into the stream; debugging data going into an existing protocol
to see if the problem is the protocol itself or bad data; a fall-back for a
serialization format; or even a rudimentary serialization format when
defining ``__bytes__`` would not be appropriate [8].


The use cases you are enumerating for %a are chimeric.


Cool word!  Haven't seen it in a long time.  :)


Did you *actually* do those things in real life, or are you inventing them
for the PEP?


The examples came from Jim Jewett, but I can easily see myself using them.

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 04:42 AM, Nick Coghlan wrote:


I also seem to recall Guido saying he liked it [%a], which flipped the
discussion from "do we have a good rationale for including it?" to "do
we have a good rationale for the BDFL to ignore his instincts?".
However, it would be up to Guido to confirm that recollection, and if
"Guido likes it" is part of the reason for inclusion of the %a code,
the PEP should mention that explicitly.


I checked Guido's posts (Subject contains PEP 461, From contains guido) and did 
not see anything to that effect.

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
Actually, I had ignored this discussion for so long that I was surprised by
the outcome. My main use case isn't printing a number that may already be a
string (I understand why that isn't reasonable when the output is expected
to be bytes); it's printing a usually numeric value that may sometimes be
None. It's a little surprising to have to use %a for this, but I guess I
can live with it.
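
A minimal sketch of that use case, assuming Python 3.5 or later. The `render_count` helper and the `count=` label are hypothetical:

```python
# Assumes Python 3.5+; render_count is an illustrative helper.
def render_count(count):
    """Format a value that is usually an int but may sometimes be None."""
    return b"count=%a" % (count,)


with_number = render_count(12)
with_none = render_count(None)

# For comparison: a numeric code refuses None outright.
try:
    b"count=%d" % (None,)
    numeric_code_accepts_none = True
except TypeError:
    numeric_code_accepts_none = False
```

`%a` goes through `ascii()`, so it accepts either value, whereas `%d` only accepts numbers.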


On Thu, Mar 27, 2014 at 8:58 AM, Ethan Furman et...@stoneleaf.us wrote:

 On 03/27/2014 04:42 AM, Nick Coghlan wrote:


 I also seem to recall Guido saying he liked it [%a], which flipped the
 discussion from "do we have a good rationale for including it?" to "do
 we have a good rationale for the BDFL to ignore his instincts?".
 However, it would be up to Guido to confirm that recollection, and if
 "Guido likes it" is part of the reason for inclusion of the %a code,
 the PEP should mention that explicitly.


 I checked Guido's posts (Subject contains PEP 461, From contains guido)
 and did not see anything to that effect.

 --
 ~Ethan~





-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
I also don't understand why we can't use %b instead of %s. AFAIK %b
currently doesn't mean anything and I somehow don't expect we're likely to
add it for other reasons (unless there's a proposal I'm missing?). Just
like we use %a instead of %r to remind people that it's not quite the same
(since it applies .encode('ascii', 'backslashreplace')), shouldn't we use
anything *but* %s to remind people that that is also not the same (not at
all, in fact)? The PEP's argument against %b (rejected as not adding any
value either in clarity or simplicity) is hardly a good reason.
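
As a rough illustration of why %a is "not quite the same" as %r, the sketch below (assuming Python 3.5 or later) compares `b'%a'` with an explicit `ascii()` call followed by ASCII encoding. The variable names are made up for the example:

```python
text = 'caf\u00e9'  # a str containing one non-ASCII character

# %a on bytes: the repr-like form, with non-ASCII characters escaped.
via_percent_a = b'%a' % text

# The same result spelled out by hand: ascii() then encode to ASCII.
via_ascii = ascii(text).encode('ascii', 'backslashreplace')
```

Both results include the surrounding quotes and the `\xe9` escape, so the output is a description of the string rather than its encoded bytes.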






-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread MRAB

On 2014-03-27 15:58, Ethan Furman wrote:

On 03/27/2014 04:42 AM, Nick Coghlan wrote:


I also seem to recall Guido saying he liked it [%a], which flipped the
discussion from "do we have a good rationale for including it?" to "do
we have a good rationale for the BDFL to ignore his instincts?".
However, it would be up to Guido to confirm that recollection, and if
"Guido likes it" is part of the reason for inclusion of the %a code,
the PEP should mention that explicitly.


I checked Guido's posts (Subject contains PEP 461, From contains guido) and did 
not see anything to that effect.


Date: Mon, 13 Jan 2014 12:09:23 -0800
Subject: Re: [Python-Dev] PEP 460 reboot

If we have %b for strictly interpolating bytes, I'm fine with adding
%a for calling ascii() on the argument and then interpolating the
result after ASCII-encoding it.



Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
Wow. I'm pretty consistent. I still like that. :-)






-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 10:29 AM, Guido van Rossum wrote:


I also don't understand why we can't use %b instead of %s. AFAIK %b currently
doesn't mean anything and I somehow don't expect we're likely to add it for
other reasons (unless there's a proposal I'm missing?). Just like we use %a
instead of %r to remind people that it's not quite the same (since it applies
.encode('ascii', 'backslashreplace')), shouldn't we use anything *but* %s to
remind people that that is also not the same (not at all, in fact)? The PEP's
argument against %b ("rejected as not adding any value either in clarity or
simplicity") is hardly a good reason.


The biggest reason to use %s is to support a common code base for 2/3
endeavors.  The biggest reason to not include %b is that it means "binary
number" in format(); given that each type can invent its own mini-language,
this probably isn't a very strong argument against it.


I have moderate feelings for keeping %s as a synonym for %b for backwards
compatibility with Py2 code (when it's appropriate).


--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
On Thu, Mar 27, 2014 at 10:55 AM, Ethan Furman et...@stoneleaf.us wrote:

 On 03/27/2014 10:29 AM, Guido van Rossum wrote:


 I also don't understand why we can't use %b instead of %s. AFAIK %b
 currently doesn't mean anything and I somehow don't
 expect we're likely to add it for other reasons (unless there's a
 proposal I'm missing?). Just like we use %a instead of
 %r to remind people that it's not quite the same (since it applies
 .encode('ascii', 'backslashreplace')), shouldn't we
 use anything *but* %s to remind people that that is also not the same
 (not at all, in fact)? The PEP's argument against
 %b (rejected as not adding any value either in clarity or simplicity)
 is hardly a good reason.


 The biggest reason to use %s is to support a common code base for 2/3
 endeavors.


But it's mostly useless for that purpose. In Python 2, in practice %s
doesn't mean "string". It means "use the default formatting just as if I
was using print." And in theory it also means that -- in fact "call
__str__()" is the formal definition, and print is also defined as using
__str__, and this is all intentional. (I also intended __str__ to be
*mostly* the same as __repr__, with a specific exception for the str type
itself. In practice some frameworks have adopted a different
interpretation, making __repr__ produce something *more* user friendly
than __str__ but including newlines, because some people believe the main
use case for __repr__ is the interactive prompt. I believe this causes
problems for some *other* uses of __repr__, such as for producing an
unambiguous representation useful for e.g. logging -- but I don't want to
be too bitter about it. :-)
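
The difference between the two meanings of %s can be sketched as follows, assuming Python 3.5 or later. `Temperature` is a hypothetical class invented for this example:

```python
class Temperature:
    """Hypothetical value type that defines only __str__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return '%.1f C' % self.degrees


t = Temperature(21.5)

# On str, %s means "call __str__()", so any object is accepted.
as_text = '%s' % t

# On bytes, %s needs a bytes-like object or a __bytes__ method;
# Temperature has neither, so the interpolation fails.
try:
    b'%s' % t
    bytes_accepted = True
except TypeError:
    bytes_accepted = False
```

Here `as_text` is an ordinary string, while `bytes_accepted` ends up `False`.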

 The biggest reason to not include %b is that it means "binary number" in
 format(); given that each type can invent its own mini-language, this
 probably isn't a very strong argument against it.


Especially since I can't imagine the spelling in format() includes '%'.
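
For reference, a short sketch of the two unrelated meanings of "b" being discussed, assuming Python 3.5 or later. The variable names are illustrative only:

```python
# format() mini-language: presentation type 'b' renders an int in base 2.
binary_text = format(5, 'b')
also_binary = '{:b}'.format(5)

# printf-style formatting on bytes: %b interpolates raw bytes unchanged.
bytes_interp = b'%b' % b'\x05'
```

The format() spelling never involves a '%' character, so the two do not collide syntactically.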


 I have moderate feelings for keeping %s as a synonym for %b for backwards
 compatibility with Py2 code (when it's appropriate).


I think its mere existence (with the restrictions currently in the PEP)
would cause more confusion than that is worth.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 10:55 AM, Ethan Furman wrote:

On 03/27/2014 10:29 AM, Guido van Rossum wrote:


I also don't understand why we can't use %b instead of %s. AFAIK %b currently
doesn't mean anything and I somehow don't expect we're likely to add it for
other reasons (unless there's a proposal I'm missing?). Just like we use %a
instead of %r to remind people that it's not quite the same (since it applies
.encode('ascii', 'backslashreplace')), shouldn't we use anything *but* %s to
remind people that that is also not the same (not at all, in fact)? The PEP's
argument against %b ("rejected as not adding any value either in clarity or
simplicity") is hardly a good reason.


The biggest reason to use %s is to support a common code base for 2/3
endeavors.  The biggest reason to not include %b is that it means "binary
number" in format(); given that each type can invent its own mini-language,
this probably isn't a very strong argument against it.

I have moderate feelings for keeping %s as a synonym for %b for backwards
compatibility with Py2 code (when it's appropriate).


Changed to:
--
``%b`` will insert a series of bytes.  These bytes are collected in one of two
ways:

  - input type supports ``Py_buffer`` [4]_?
    use it to collect the necessary bytes

  - input type is something else?
    use its ``__bytes__`` method [5]_ ; if there isn't one, raise a
    ``TypeError``

In particular, ``%b`` will not accept numbers nor ``str``.  ``str`` is rejected
as the string to bytes conversion requires an encoding, and we are refusing to
guess; numbers are rejected because:

  - what makes a number is fuzzy (float? Decimal? Fraction? some user type?)

  - allowing numbers would lead to ambiguity between numbers and textual
    representations of numbers (3.14 vs '3.14')

  - given the nature of wire formats, explicit is definitely better than
    implicit

``%s`` is included as a synonym for ``%b`` for the sole purpose of making 2/3
code bases easier to maintain.  Python 3 only code should use ``%b``.

Examples::

    >>> b'%b' % b'abc'
    b'abc'

    >>> b'%b' % 'some string'.encode('utf8')
    b'some string'

    >>> b'%b' % 3.14
    Traceback (most recent call last):
    ...
    TypeError: b'%b' does not accept 'float'

    >>> b'%b' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: b'%b' does not accept 'str'
--
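
The two collection paths described in the proposed text can be sketched as below, assuming Python 3.5 or later. `Frame` and `rejected` are hypothetical names introduced for the example, and the sketch checks only the exception type, since the exact error wording may differ from the draft shown above:

```python
class Frame:
    """Hypothetical wire-format object that defines __bytes__."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b'<' + self.payload + b'>'


# Path 1: objects supporting the buffer protocol are used directly.
from_buffer = b'%b' % memoryview(b'abc')
from_bytearray = b'%b' % bytearray(b'abc')

# Path 2: anything else must provide __bytes__.
from_dunder = b'%b' % Frame(b'abc')


def rejected(value):
    """Return True if %b refuses to interpolate value."""
    try:
        b'%b' % value
    except TypeError:
        return True
    return False
```

Under these assumptions, `rejected(3.14)` and `rejected('hello world!')` are both true, matching the float and str cases in the examples.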

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
Much better, but I'm still not happy with including %s at all. Otherwise
it's accept-worthy. (How's that for pressure. :-)






-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Brett Cannon
On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:

 Much better, but I'm still not happy with including %s at all. Otherwise
 it's accept-worthy. (How's that for pressure. :-)


But if we only add %b and leave out %s then how is this going to lead to
Python 2/3 compatible code since %b is not in Python 2? Or am I
misunderstanding you?

-Brett






Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Daniel Holth
I feel not including %s is nuts. Should I write .replace('%b', '%s')?
All I desperately need are APIs that provide enough unicode / str type
safety that I get an exception when mixing them accidentally... in my
own code, dynamic typing is usually a bug. As has been endlessly
discussed, %s for bytes is a bit like exposing sprintf()...

On Thu, Mar 27, 2014 at 2:41 PM, Guido van Rossum gu...@python.org wrote:
 Much better, but I'm still not happy with including %s at all. Otherwise
 it's accept-worthy. (How's that for pressure. :-)




Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
So what's the use case for Python 2/3 compatible code? IMO the main use
case for the PEP is simply to be able to construct bytes from a combination
of a template and some input that may include further bytes and numbers.
E.g. in asyncio when you write an HTTP client or server you have to
construct bytes to write to the socket, and I'd be happy if I could write
b'HTTP/1.0 %d %b\r\n' % (status, message) rather than having to use
str(status).encode('ascii') and concatenation or join().
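
The comparison above can be made concrete with a small sketch, assuming Python 3.5 or later. Both function names are hypothetical:

```python
def status_line(status, message):
    """Build an HTTP/1.0 status line with bytes %-formatting."""
    # %d renders the integer as ASCII digits; %b inserts the bytes as-is.
    return b'HTTP/1.0 %d %b\r\n' % (status, message)


def status_line_manual(status, message):
    """The same line built with explicit encoding and join()."""
    parts = [b'HTTP/1.0', str(status).encode('ascii'), message]
    return b' '.join(parts) + b'\r\n'
```

For example, `status_line(200, b'OK')` and `status_line_manual(200, b'OK')` produce the same bytes.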


On Thu, Mar 27, 2014 at 11:47 AM, Brett Cannon bcan...@gmail.com wrote:



 On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org
 wrote:

 Much better, but I'm still not happy with including %s at all. Otherwise
 it's accept-worthy. (How's that for pressure. :-)


 But if we only add %b and leave out %s then how is this going to lead to
 Python 2/3 compatible code since %b is not in Python 2? Or am I
 misunderstanding you?

 -Brett








-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 11:24 AM, Guido van Rossum wrote:

On Thu, Mar 27, 2014 at 10:55 AM, Ethan Furman wrote:


The biggest reason to use %s is to support a common code base for 2/3 endeavors.


But it's mostly useless for that purpose. In Python 2, in practice %s doesn't mean
"string". [...]


In Python 2 if one is using 'str' as a 'bytes' container, and doing interpolation, %s is the only choice available for 
other 'bytes' (aka other 'str's).  Note that I'm happy to be proven wrong on this point.  :)
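
A sketch of the kind of shared 2/3 source this point is about. It assumes Python 2.6 or later (where the `b''` literal is an alias for `str`) or Python 3.5 or later (where PEP 461 applies). The function name and message are illustrative:

```python
def abort_message(filename):
    """Format an error line for a filename held as raw bytes."""
    # On Python 2 the template and argument are both byte-oriented str;
    # on Python 3.5+ they are both bytes. The source line is identical.
    return b'abort: %s is broken\n' % filename
```

Because the filename is never decoded, bytes that are not valid in any particular encoding pass through unchanged.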


--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
On Thu, Mar 27, 2014 at 11:52 AM, Daniel Holth dho...@gmail.com wrote:

 I feel not including %s is nuts. Should I write .replace('%b', '%s')?


I assume you meant .replace('%s', '%b') (unless you're converting Python 3
code to Python 2, which would mean you really are nuts :-).

But that's not going to help for the majority of code using %s -- as I am
trying to argue, %s doesn't mean "expect the argument to be a str" and
neither is that how it's commonly used (although it's *possible* that that
is how *you* use it exclusively -- that doesn't make you nuts, just more
strict than most people).


 All I desperately need are APIs that provide enough unicode / str type
 safety that I get an exception when mixing them accidentally... in my
 own code, dynamic typing is usually a bug. As has been endlessly
 discussed, %s for bytes is a bit like exposing sprintf()...


I don't understand that last claim (I can't figure out whether in this
context "exposing sprintf()" is considered good or bad). But apart from
that, can you give some specific examples?

PS. I am not trying to be difficult. I honestly don't understand the use
case yet, and the PEP doesn't do much to support it.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
On Thu, Mar 27, 2014 at 11:34 AM, Ethan Furman et...@stoneleaf.us wrote:

 On 03/27/2014 11:24 AM, Guido van Rossum wrote:

 On Thu, Mar 27, 2014 at 10:55 AM, Ethan Furman wrote:


 The biggest reason to use %s is to support a common code base for 2/3
 endeavors.


 But it's mostly useless for that purpose. In Python 2, in practice %s
 doesn't mean string. [...]


 In Python 2 if one is using 'str' as a 'bytes' container, and doing
 interpolation, %s is the only choice available for other 'bytes' (aka other
 'str's).  Note that I'm happy to be proven wrong on this point.  :)


That is true. And we can't change Python 2. I still have this idea in my
head that *most* cases where %s is used in Python 2 will break in Python 3
under the PEP's rules, but perhaps they are not the majority of situations
where the context is manipulating bytes. And I suppose that *very* few
internet protocols are designed to accept either an integer or the literal
string "None", so that use case (which I brought up) isn't very realistic --
in fact it may be better to raise an exception rather than sending a
protocol violation.

So, I think you have changed my mind. I still like the idea of promoting %b
in pure Python 3 code to emphasize that it really behaves very differently
from %s; but I now have peace with %s as an alias. (It might also benefit
cases where somehow there's a symmetry in some Python 3 code between bytes
and str.)
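
A small sketch of %s acting as an alias for %b, assuming Python 3.5 or later. The `accepts` helper is hypothetical and exists only to probe which arguments are allowed:

```python
payload = b'\x00\xffdata'

# Both conversions interpolate the bytes unchanged.
with_s = b'[%s]' % payload
with_b = b'[%b]' % payload


def accepts(template, value):
    """Return True if the bytes template can interpolate value."""
    try:
        template % (value,)
    except TypeError:
        return False
    return True
```

Under these rules `accepts(b'%s', None)` is false, so the "integer or None" case raises an exception instead of putting the text "None" on the wire.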

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Glenn Linderman

On 3/27/2014 11:59 AM, Guido van Rossum wrote:
On Thu, Mar 27, 2014 at 11:52 AM, Daniel Holth dho...@gmail.com wrote:


I feel not including %s is nuts. Should I write .replace('%b', '%s')?


I assume you meant .replace('%s', '%b') (unless you're converting 
Python 3 code to Python 2, which would mean you really are nuts :-).


But that's not going to help for the majority of code using %s -- as I 
am trying to argue, %s doesn't mean "expect the argument to be a str" 
and neither is that how it's commonly used (although it's *possible* 
that that is how *you* use it exclusively -- that doesn't make you 
nuts, just more strict than most people).


That _is_ how it is commonly used in Py2 when dealing with binary data 
in mixed ASCII/binary protocols, is what I've been hearing in this 
discussion, and what small use I've made of Py2 when some unported 
module forced me to use it (I started Python about the time Py3 was 
released)... the expected argument is a (Py2) str containing binary data 
(would be bytes in Py3).


While there are many other reasons to use %s in other coding situations, 
this is the only way to do bytes interpolations using %. And there is no 
%b in Py2, so for Py2/3 compatibility, %s needs to do bytes 
interpolations in Py3. And if it does, there is no need for %b in Py3 %, 
because they would be identical and redundant.



All I desperately need are APIs that provide enough unicode / str type
safety that I get an exception when mixing them accidentally... in my
own code, dynamic typing is usually a bug. As has been endlessly
discussed, %s for bytes is a bit like exposing sprintf()...


I don't understand that last claim (I can't figure out whether in this 
context "exposing sprintf()" is considered good or bad). But apart 
from that, can you give some specific examples?


PS. I am not trying to be difficult. I honestly don't understand the 
use case yet, and the PEP doesn't do much to support it.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Daniel Holth
On Thu, Mar 27, 2014 at 2:53 PM, Guido van Rossum gu...@python.org wrote:
> So what's the use case for Python 2/3 compatible code? IMO the main use case
> for the PEP is simply to be able to construct bytes from a combination of a
> template and some input that may include further bytes and numbers. E.g. in
> asyncio when you write an HTTP client or server you have to construct bytes
> to write to the socket, and I'd be happy if I could write
> b'HTTP/1.0 %d %b\r\n' % (status, message) rather than having to use
> str(status).encode('ascii') and concatenation or join().

It seems to be notoriously difficult to understand or explain why
Unicode can still be very hard in Python 3, or in code that is in the
middle of being ported or has to run in both interpreters. As far as I
can tell, part of it is when a symbol has type (str or bytes) depending
on the situation (written as if we had a static type system with union
types); part of it is that incorrect mixing can happen without an
exception, only to be discovered later and far away in space and time
from the error (worst of all in a serialized file); and part of it is
all of the not easily checkable types a particular Unicode object has,
depending on whether it contains surrogates or codes > n. Sometimes you
might simply disagree about whether an API should be returning bytes or
Unicode in mildly ambiguous cases like base64 encoding. Sometimes
Unicode is just intrinsically complicated.

For me this PEP holds the promise of being able to do work in the
bytes domain, with no accidental mixing ever, when I *really* want
bytes. For 2+3 I would get exceptions sometimes in Python 2 and
exceptions all the time in Python 3 for mistakes. I hope this is less
error prone in strict domains than, for example,
u"string processing".encode('latin1'). And I hope that there is very
little type (str or int) in HTTP, for example, or in other legitimate
bytes domains, but I don't know; I suspect that if you have a lot of
problems with bytes' %s then it's a clue you should use
(u"%s" % (argument)).encode() instead.

sprintf()'s version of %s just takes a char* and puts it in without
doing any type conversion, of course. IANACL (I am not a C lawyer).


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 11:53 AM, Guido van Rossum wrote:
> So what's the use case for Python 2/3 compatible code? IMO the main use case
> for the PEP is simply to be able to construct bytes from a combination of a
> template and some input that may include further bytes and numbers. E.g. in
> asyncio when you write an HTTP client or server you have to construct bytes
> to write to the socket, and I'd be happy if I could write
> b'HTTP/1.0 %d %b\r\n' % (status, message) rather than having to use
> str(status).encode('ascii') and concatenation or join().
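
As a sketch of the use case described in the quoted paragraph, the following compares the two spellings on Python 3.5 or later, where bytes %-formatting is available. The helper names `status_line` and `status_line_manual` are illustrative only and are not part of asyncio or any other library.

```python
def status_line(status, message):
    """Build an HTTP/1.0 status line with bytes %-formatting (Python 3.5+)."""
    return b'HTTP/1.0 %d %b\r\n' % (status, message)


def status_line_manual(status, message):
    """Build the same line by hand, without bytes interpolation."""
    parts = [b'HTTP/1.0', str(status).encode('ascii'), message]
    return b' '.join(parts) + b'\r\n'


print(status_line(200, b'OK'))
print(status_line_manual(200, b'OK'))
```

Both functions should produce identical bytes; the first keeps the wire format visible in a single template.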


My own dbf module [1] would make use of this feature, and I'm sure some of the pdf modules would as well (I recall 
somebody chiming in about their own pdf module).


--
~Ethan~

[1] https://pypi.python.org/pypi/dbf


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 11:59 AM, Guido van Rossum wrote:
> PS. I am not trying to be difficult. I honestly don't understand the use case
> yet, and the PEP doesn't do much to support it.


How's this?

Compatibility with Python 2
===========================

As noted above, ``%s`` is being included solely to help ease migration from,
and/or have a single code base with, Python 2.  This is important as there
are modules both in the wild and behind closed doors that currently use the
Python 2 ``str`` type as a ``bytes`` container, and hence are using ``%s``
as a bytes interpolator.

However, ``%b`` should be used in new, Python 3 only code, so ``%s`` will
immediately be deprecated, but not removed until the next major Python
release.
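
A minimal sketch of the single-code-base case this section describes, assuming Python 2.7 or Python 3.5+. The function `frame` and its header layout are made up for illustration and do not come from any real protocol.

```python
def frame(payload):
    """Prefix a bytes payload with an ASCII length header.

    On Python 2 the b'' literal is a str; on Python 3.5+ it is bytes.
    The %d and %s codes behave the same way in both cases.
    """
    return b'LEN %d\r\n%s' % (len(payload), payload)


message = frame(b'\x00\x01binary\xff')
print(message)
```

The payload is interpolated unchanged, so arbitrary binary data can follow the ASCII header.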


--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Guido van Rossum
I love it!






-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 11:41 AM, Guido van Rossum wrote:
> Much better, but I'm still not happy with including %s at all. Otherwise it's
> accept-worthy. (How's that for pressure. :-)


FWIW, I feel the same, but the need for compatible 2/3 code bases is real.

Hey, how's this?  We'll let %s in, but immediately deprecate it.  ;)  Of 
course, we won't remove it until Python IV.

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Antoine Pitrou
On Thu, 27 Mar 2014 18:47:59 +0000
Brett Cannon bcan...@gmail.com wrote:
> On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:
>
>> Much better, but I'm still not happy with including %s at all. Otherwise
>> it's accept-worthy. (How's that for pressure. :-)
>
> But if we only add %b and leave out %s then how is this going to lead to
> Python 2/3 compatible code since %b is not in Python 2? Or am I
> misunderstanding you?

I think we have reached a point where adding porting-related facilities
in 3.5 may actually slow down the pace of porting, rather than
accelerate it (because people will then wait for 3.5 to start porting
stuff).

Regards

Antoine.




Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Antoine Pitrou
On Thu, 27 Mar 2014 11:57:35 -0700
Ethan Furman et...@stoneleaf.us wrote:
> On 03/27/2014 11:41 AM, Guido van Rossum wrote:
>> Much better, but I'm still not happy with including %s at all. Otherwise
>> it's accept-worthy. (How's that for pressure. :-)
>
> FWIW, I feel the same, but the need for compatible 2/3 code bases is real.
>
> Hey, how's this?  We'll let %s in, but immediately deprecate it.  ;)  Of
> course, we won't remove it until Python IV.

I vote for an environment variable-controlled feature activation
(or with a registry key under Windows) ;)

Regards

Antoine.




Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Greg Ewing

R. David Murray wrote:
> I've done the 'landmark' thing as well, in the string context; that can be
> very useful when doing incremental test driven development.  (Granted, you
> could do that with __bytes__;

Can't you do it more easily just by wrapping ascii()
around the argument? That seems sufficient for debugging
purposes to me.

--
Greg


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Ethan Furman

On 03/27/2014 03:10 PM, Greg Ewing wrote:
> R. David Murray wrote:
>> I've done the 'landmark' thing as well, in the string context; that can be
>> very useful when doing incremental test driven development.  (Granted, you
>> could do that with __bytes__;
>
> Can't you do it more easily just by wrapping ascii()
> around the argument? That seems sufficient for debugging
> purposes to me.

The problem there is ascii() still returns unicode (okay, okay, str), so you 
still have to encode it.
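
To make the extra step concrete, here is a small sketch of writing a debugging landmark into a byte stream by hand. The `Landmark` class and the stream layout are hypothetical, chosen only to show that `ascii()` returns a `str` that must still be encoded.

```python
class Landmark(object):
    """Hypothetical marker object written into a byte stream while debugging."""

    def __repr__(self):
        return '<Landmark caf\xe9>'


text = ascii(Landmark())        # a str, with non-ASCII characters escaped
data = text.encode('ascii')     # the extra encode step needed to get bytes
stream = b'header|' + data + b'|payload'
print(stream)
```

The encode call cannot fail here, because `ascii()` only ever produces ASCII characters.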

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Victor Stinner
2014-03-25 23:37 GMT+01:00 Ethan Furman et...@stoneleaf.us:
> ``%a`` will call ``ascii()`` on the interpolated value.

I'm not sure that I understood correctly: is the %a format
supported? The result of ascii() is a Unicode string. Does it mean
that (b"%a" % obj) should give the same result as
ascii(obj).encode('ascii', 'strict')?

Would it be possible to add a table or list to summarize supported
format characters? I found:

- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, etc. (can you please
  complete "etc."?)
- bytes and __bytes__ method: %s
- ascii(): %a


I guess that the implementation of %a can avoid a conversion from
ASCII (PyUnicode_DecodeASCII in the following code) and then a
conversion to ASCII again (in bytes%args):

    PyObject *
    PyObject_ASCII(PyObject *v)
    {
        PyObject *repr, *ascii, *res;

        repr = PyObject_Repr(v);
        if (repr == NULL)
            return NULL;

        if (PyUnicode_IS_ASCII(repr))
            return repr;

        /* repr is guaranteed to be a PyUnicode object by PyObject_Repr */
        ascii = _PyUnicode_AsASCIIString(repr, "backslashreplace");
        Py_DECREF(repr);
        if (ascii == NULL)
            return NULL;

        res = PyUnicode_DecodeASCII(    /* HERE */
            PyBytes_AS_STRING(ascii),
            PyBytes_GET_SIZE(ascii),
            NULL);

        Py_DECREF(ascii);
        return res;
    }

> This is intended
> as a debugging aid, rather than something that should be used in production.

I don't understand the purpose of this sentence. Does it mean that %a
must not be used? IMO this sentence can be removed.

> Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
> representation.

Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'

> Use cases include developing a new protocol and writing
> landmarks into the stream; debugging data going into an existing protocol
> to see if the problem is the protocol itself or bad data; a fall-back for a
> serialization format; or even a rudimentary serialization format when
> defining ``__bytes__`` would not be appropriate [8].

I understand the debug use case. I'm not convinced by the serialization idea :-)

> .. note::
>
>     If a ``str`` is passed into ``%a``, it will be surrounded by quotes.

And:

- bytes gets a b prefix and is surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains both quotes ' and "

Can you also please add examples for %a?

>>> b"%a" % 123
b'123'
>>> b"%a" % b"bytes"
b"b'bytes'"
>>> b"%a" % "text"   # hum, it's not easy to see surrounding quotes with this example
b"'text'"

The following more complex examples are maybe not needed:

>>> b"%a" % "euro:€"
b"'euro:\\u20ac'"
>>> b"%a" % "quotes '\""
b'\'quotes \\\'"\''

> Proposed variations
> ===================


It would be fair to mention also a whole different PEP, Antoine's PEP 460!

Victor


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Ethan Furman

On 03/26/2014 03:10 AM, Victor Stinner wrote:
> 2014-03-25 23:37 GMT+01:00 Ethan Furman:
>> ``%a`` will call ``ascii()`` on the interpolated value.
>
> I'm not sure that I understood correctly: is the %a format
> supported? The result of ascii() is a Unicode string. Does it mean
> that (b"%a" % obj) should give the same result as
> ascii(obj).encode('ascii', 'strict')?


Changed to:
---
``%a`` will give the equivalent of
``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated
value.  Use cases include developing a new protocol and writing landmarks
into the stream; debugging data going into an existing protocol to see if
the problem is the protocol itself or bad data; a fall-back for a serialization
format; or any situation where defining ``__bytes__`` would not be appropriate
but a readable/informative representation is needed [8].
---
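
The revised definition can be checked directly on Python 3.5 or later. This is only a sketch: the sample values are arbitrary, and the loop simply compares the two spellings named in the text.

```python
samples = [3.14, b'abc', 'def', 'euro:\u20ac', None, [1, 'two']]

for obj in samples:
    # Wrap obj in a tuple so tuples and mappings would not be unpacked.
    via_percent_a = b'%a' % (obj,)
    via_repr = repr(obj).encode('ascii', 'backslashreplace')
    assert via_percent_a == via_repr
    print(via_percent_a)
```

Note that a `bytes` argument keeps its `b` prefix and quotes, and a `str` argument keeps its quotes, because both come from `repr()`.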



> Would it be possible to add a table or list to summarize supported
> format characters? I found:
>
> - single byte: %c
> - integer: %d, %u, %i, %o, %x, %X, %f, %g, etc. (can you please
>   complete "etc."?)
> - bytes and __bytes__ method: %s
> - ascii(): %a


Changed to:
---
%-interpolation
---------------

All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``,
``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added
to Python 3) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers (currently ``#``, ``0``,
``-``, `` `` (space), and ``+`` (plus any added to Python 3)).  The only
non-numeric codes allowed are ``c``, ``s``, and ``a``.

For the numeric codes, the only difference between ``str`` and ``bytes`` (or
``bytearray``) interpolation is that the results from these codes will be
ASCII-encoded text, not unicode.  In other words, for any numeric formatting
code `%x`::
---



> I don't understand the purpose of this sentence. Does it mean that %a
> must not be used? IMO this sentence can be removed.

The sentence about %a being for debugging has been removed.

>> Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
>> representation.
>
> Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'


Removed.  With the explicit reference to the 'backslashreplace' error handler any who want to know what it might look 
like can refer to that.
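
For readers who want to see what the 'backslashreplace' error handler produces, here is a short sketch covering the three escape widths. The labels and sample characters are arbitrary choices for illustration.

```python
examples = {
    'latin-1 range': '\xe9',
    'rest of the BMP': '\u20ac',
    'beyond the BMP': '\U0010ffff',
}

# Each non-ASCII character becomes a \x, \u or \U escape in the bytes result.
escaped = {
    label: char.encode('ascii', 'backslashreplace')
    for label, char in examples.items()
}

for label, value in escaped.items():
    print(label, value)
```

The handler picks the shortest escape form that can represent the code point.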




>> .. note::
>>
>>     If a ``str`` is passed into ``%a``, it will be surrounded by quotes.
>
> And:
>
> - bytes gets a b prefix and is surrounded by quotes as well (b'...')
> - the quote ' is escaped as \' if the string contains both quotes ' and "

Shouldn't be an issue now with the new definition which no longer references
the ascii() function.



> Can you also please add examples for %a?

---
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b'abc'

    >>> b'%a' % 'def'
    b"'def'"
---

>> Proposed variations
>> ===================
>
> It would be fair to mention also a whole different PEP, Antoine's PEP 460!


My apologies for the omission.
---
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9], also
exists.

.. [9] http://python.org/dev/peps/pep-0460/
---

Thank you, Victor.


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Thomas Wouters
On Tue, Mar 25, 2014 at 11:37 PM, Ethan Furman et...@stoneleaf.us wrote:

> In particular, ``%s`` will not accept numbers (use a numeric format code
> for that), nor ``str`` (encode it to ``bytes``).

I don't understand this restriction, and there isn't a rationale for it in
the PEP (other than "you can already use numeric formats", which doesn't
explain why it's undesirable to have it anyway). It is extremely common in
existing 2.x code to use %s for anything, just like people use {} for
anything with str.format. Not supporting this feels like it would be
problematic for porting code. Did this come up in the earlier discussions?

-- 
Thomas Wouters tho...@python.org

Hi! I'm an email virus! Think twice before sending your email to help me
spread!


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Ethan Furman

On 03/26/2014 08:14 AM, Thomas Wouters wrote:
> On Tue, Mar 25, 2014 at 11:37 PM, Ethan Furman wrote:
>> In particular, ``%s`` will not accept numbers (use a numeric format code for
>> that), nor ``str`` (encode it to ``bytes``).
>
> I don't understand this restriction, and there isn't a rationale for it in
> the PEP (other than "you can already use numeric formats", which doesn't
> explain why it's undesirable to have it anyway). It is extremely common in
> existing 2.x code to use %s for anything

And that's the problem -- in 2.x %s always works, but in 3.x, for bytes and
bytearray, %s will fail in numerous situations.

It seems to me the main reason for using %s instead of %d is that 'some_var' may have a number, or it may have the 
textual representation of that number; in 3.x the first would succeed, the second would fail.  That's the kind of 
intermittent failure we do not want.


The PEP is not designed to make it so 2.x code can be ported as-is, but rather that 2.x code can be cleaned up (if 
necessary) and then run the same in both 2.x and 3.x (at least as far as byte and bytearray %-formatting is concerned).
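
The value-dependent behaviour discussed here can be sketched as follows on Python 3.5+. The helpers `try_format` and `as_count` are hypothetical names, and normalizing with `int()` is just one possible clean-up, not something the PEP prescribes.

```python
def try_format(fmt, value):
    """Return the formatted bytes, or the exception type name on failure."""
    try:
        return fmt % (value,)
    except TypeError as exc:
        return type(exc).__name__


# The same variable may arrive as a number or as its textual form.
for some_var in (42, b'42'):
    print(repr(some_var),
          try_format(b'%d', some_var),
          try_format(b'%s', some_var))


def as_count(value):
    """Hypothetical clean-up step: normalize to int before formatting."""
    return int(value)  # int() accepts both 42 and b'42'


line = b'count=%d' % as_count(b'42')
print(line)
```

Each format code accepts only one of the two forms, so deciding on the type once at the boundary makes the formatting call behave the same way on every run.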



> Did this come up in the earlier discussions?


https://mail.python.org/pipermail/python-dev/2014-January/131576.html

--
~Ethan~


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Victor Stinner
2014-03-26 15:35 GMT+01:00 Ethan Furman et...@stoneleaf.us:
> ---
> Examples::
>
>     >>> b'%a' % 3.14
>     b'3.14'
>
>     >>> b'%a' % b'abc'
>     b'abc'

This one is wrong:

>>> repr(b'abc').encode('ascii', 'backslashreplace')
b"b'abc'"

Victor


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-26 Thread Ethan Furman

On 03/26/2014 02:41 PM, Victor Stinner wrote:
> 2014-03-26 15:35 GMT+01:00 Ethan Furman et...@stoneleaf.us:
>> ---
>> Examples::
>>
>>     >>> b'%a' % 3.14
>>     b'3.14'
>>
>>     >>> b'%a' % b'abc'
>>     b'abc'
>
> This one is wrong:
>
> >>> repr(b'abc').encode('ascii', 'backslashreplace')
> b"b'abc'"


Fixed, thanks.

--
~Ethan~


[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-25 Thread Ethan Furman

Okay, I included that last round of comments (from late February).

Barring typos, this should be the final version.

Final comments?

-
PEP: 461
Title: Adding % formatting to bytes and bytearray
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman et...@stoneleaf.us
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-01-13
Python-Version: 3.5
Post-History: 2014-01-14, 2014-01-15, 2014-01-17, 2014-02-22, 2014-03-25
Resolution:


Abstract
========

This PEP proposes adding % formatting operations similar to Python 2's ``str``
type to ``bytes`` and ``bytearray`` [1]_ [2]_.


Rationale
=========

While interpolation is usually thought of as a string operation, there are
cases where interpolation on ``bytes`` or ``bytearrays`` makes sense, and the
work needed to make up for this missing functionality detracts from the overall
readability of the code.


Motivation
==========

With Python 3 and the split between ``str`` and ``bytes``, one small but
important area of programming became slightly more difficult, and much more
painful -- wire format protocols [3]_.

This area of programming is characterized by a mixture of binary data and
ASCII compatible segments of text (aka ASCII-encoded text).  Bringing back a
restricted %-interpolation for ``bytes`` and ``bytearray`` will aid both in
writing new wire format code, and in porting Python 2 wire format code.

Common use-cases include ``dbf`` and ``pdf`` file formats, ``email``
formats, and ``FTP`` and ``HTTP`` communications, among many others.


Proposed semantics for ``bytes`` and ``bytearray`` formatting
=============================================================

%-interpolation
---------------

All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``,
``%g``, etc.) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers.  The only difference
will be that the results from these codes will be ASCII-encoded text, not
unicode.  In other words, for any numeric formatting code `%x`::

    b"%x" % val

is equivalent to::

    ("%x" % val).encode("ascii")

Examples::

    >>> b'%4x' % 10
    b'   a'

    >>> b'%#4x' % 10
    b' 0xa'

    >>> b'%04X' % 10
    b'000A'
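
The stated equivalence can be spot-checked on Python 3.5 or later. The function name `check_numeric_equivalence` and the chosen cases are illustrative assumptions, not part of the PEP.

```python
def check_numeric_equivalence(fmt, value):
    """Return True if bytes and str %-formatting agree for one numeric code."""
    as_bytes = fmt.encode('ascii') % value
    as_text = (fmt % value).encode('ascii')
    return as_bytes == as_text


cases = [
    ('%4x', 10),
    ('%#4x', 10),
    ('%04X', 10),
    ('%d', -42),
    ('%o', 8),
    ('%+.2f', 3.14159),
    ('%e', 1234.5),
]

results = {fmt: check_numeric_equivalence(fmt, value) for fmt, value in cases}
print(results)
```

Every entry is expected to be True, since only the result type differs between the two forms.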

``%c`` will insert a single byte, either from an ``int`` in range(256), or from
a ``bytes`` argument of length 1, not from a ``str``.

Examples::

    >>> b'%c' % 48
    b'0'

    >>> b'%c' % b'a'
    b'a'
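
A small sketch of the accepted and rejected ``%c`` arguments, assuming Python 3.5+. The helper `describe` is a made-up name; it reports the exception type instead of raising so that all cases can be listed together.

```python
def describe(value):
    """Return the result of b'%c' % value, or the exception type name."""
    try:
        return b'%c' % (value,)
    except (TypeError, OverflowError) as exc:
        return type(exc).__name__


# int in range, single byte, upper bound, out of range, str, two bytes
results = [describe(v) for v in (48, b'a', 255, 256, 'a', b'ab')]
print(results)
```

The first three arguments produce a single byte each; the remaining three are rejected with an exception.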

``%s`` is included for two reasons:  1) `b` is already a format code for
``format`` numerics (binary), and 2) it will make 2/3 code easier as Python 2.x
code uses ``%s``; however, it is restricted in what it will accept::

  - input type supports ``Py_buffer`` [6]_?
    use it to collect the necessary bytes

  - input type is something else?
    use its ``__bytes__`` method [7]_ ; if there isn't one, raise a
    ``TypeError``

In particular, ``%s`` will not accept numbers (use a numeric format code for
that), nor ``str`` (encode it to ``bytes``).

Examples::

    >>> b'%s' % b'abc'
    b'abc'

    >>> b'%s' % 'some string'.encode('utf8')
    b'some string'

    >>> b'%s' % 3.14
    Traceback (most recent call last):
    ...
    TypeError: b'%s' does not accept numbers, use a numeric code instead

    >>> b'%s' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: b'%s' does not accept 'str', it must be encoded to `bytes`
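
The two acceptance rules for ``%s`` (buffer support, then ``__bytes__``) can be sketched like this on Python 3.5+. `Packet` and `interpolate` are hypothetical names used only for this illustration.

```python
class Packet(object):
    """Hypothetical type that knows how to serialize itself."""

    def __init__(self, seq):
        self.seq = seq

    def __bytes__(self):
        return b'PKT:%d' % self.seq


def interpolate(value):
    """Return b'<%s>' % value, or the exception type name on failure."""
    try:
        return b'<%s>' % (value,)
    except TypeError as exc:
        return type(exc).__name__


results = {
    'bytes': interpolate(b'abc'),
    'bytearray': interpolate(bytearray(b'abc')),
    'memoryview': interpolate(memoryview(b'abc')),
    '__bytes__': interpolate(Packet(7)),
    'float': interpolate(3.14),
    'str': interpolate('hello world!'),
}
print(results)
```

The buffer-supporting types and the ``__bytes__`` provider are interpolated; the float and the ``str`` are rejected with ``TypeError``.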


``%a`` will call ``ascii()`` on the interpolated value.  This is intended
as a debugging aid, rather than something that should be used in production.
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
representation.  Use cases include developing a new protocol and writing
landmarks into the stream; debugging data going into an existing protocol
to see if the problem is the protocol itself or bad data; a fall-back for a
serialization format; or even a rudimentary serialization format when
defining ``__bytes__`` would not be appropriate [8].

.. note::

    If a ``str`` is passed into ``%a``, it will be surrounded by quotes.


Unsupported codes
-----------------

``%r`` (which calls ``__repr__`` and returns a ``str``) is not supported.


Proposed variations
===================

It was suggested to let ``%s`` accept numbers, but since numbers have their own
format codes this idea was discarded.

It has been suggested to use ``%b`` for bytes as well as ``%s``.  This was
rejected as not adding any value either in clarity or simplicity.

It has been proposed to automatically use ``.encode('ascii','strict')`` for
``str`` arguments to ``%s``.

  - Rejected as this would lead to intermittent failures.  Better to have the
    operation always fail so the trouble-spot can be correctly fixed.
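
The kind of intermittent failure meant here can be sketched as follows. `implicit_ascii` is a hypothetical stand-in for the rejected behaviour, written with an explicit ``encode`` call because the real ``%s`` does not accept ``str``.

```python
def implicit_ascii(value):
    """Mimic the rejected proposal: silently ASCII-encode str arguments."""
    return b'name=%s' % value.encode('ascii', 'strict')


ok = implicit_ascii('alice')          # works for ASCII-only input

try:
    implicit_ascii('z\xfcrich')       # fails only when non-ASCII data shows up
    failed = False
except UnicodeEncodeError:
    failed = True

print(ok, failed)
```

The call site looks correct during testing with ASCII data and breaks later, which is the situation the rejection aims to avoid.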

It has been proposed to have ``%s`` return the ascii-encoded repr when the
value is a ``str`` (b'%s' % 'abc'  --> b"'abc'").

  - Rejected as this would lead to hard to debug failures far from the problem
    site.  Better to

Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-25 Thread Daniel Holth
I love it.
