Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Mar 29, 2014, at 2:53 PM, Gregory P. Smith g...@krypto.org wrote:

On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote:

On Thu, 27 Mar 2014 18:47:59 + Brett Cannon bcan...@gmail.com wrote:

On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:

Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-)

But if we only add %b and leave out %s then how is this going to lead to Python 2/3 compatible code since %b is not in Python 2? Or am I misunderstanding you?

I think we have reached a point where adding porting-related facilities in 3.5 may actually slow down the pace of porting, rather than accelerate it (because people will then wait for 3.5 to start porting stuff).

I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here.

Mercurial is one of the large Python 2.4-2.7 code bases that needs this feature in order to support Python 3 in a sane manner. (+Augie Fackler to look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm usefulness)

That looks sufficient to me - the biggest thing is being able to do "abort: %s is broken" % some_filename_that_is_bytes and have that work sanely, as well as the numerics. This looks like exactly what we need, but I'd love to test it soon (I'm happy to build a 3.5 from tip for testing) so that if it's not Right[0] changes can be made before it's permanent.

Feel encouraged to CC me on patches or something for testing (or mail me directly when it lands).

Thanks!
AF

-gps

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
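For illustration, here is a minimal sketch of the case Augie describes, assuming Python 3.5 or later (where PEP 461 is implemented). The filename and the numbers are made up for the example and do not come from Mercurial.

```python
# Hypothetical filename as the OS might hand it back: raw bytes, not str.
some_filename_that_is_bytes = b"data/caf\xc3\xa9.txt"

# Bytes interpolated into a bytes template; no decoding or encoding happens.
message = b"abort: %s is broken" % some_filename_that_is_bytes

# "The numerics": numeric codes produce ASCII digits inside the bytes result.
progress = b"%d files, %.1f%% done" % (3, 42.5)

print(message)
print(progress)
```

The same two expressions also run on Python 2, where the literals are plain str, which is the 2/3-compatible property the thread is after.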
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 4/12/2014 11:08 AM, Augie Fackler wrote: On Mar 29, 2014, at 2:53 PM, Gregory P. Smith g...@krypto.org mailto:g...@krypto.org wrote: On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net mailto:solip...@pitrou.net wrote: On Thu, 27 Mar 2014 18:47:59 + Brett Cannon bcan...@gmail.com mailto:bcan...@gmail.com wrote: On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org mailto:gu...@python.org wrote: Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-) But if we only add %b and leave out %s then how is this going to lead to Python 2/3 compatible code since %b is not in Python 2? Or am I misunderstanding you? I think we have reached a point where adding porting-related facilities in 3.5 may actually slow down the pace of porting, rather than accelerate it (because people will then wait for 3.5 to start porting stuff). I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here. Mercurial is one of the large Python 2.4-2.7 code bases that needs this feature in order to support Python 3 in a sane manner. (+Augie Fackler to look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm usefulness) That looks sufficient to me - the biggest thing is being able to do abort: %s is broken % some_filename_that_is_bytes and have that work sanely, as well as the numerics. This looks like exactly what we need, but I'd love to test it soon (I'm happy to build a 3.5 from tip for testing) so that if it's not Right[0] changes can be made before it's permanent. Feel encouraged to CC me on patches or something for testing (or mail me directly when it lands). 
Add yourself as nosy to http://bugs.python.org/issue20284 "patch to implement PEP 461 (%-interpolation for bytes)". Indeed, you could help test the latest version, and others as they are posted.

-- Terry Jan Reedy
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net wrote: On Thu, 27 Mar 2014 18:47:59 + Brett Cannon bcan...@gmail.com wrote: On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote: Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-) But if we only add %b and leave out %s then how is this going to lead to Python 2/3 compatible code since %b is not in Python 2? Or am I misunderstanding you? I think we have reached a point where adding porting-related facilities in 3.5 may actually slow down the pace of porting, rather than accelerate it (because people will then wait for 3.5 to start porting stuff). I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here. Mercurial is one of the large Python 2.4-2.7 code bases that needs this feature in order to support Python 3 in a sane manner. (+Augie Fackler to look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm usefulness) -gps ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Sat, 29 Mar 2014 11:53:45 -0700 Gregory P. Smith g...@krypto.org wrote:

I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here.

"Roadblocks" is an unjustified term here. Important code bases such as Tornado achieved this a long time ago. While lack of bytes formatting does make porting harder, it is not a roadblock as in "you can't work around it".

Mercurial is one of the large Python 2.4-2.7 code bases that needs this feature in order to support Python 3 in a sane manner. (+Augie Fackler to look at the latest http://legacy.python.org/dev/peps/pep-0461/ to confirm usefulness)

http://www.selenic.com/pipermail/mercurial-devel/2014-March/057474.html

Regards

Antoine.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 3:47 AM, Victor Stinner victor.stin...@gmail.com wrote:

The PEP 461 looks good to me. It's a nice addition to Python 3.5 and the PEP is well defined.

+1

I can help to implement it. Maybe, it would be nice to provide an implementation as a third-party module on PyPI for Python 2.6-3.4.

That is possible and would enable bytes formatting on earlier 3.x versions. I'm not sure if there is any value in backporting to 2.x as those already have such formatting with Python 2's str.__mod__ % operator.

Though I don't know what it'd look like as an API as a module. Brainstorming: It'd either involve function calls to format instead of % or a container class to wrap format strings in with a __mod__ method that calls the bytes formatting code instead of native str % formatting when needed.

From a 2.x-3.x compatible code standpoint the above could exist but the container class constructor would be a no-op on Python 2.

    if sys.version_info[0] == 2:
        BytesFormatter = str
    else:
        class BytesFormatter:
            ...
            def __mod__ ...

-gps

Note: I fixed a typo in your PEP (reST syntax).

Victor

2014-03-26 23:47 GMT+01:00 Ethan Furman et...@stoneleaf.us:

This one is wrong:

    repr(b'abc').encode('ascii', 'backslashreplace')
    b"b'abc'"

Fixed, thanks.
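The container-class idea can be fleshed out roughly as follows. This is only a sketch under stated assumptions: the BytesFormatter name comes from the message above, but the body is invented for illustration, it handles just %b, %s, %d and %%, and it is not the API of any published backport.

```python
import re
import sys

if sys.version_info[0] == 2:
    # On Python 2, str already supports % formatting, so wrapping is a no-op.
    BytesFormatter = str
else:
    class BytesFormatter(object):
        """Hypothetical wrapper: only %b, %s, %d and %% are understood."""

        _spec = re.compile(br"%([bsd%])")

        def __init__(self, fmt):
            self._fmt = bytes(fmt)

        def __mod__(self, args):
            if not isinstance(args, tuple):
                args = (args,)
            remaining = list(args)

            def substitute(match):
                code = match.group(1)
                if code == b"%":
                    return b"%"
                if not remaining:
                    raise TypeError("not enough arguments for format string")
                value = remaining.pop(0)
                if code == b"d":
                    return str(int(value)).encode("ascii")
                # %b and %s: bytes only, no implicit encoding of str or numbers.
                if not isinstance(value, (bytes, bytearray)):
                    raise TypeError("%b requires bytes, not "
                                    + type(value).__name__)
                return bytes(value)

            result = self._spec.sub(substitute, self._fmt)
            if remaining:
                raise TypeError("not all arguments converted")
            return result


print(BytesFormatter(b"abort: %s is broken") % b"f.txt")
```

Calling code writes BytesFormatter(b"...") % args on both major versions; only the class behind the name differs.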
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 3/29/2014 12:01 PM, Gregory P. Smith wrote:

From a 2.x-3.x compatible code standpoint the above could exist but the container class constructor would be a no-op on Python 2.

    if sys.version_info[0] == 2:
        BytesFormatter = str
    else:
        class BytesFormatter:
            ...
            def __mod__ ...

If done as a container class, the Python 2 version should implement the same restrictions on %s for numerics, and implement %b.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/29/2014 11:59 AM, Antoine Pitrou wrote: On Sat, 29 Mar 2014 11:53:45 -0700 Gregory P. Smith wrote: I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here. Roadblocks is an unjustified term here. It's actually quite appropriate: to get around a physical roadblock you would have to leave the road, forge through lumpy ground and stinging nettles, and then get back on the road. A very good analogy, actually. ;) -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 30 March 2014 07:01, Ethan Furman et...@stoneleaf.us wrote: On 03/29/2014 11:59 AM, Antoine Pitrou wrote: On Sat, 29 Mar 2014 11:53:45 -0700 Gregory P. Smith wrote: I understand that sentiment but that is an unjustified fear. It is not a good reason not to do it. Projects are already trying to port stuff today and running into roadblocks when it comes to ascii-compatible bytes formatting for real world data formats in code needing to be 2.x compatible. I'm pulling out my practicality beats purity card here. Roadblocks is an unjustified term here. It's actually quite appropriate: to get around a physical roadblock you would have to leave the road, forge through lumpy ground and stinging nettles, and then get back on the road. A very good analogy, actually. ;) I tend to call them barriers to migration. Up to Python 3.4, my focus has been more on general barriers to entry for Python 3 that applied as much or more to new users as they did to existing ones - hence working on getting pip incorporated, providing a better path to mastery for the codec system, helping Larry with Argument Clinic, helping Eric with the simpler import customisation, trying to help improve the integration with the POSIX text model, assorted tweaks to make the type system more accessible etc. I think Python 3.4 is now in a pretty good place on that front, particularly with Larry stating up front that he considers the ongoing rollout of Argument Clinic usage to be in scope for Python 3.4.x maintenance releases. So for 3.5, I think it makes sense to focus on those barriers to migration and other activities that benefit existing Python 2 users more so than users that are completely new to Python and starting directly with Python 3. Binary interpolation is a big one (thanks Ethan!), as is the proposed policy change to allow network security features to evolve within Python 2.7 maintenance releases. 
Our community has done a lot of work to support us in our goal of modernising and migrating a large fraction of the ecosystem to a new version of the language, even though the full implications of the revised models for binary and text data turned out to be more profound than I think any of us realised back in 2006 when Guido first turned the previously hypothetical Py3k into a genuine active effort to create a new revision of the language, better suited to the global nature of the 21st century. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 3:05 PM, Antoine Pitrou solip...@pitrou.net mailto:solip...@pitrou.net wrote: I think we have reached a point where adding porting-related facilities AFAIK, The only porting specific feature is %s as a synonym for %b. Not pretty, but tolerable. Otherwise, I have the impression that the PEP pretty much stands on its own. in 3.5 may actually slow down the pace of porting, rather than accelerate it (because people will then wait for 3.5 to start porting stuff). Or, they should download the source and compile and continue or start porting as soon as the bytes % is added. Having earlier Windows and Mac preview binaries might help a tiny bit. If you are saying that Py3 development should not be driven by Py2 concerns, I agree. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 30 March 2014 05:01, Gregory P. Smith g...@krypto.org wrote: On Thu, Mar 27, 2014 at 3:47 AM, Victor Stinner victor.stin...@gmail.com wrote: The PEP 461 looks good to me. It's a nice addition to Python 3.5 and the PEP is well defined. +1 I can help to implement it. Maybe, it would be nice to provide an implementation as a third-party party module on PyPI for Python 2.6-3.4. That is possible and would enable bytes formatting on earlier 3.x versions. I'm not sure if there is any value in backporting to 2.x as those already have such formatting with Python 2's str.__mod__ % operator. Though I don't know what it'd look like as an API as a module. Brainstorming: It'd either involve function calls to format instead of % or a container class to wrap format strings in with a __mod__ method that calls the bytes formatting code instead of native str % formatting when needed. The future project already contains a full backport of a true bytes type, rather than relying on Python 2 str objects: http://python-future.org/what_else.html#bytes It seems to me that the easiest way to make any forthcoming Python 3.5 enhancements (both binary interpolation and the other cleanups we are discussing over on Python ideas) available to single source 2/3 code bases is to commit to an API freeze for *those particular builtins* early, and then update future accordingly. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 3/29/2014 8:28 PM, Nick Coghlan wrote: The future project already contains a full backport of a true bytes type, rather than relying on Python 2 str objects: http://python-future.org/what_else.html#bytes That project looks really nice! It seems to me that the easiest way to make any forthcoming Python 3.5 enhancements (both binary interpolation and the other cleanups we are discussing over on Python ideas) available to single source 2/3 code bases is to commit to an API freeze for *those particular builtins* early, and then update future accordingly. I agree. I think syntax changes should be in by the first alpha, if not before. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 04:26 AM, Nick Coghlan wrote:

On 27 March 2014 20:47, Victor Stinner victor.stin...@gmail.com wrote:

The PEP 461 looks good to me. It's a nice addition to Python 3.5 and the PEP is well defined.

+1 from me as well. One minor request is that I don't think the rationale for rejecting numbers from %s is complete [...]

Changed to:

In particular, ``%s`` will not accept numbers nor ``str``. ``str`` is rejected as the string to bytes conversion requires an encoding, and we are refusing to guess; numbers are rejected because:

- what makes a number is fuzzy (float? Decimal? Fraction? some user type?)
- allowing numbers would lead to ambiguity between numbers and textual representations of numbers (3.14 vs '3.14')
- given the nature of wire formats, explicit is definitely better than implicit

Note: I fixed a typo in your PEP (reST syntax).

I also committed a couple of markup tweaks, since it seemed easier to just fix them than explain what was broken.

Thanks to both of you for that.

However, there are also two dead footnotes (4 5), which I have left alone - I'm not sure if the problem is a missing reference, or if the footnote can go away now.

Fixed.

-- ~Ethan~
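The rule in the revised PEP text can be checked directly. A small sketch, assuming Python 3.5 or later; the outcome helper is a name made up for this example.

```python
def outcome(fmt, value):
    """Return the formatted bytes, or the exception name if formatting fails."""
    try:
        return fmt % value
    except TypeError as exc:
        return type(exc).__name__

# %s takes bytes, never numbers or str.
accepted = outcome(b"%s", b"3.14")
rejected_number = outcome(b"%s", 3.14)
rejected_text = outcome(b"%s", "3.14")

# Numbers go through an explicit numeric code instead.
as_float = outcome(b"%.2f", 3.14159)
as_int = outcome(b"%d", 3)

print(accepted, rejected_number, rejected_text, as_float, as_int)
```

The 3.14 versus '3.14' ambiguity from the list above never arises, because the caller has to pick %s for bytes or a numeric code for numbers.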
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
The PEP 461 looks good to me. It's a nice addition to Python 3.5 and the PEP is well defined.

I can help to implement it. Maybe, it would be nice to provide an implementation as a third-party module on PyPI for Python 2.6-3.4.

Note: I fixed a typo in your PEP (reST syntax).

Victor

2014-03-26 23:47 GMT+01:00 Ethan Furman et...@stoneleaf.us:

This one is wrong:

    repr(b'abc').encode('ascii', 'backslashreplace')
    b"b'abc'"

Fixed, thanks.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Tue, 25 Mar 2014 15:37:11 -0700 Ethan Furman et...@stoneleaf.us wrote: ``%a`` will call ``ascii()`` on the interpolated value. This is intended as a debugging aid, rather than something that should be used in production. Non-ASCII values will be encoded to either ``\xnn`` or ``\u`` representation. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8]. The use cases you are enumerating for %a are chimeric. Did you *actually* do those things in real life, or are you inventing them for the PEP? Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
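To make the quoted definition of ``%a`` concrete, here is a short sketch assuming Python 3.5 or later. The sample strings are invented; the point is only that ``%a`` matches calling ascii() and then ASCII-encoding the result.

```python
value = "caf\u00e9"  # hypothetical text containing one non-ASCII character

# What %a produces inside a bytes template.
via_percent_a = b"%a" % value

# The same thing spelled out by hand.
by_hand = ascii(value).encode("ascii", "backslashreplace")

# A character above U+00FF comes out as a \u escape rather than \x.
euro = b"%a" % "\u20ac"

print(via_percent_a, by_hand, euro)
```

The result is the repr-style text of the object, quotes included, which is why the PEP frames it as a debugging aid rather than a wire format.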
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 27 March 2014 20:47, Victor Stinner victor.stin...@gmail.com wrote:

The PEP 461 looks good to me. It's a nice addition to Python 3.5 and the PEP is well defined.

+1 from me as well. One minor request is that I don't think the rationale for rejecting numbers from %s is complete - IIRC, the problem there is that the normal path for handling those is the coercion via str() and this proposal deliberately *doesn't* allow that path. That means supporting numbers would mean writing a lot of *additional* code, and that isn't needed since 2/3 compatible code can just be adjusted to use an appropriate numeric code.

Note: I fixed a typo in your PEP (reST syntax).

I also committed a couple of markup tweaks, since it seemed easier to just fix them than explain what was broken.

However, there are also two dead footnotes (4 5), which I have left alone - I'm not sure if the problem is a missing reference, or if the footnote can go away now.

Cheers, Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 27 March 2014 21:24, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 25 Mar 2014 15:37:11 -0700 Ethan Furman et...@stoneleaf.us wrote: ``%a`` will call ``ascii()`` on the interpolated value. This is intended as a debugging aid, rather than something that should be used in production. Non-ASCII values will be encoded to either ``\xnn`` or ``\u`` representation. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8]. The use cases you are enumerating for %a are chimeric. Did you *actually* do those things in real life, or are you inventing them for the PEP? I'm the one that raised the discourage misuse of __bytes__ concern, so I'd like %a to stay in at least for that reason. %a is a perfectly well defined format code (albeit one you'd only be likely to use while messing about with serialisation protocols, as the PEP describes - for example, if a %b code was ending up producing wrong data, you might switch to %a temporarily to get a better idea of where the bad data was coming from), while using __bytes__ to make %s behave the way %a is defined in the PEP would just be wrong in most cases. I consider %a the preemptive PEP 308 of binary interpolation format codes - in the absence of %a, I'm certain that users would end up abusing __bytes__ and %s to get the same effect, just as they used the known bug magnet that was the and/or hack for a long time in the absence of PEP 308. I also seem to recall Guido saying he liked it, which flipped the discussion from do we have a good rationale for including it? to do we have a good rationale for the BDFL to ignore his instincts?. 
However, it would be up to Guido to confirm that recollection, and if Guido likes it is part of the reason for inclusion of the %a code, the PEP should mention that explicitly. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, 27 Mar 2014 12:24:49 +0100, Antoine Pitrou solip...@pitrou.net wrote:

On Tue, 25 Mar 2014 15:37:11 -0700 Ethan Furman et...@stoneleaf.us wrote:

``%a`` will call ``ascii()`` on the interpolated value. This is intended as a debugging aid, rather than something that should be used in production. Non-ASCII values will be encoded to either ``\xnn`` or ``\u`` representation. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8].

The use cases you are enumerating for %a are chimeric. Did you *actually* do those things in real life, or are you inventing them for the PEP?

The use cases came from someone else (Jim Jewett?) so you should be asking him, not Ethan :)

As for the "did you actually do those things in real life", I know I've done the "dump the repr into the data (protocol) stream to see what I've really got here" debug trick in the string context, so I have no doubt that I will want to do it in the bytes context as well. In fact, it is probably somewhat more likely in the bytes context, since I know I've been in situations with data exchange protocols where I couldn't get console output and setting up logging was much more painful than just dumping the debug data into the data stream. Or where doing so made it much clearer what was going on than separate logging would.

I've done the 'landmark' thing as well, in the string context; that can be very useful when doing incremental test driven development. (Granted, you could do that with __bytes__; you might well be writing a __bytes__ method anyway as the next step, but it *is* more overhead/boilerplate than just starting with %a...and it gets people used to reaching for __bytes__ for the wrong purpose, which is Nick's concern).

In theory I can see using %a for serialization in certain limited contexts (I've done that with string repr in private utility scripts), but in practice I doubt that would happen in a binary context, since those are much more likely to be actually going over a wire of some sort (ie: places you really don't want to use eval even when it would work).

So yeah, I think %a has *practical* utility.

--David
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 04:24 AM, Antoine Pitrou wrote:

On Tue, 25 Mar 2014 15:37:11 -0700 Ethan Furman wrote:

``%a`` will call ``ascii()`` on the interpolated value. This is intended as a debugging aid, rather than something that should be used in production. Non-ASCII values will be encoded to either ``\xnn`` or ``\u`` representation. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8].

The use cases you are enumerating for %a are chimeric.

Cool word! Haven't seen it in a long time. :)

Did you *actually* do those things in real life, or are you inventing them for the PEP?

The examples came from Jim Jewett, but I can easily see myself using them.

-- ~Ethan~
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 04:42 AM, Nick Coghlan wrote: I also seem to recall Guido saying he liked it [%a], which flipped the discussion from do we have a good rationale for including it? to do we have a good rationale for the BDFL to ignore his instincts?. However, it would be up to Guido to confirm that recollection, and if Guido likes it is part of the reason for inclusion of the %a code, the PEP should mention that explicitly. I checked Guido's posts (Subject contains PEP 461, From contains guido) and did not see anything to that effect. -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Actually, I had ignored this discussion for so long that I was surprised by the outcome. My main use case isn't printing a number that may already be a string (I understand why that isn't reasonable when the output is expected to be bytes); it's printing a usually numeric value that may sometimes be None. It's a little surprising to have to use %a for this, but I guess I can live with it. On Thu, Mar 27, 2014 at 8:58 AM, Ethan Furman et...@stoneleaf.us wrote: On 03/27/2014 04:42 AM, Nick Coghlan wrote: I also seem to recall Guido saying he liked it [%a], which flipped the discussion from do we have a good rationale for including it? to do we have a good rationale for the BDFL to ignore his instincts?. However, it would be up to Guido to confirm that recollection, and if Guido likes it is part of the reason for inclusion of the %a code, the PEP should mention that explicitly. I checked Guido's posts (Subject contains PEP 461, From contains guido) and did not see anything to that effect. -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
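Guido's "usually numeric, sometimes None" case looks roughly like this in practice. A sketch assuming Python 3.5 or later; the header name and the content_length_line helper are made up for the example.

```python
def content_length_line(length):
    # %a calls ascii() on the value, so both ints and None are accepted.
    return b"Content-Length: %a\r\n" % (length,)

with_number = content_length_line(128)
with_none = content_length_line(None)

# A numeric code, by contrast, refuses None outright.
try:
    b"%d" % None
    d_accepts_none = True
except TypeError:
    d_accepts_none = False

print(with_number, with_none, d_accepts_none)
```

Whether writing the literal text None onto the wire is acceptable depends on the protocol; the sketch only shows that %a will do it while %d will not.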
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
I also don't understand why we can't use %b instead of %s. AFAIK %b currently doesn't mean anything and I somehow don't expect we're likely to add it for other reasons (unless there's a proposal I'm missing?). Just like we use %a instead of %r to remind people that it's not quite the same (since it applies .encode('ascii', 'backslashreplace')), shouldn't we use anything *but* %s to remind people that that is also not the same (not at all, in fact)? The PEP's argument against %b ("rejected as not adding any value either in clarity or simplicity") is hardly a good reason.

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 2014-03-27 15:58, Ethan Furman wrote:
> On 03/27/2014 04:42 AM, Nick Coghlan wrote:
>> I also seem to recall Guido saying he liked it [%a], which flipped the discussion from "do we have a good rationale for including it?" to "do we have a good rationale for the BDFL to ignore his instincts?". However, it would be up to Guido to confirm that recollection, and if "Guido likes it" is part of the reason for inclusion of the %a code, the PEP should mention that explicitly.
>
> I checked Guido's posts (Subject contains PEP 461, From contains guido) and did not see anything to that effect.

Date: Mon, 13 Jan 2014 12:09:23 -0800
Subject: Re: [Python-Dev] PEP 460 reboot

If we have %b for strictly interpolating bytes, I'm fine with adding %a for calling ascii() on the argument and then interpolating the result after ASCII-encoding it.
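For illustration, the %a behaviour described in the quoted January message can be sketched as below. This assumes Python 3.5 or later, where PEP 461 was eventually implemented; the helper name `format_field` and the `field=` template are made up for the example and do not come from the thread.

```python
# Sketch of %a on bytes, assuming Python 3.5+ (where PEP 461 landed).
# %a calls ascii() on the argument and ASCII-encodes the result.

def format_field(value):
    """Hypothetical helper: render any object into a bytes template."""
    return b'field=%a' % (value,)

numeric = format_field(42)        # ascii(42) is '42'
missing = format_field(None)      # ascii(None) is 'None'
text = format_field('caf\xe9')    # ascii() backslash-escapes non-ASCII characters

print(numeric, missing, text)
```

This covers the "usually numeric value that may sometimes be None" case: both inputs go through the same conversion code without raising.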
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Wow. I'm pretty consistent. I still like that. :-)

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 10:29 AM, Guido van Rossum wrote:
> I also don't understand why we can't use %b instead of %s. [...] The PEP's argument against %b ("rejected as not adding any value either in clarity or simplicity") is hardly a good reason.

The biggest reason to use %s is to support a common code base for 2/3 endeavors. The biggest reason to not include %b is that it means binary number in format(); given that each type can invent its own mini-language, this probably isn't a very strong argument against it.

I have moderate feelings for keeping %s as a synonym for %b for backwards compatibility with Py2 code (when it's appropriate).

-- ~Ethan~
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 10:55 AM, Ethan Furman et...@stoneleaf.us wrote:
> The biggest reason to use %s is to support a common code base for 2/3 endeavors.

But it's mostly useless for that purpose. In Python 2, in practice %s doesn't mean "string". It means "use the default formatting just as if I was using print". And in theory it also means that -- in fact "call __str__()" is the formal definition, and print is also defined as using __str__, and this is all intentional. (I also intended __str__ to be *mostly* the same as __repr__, with a specific exception for the str type itself. In practice some frameworks have adopted a different interpretation, making __repr__ produce something *more* user friendly than __str__ but including newlines, because some people believe the main use case for __repr__ is the interactive prompt. I believe this causes problems for some *other* uses of __repr__, such as for producing an unambiguous representation useful for e.g. logging -- but I don't want to be too bitter about it. :-)

> The biggest reason to not include %b is that it means binary number in format(); given that each type can invent its own mini-language, this probably isn't a very strong argument against it.

Especially since I can't imagine the spelling in format() includes '%'.

> I have moderate feelings for keeping %s as a synonym for %b for backwards compatibility with Py2 code (when it's appropriate).

I think its mere existence (with the restrictions currently in the PEP) would cause more confusion than that is worth.

--Guido van Rossum (python.org/~guido)
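The difference between text %s and bytes %s that this message turns on can be sketched as follows. This is a minimal illustration, assuming Python 3.5 or later; the `status:` template is invented for the example.

```python
# Sketch contrasting text %s with bytes %s, assuming Python 3.5+.

# On str, %s calls str() on the argument, so any object is accepted.
text_result = 'status: %s' % (None,)

# On bytes, %s is an alias of %b and accepts only bytes-like input.
try:
    bytes_result = b'status: %s' % (None,)
except TypeError:
    # None has neither the buffer protocol nor a __bytes__ method.
    bytes_result = None

print(text_result, bytes_result)
```

The same format code therefore succeeds on a str template and raises on a bytes template, which is the source of the confusion discussed here.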
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Changed to:

--
``%b`` will insert a series of bytes.  These bytes are collected in one of two ways:

- input type supports ``Py_buffer`` [4]_?  use it to collect the necessary bytes

- input type is something else?  use its ``__bytes__`` method [5]_ ; if there isn't one, raise a ``TypeError``

In particular, ``%b`` will not accept numbers nor ``str``.  ``str`` is rejected as the string to bytes conversion requires an encoding, and we are refusing to guess; numbers are rejected because:

- what makes a number is fuzzy (float? Decimal? Fraction? some user type?)

- allowing numbers would lead to ambiguity between numbers and textual representations of numbers (3.14 vs '3.14')

- given the nature of wire formats, explicit is definitely better than implicit

``%s`` is included as a synonym for ``%b`` for the sole purpose of making 2/3 code bases easier to maintain.  Python 3 only code should use ``%b``.

Examples::

    >>> b'%b' % b'abc'
    b'abc'

    >>> b'%b' % 'some string'.encode('utf8')
    b'some string'

    >>> b'%b' % 3.14
    Traceback (most recent call last):
    ...
    TypeError: b'%b' does not accept 'float'

    >>> b'%b' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: b'%b' does not accept 'str'
--

-- ~Ethan~
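The two collection paths and the two rejections in the revised text can be exercised with a short script. This is a sketch assuming Python 3.5 or later; the `Packet` class and `try_format` helper are hypothetical names introduced only for the demonstration, and the error message printed by a released interpreter may differ from the draft wording in the examples above.

```python
# Sketch of %b semantics, assuming Python 3.5+ (where PEP 461 landed).

class Packet:
    """Hypothetical type that opts in to %b via __bytes__."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b'<' + self.payload + b'>'


def try_format(value):
    """Return the formatted bytes, or the exception type name on failure."""
    try:
        return b'%b' % (value,)
    except TypeError as exc:
        return type(exc).__name__


from_bytes = try_format(b'abc')              # bytes: buffer protocol
from_view = try_format(memoryview(b'abc'))   # memoryview: buffer protocol
from_dunder = try_format(Packet(b'xyz'))     # falls back to __bytes__
from_float = try_format(3.14)                # numbers are rejected
from_str = try_format('hello world!')        # str is rejected: no implicit encoding

print(from_bytes, from_view, from_dunder, from_float, from_str)
```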
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-)

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote:
> Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-)

But if we only add %b and leave out %s then how is this going to lead to Python 2/3 compatible code since %b is not in Python 2? Or am I misunderstanding you?

-Brett
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
I feel not including %s is nuts. Should I write .replace('%b', '%s')? All I desperately need are APIs that provide enough unicode / str type safety that I get an exception when mixing them accidentally... in my own code, dynamic typing is usually a bug. As has been endlessly discussed, %s for bytes is a bit like exposing sprintf()...
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
So what's the use case for Python 2/3 compatible code? IMO the main use case for the PEP is simply to be able to construct bytes from a combination of a template and some input that may include further bytes and numbers. E.g. in asyncio when you write an HTTP client or server you have to construct bytes to write to the socket, and I'd be happy if I could write b'HTTP/1.0 %d %b\r\n' % (status, message) rather than having to use str(status).encode('ascii') and concatenation or join().

--Guido van Rossum (python.org/~guido)
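The HTTP status-line case from this message can be written out both ways for comparison. This is a sketch assuming Python 3.5 or later; the function names are hypothetical and are not taken from asyncio.

```python
# Sketch of the HTTP status-line use case, assuming Python 3.5+.

def status_line(status, message):
    """Hypothetical helper: build the line with bytes %-formatting."""
    return b'HTTP/1.0 %d %b\r\n' % (status, message)


def status_line_manual(status, message):
    """The same line built with explicit encoding and join()."""
    parts = [b'HTTP/1.0', str(status).encode('ascii'), message]
    return b' '.join(parts) + b'\r\n'


formatted = status_line(200, b'OK')
manual = status_line_manual(200, b'OK')
print(formatted)
```

Both functions produce identical bytes; the template version keeps the wire format visible in one place.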
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 11:24 AM, Guido van Rossum wrote:
> On Thu, Mar 27, 2014 at 10:55 AM, Ethan Furman wrote:
>> The biggest reason to use %s is to support a common code base for 2/3 endeavors.
>
> But it's mostly useless for that purpose. In Python 2, in practice %s doesn't mean string. [...]

In Python 2 if one is using 'str' as a 'bytes' container, and doing interpolation, %s is the only choice available for other 'bytes' (aka other 'str's). Note that I'm happy to be proven wrong on this point. :)

-- ~Ethan~
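A single-source example of the pattern described here, where %s interpolates bytes into a bytes template, might look like the sketch below. It assumes Python 2.7 or Python 3.5 and later; the `abort_message` helper and the sample file name are invented for illustration.

```python
# Sketch: one source line intended to run on Python 2.7 and on Python 3.5+.
# On Python 2 the b'' prefix is accepted and denotes the ordinary str type.

def abort_message(filename):
    """Hypothetical helper; filename must already be bytes."""
    return b'abort: %s is broken\n' % (filename,)


# A file name containing a byte that is not valid UTF-8.
msg = abort_message(b'data/\xff.bin')
print(repr(msg))
```

On Python 3 passing a str instead of bytes raises TypeError, whereas on Python 2 passing a unicode object silently produces a unicode result, so the two interpreters differ in how strictly the mistake is reported.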
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 11:52 AM, Daniel Holth dho...@gmail.com wrote:
> I feel not including %s is nuts. Should I write .replace('%b', '%s')?

I assume you meant .replace('%s', '%b') (unless you're converting Python 3 code to Python 2, which would mean you really are nuts :-). But that's not going to help for the majority of code using %s -- as I am trying to argue, %s doesn't mean "expect the argument to be a str" and neither is that how it's commonly used (although it's *possible* that that is how *you* use it exclusively -- that doesn't make you nuts, just more strict than most people).

> All I desperately need are APIs that provide enough unicode / str type safety that I get an exception when mixing them accidentally... in my own code, dynamic typing is usually a bug. As has been endlessly discussed, %s for bytes is a bit like exposing sprintf()...

I don't understand that last claim (I can't figure out whether in this context exposing sprintf() is considered good or bad). But apart from that, can you give some specific examples?

PS. I am not trying to be difficult. I honestly don't understand the use case yet, and the PEP doesn't do much to support it.

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 11:34 AM, Ethan Furman et...@stoneleaf.us wrote:
> In Python 2 if one is using 'str' as a 'bytes' container, and doing interpolation, %s is the only choice available for other 'bytes' (aka other 'str's). Note that I'm happy to be proven wrong on this point. :)

That is true. And we can't change Python 2. I still have this idea in my head that *most* cases where %s is used in Python 2 will break in Python 3 under the PEP's rules, but perhaps they are not the majority of situations where the context is manipulating bytes. And I suppose that *very* few internet protocols are designed to accept either an integer or the literal string "None", so that use case (which I brought up) isn't very realistic -- in fact it may be better to raise an exception rather than sending a protocol violation.

So, I think you have changed my mind. I still like the idea of promoting %b in pure Python 3 code to emphasize that it really behaves very differently from %s; but I now have peace with %s as an alias. (It might also benefit cases where somehow there's a symmetry in some Python 3 code between bytes and str.)

--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 3/27/2014 11:59 AM, Guido van Rossum wrote:
> On Thu, Mar 27, 2014 at 11:52 AM, Daniel Holth dho...@gmail.com wrote:
>> I feel not including %s is nuts. Should I write .replace('%b', '%s')?
>
> I assume you meant .replace('%s', '%b') (unless you're converting Python 3 code to Python 2, which would mean you really are nuts :-). But that's not going to help for the majority of code using %s -- as I am trying to argue, %s doesn't mean "expect the argument to be a str" and neither is that how it's commonly used (although it's *possible* that that is how *you* use it exclusively -- that doesn't make you nuts, just more strict than most people).

That _is_ how it is commonly used in Py2 when dealing with binary data in mixed ASCII/binary protocols, is what I've been hearing in this discussion, and what small use I've made of Py2 when some unported module forced me to use it (I started Python about the time Py3 was released)... the expected argument is a (Py2) str containing binary data (would be bytes in Py3).

While there are many other reasons to use %s in other coding situations, this is the only way to do bytes interpolations using %. And there is no %b in Py2, so for Py2/3 compatibility, %s needs to do bytes interpolations in Py3. And if it does, there is no need for %b in Py3 %, because they would be identical and redundant.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, Mar 27, 2014 at 2:53 PM, Guido van Rossum gu...@python.org wrote:
> So what's the use case for Python 2/3 compatible code? IMO the main use case for the PEP is simply to be able to construct bytes from a combination of a template and some input that may include further bytes and numbers. [...]

It seems to be notoriously difficult to understand or explain why Unicode can still be very hard in Python 3 or in code that is in the middle of being ported or has to run in both interpreters. As far as I can tell part of it is when a symbol has type(str or bytes) depending (declared as if we had a static type system with union types); some of it is because incorrect mixing can happen without an exception, only to be discovered later and far away in space and time from the error (worst of all in a serialized file), and part of it is all of the not easily checkable types a particular Unicode object has depending on whether it contains surrogates or codes n. Sometimes you might simply disagree about whether an API should be returning bytes or Unicode in mildly ambiguous cases like base64 encoding. Sometimes Unicode is just intrinsically complicated.

For me this PEP holds the promise of being able to do work in the bytes domain, with no accidental mixing ever, when I *really* want bytes. For 2+3 I would get exceptions sometimes in Python 2 and exceptions all the time in Python 3 for mistakes. I hope this is less error prone in strict domains than for example u"string processing".encode('latin1'). And I hope that there is very little type(str or int) in HTTP for example or other legitimate bytes domains but I don't know; I suspect that if you have a lot of problems with bytes' %s then it's a clue you should use (u"%s" % (argument)).encode() instead.

sprintf()'s version of %s just takes a char* and puts it in without doing any type conversion of course. IANACL (I am not a C lawyer).
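The alternative suggested at the end of this message, formatting as text and encoding once at the boundary, can be sketched as follows. The `render_header` helper, its parameters, and the choice of latin-1 as the default encoding are assumptions made for the example, not something the thread specifies.

```python
# Sketch: format in the text domain, then encode once at the boundary.

def render_header(name, value, encoding='latin-1'):
    """Hypothetical helper: value may be a str, an int, or anything with __str__."""
    return (u'%s: %s\r\n' % (name, value)).encode(encoding)


length_header = render_header('Content-Length', 42)
type_header = render_header('Content-Type', 'text/plain')
print(length_header, type_header)
```

Here text %s performs the usual str() conversion, so mixed argument types are fine; the trade-off is that every argument must be representable in the chosen encoding.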
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 11:53 AM, Guido van Rossum wrote: So what's the use case for Python 2/3 compatible code? IMO the main use case for the PEP is simply to be able to construct bytes from a combination of a template and some input that may include further bytes and numbers. E.g. in asyncio when you write an HTTP client or server you have to construct bytes to write to the socket, and I'd be happy if I could write b'HTTP/1.0 %d %b\r\n' % (status, message) rather than having to use str(status).encode('ascii') and concatenation or join(). My own dbf module [1] would make use of this feature, and I'm sure some of the pdf modules would as well (I recall somebody chiming in about their own pdf module). -- ~Ethan~ [1] https://pypi.python.org/pypi/dbf
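As a rough sketch of the HTTP case quoted above: this assumes an interpreter where %-formatting on bytes is available with both ``%d`` and ``%b`` (Python 3.5 or later). The two helper names are made up for illustration and do not come from asyncio or any other library.

```python
def status_line_concat(status, message):
    # Without bytes %-formatting: encode the number by hand, then concatenate.
    return b'HTTP/1.0 ' + str(status).encode('ascii') + b' ' + message + b'\r\n'


def status_line_percent(status, message):
    # With bytes %-formatting: %d writes the int as ASCII digits,
    # %b inserts the bytes argument unchanged.
    return b'HTTP/1.0 %d %b\r\n' % (status, message)


line = status_line_percent(200, b'OK')
print(line)
```

Both helpers produce the same bytes; the second one keeps the shape of the status line visible in the template.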
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 11:59 AM, Guido van Rossum wrote:

PS. I am not trying to be difficult. I honestly don't understand the use case yet, and the PEP doesn't do much to support it.

How's this?

Compatibility with Python 2
===========================

As noted above, ``%s`` is being included solely to help ease migration from, and/or have a single code base with, Python 2. This is important as there are modules both in the wild and behind closed doors that currently use the Python 2 ``str`` type as a ``bytes`` container, and hence are using ``%s`` as a bytes interpolator.

However, ``%b`` should be used in new, Python 3 only code, so ``%s`` will immediately be deprecated, but not removed until the next major Python release.

-- ~Ethan~
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
I love it!

On Thu, Mar 27, 2014 at 12:11 PM, Ethan Furman et...@stoneleaf.us wrote:

On 03/27/2014 11:59 AM, Guido van Rossum wrote:

PS. I am not trying to be difficult. I honestly don't understand the use case yet, and the PEP doesn't do much to support it.

How's this?

Compatibility with Python 2
===========================

As noted above, ``%s`` is being included solely to help ease migration from, and/or have a single code base with, Python 2. This is important as there are modules both in the wild and behind closed doors that currently use the Python 2 ``str`` type as a ``bytes`` container, and hence are using ``%s`` as a bytes interpolator.

However, ``%b`` should be used in new, Python 3 only code, so ``%s`` will immediately be deprecated, but not removed until the next major Python release.

-- ~Ethan~

-- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 11:41 AM, Guido van Rossum wrote: Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-) FWIW, I feel the same, but the need for compatible 2/3 code bases is real. Hey, how's this? We'll let %s in, but immediately deprecate it. ;) Of course, we won't remove it until Python IV. -- ~Ethan~
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, 27 Mar 2014 18:47:59 + Brett Cannon bcan...@gmail.com wrote: On Thu Mar 27 2014 at 2:42:40 PM, Guido van Rossum gu...@python.org wrote: Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-) But if we only add %b and leave out %s then how is this going to lead to Python 2/3 compatible code since %b is not in Python 2? Or am I misunderstanding you? I think we have reached a point where adding porting-related facilities in 3.5 may actually slow down the pace of porting, rather than accelerate it (because people will then wait for 3.5 to start porting stuff). Regards Antoine.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Thu, 27 Mar 2014 11:57:35 -0700 Ethan Furman et...@stoneleaf.us wrote: On 03/27/2014 11:41 AM, Guido van Rossum wrote: Much better, but I'm still not happy with including %s at all. Otherwise it's accept-worthy. (How's that for pressure. :-) FWIW, I feel the same, but the need for compatible 2/3 code bases is real. Hey, how's this? We'll let %s in, but immediately deprecate it. ;) Of course, we won't remove it until Python IV. I vote for an environment variable-controlled feature activation (or with a registry key under Windows) ;) Regards Antoine.
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
R. David Murray wrote: I've done the 'landmark' thing as well, in the string context; that can be very useful when doing incremental test driven development. (Granted, you could do that with __bytes__; Can't you do it more easily just by wrapping ascii() around the argument? That seems sufficient for debugging purposes to me. -- Greg
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/27/2014 03:10 PM, Greg Ewing wrote: R. David Murray wrote: I've done the 'landmark' thing as well, in the string context; that can be very useful when doing incremental test driven development. (Granted, you could do that with __bytes__; Can't you do it more easily just by wrapping ascii() around the argument? That seems sufficient for debugging purposes to me. The problem there is ascii() still returns unicode (okay, okay, str), so you still have to encode it. -- ~Ethan~
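To make the point about ascii() concrete, here is a small sketch. It assumes Python 3.5 or later, where ``%a`` is accepted in a bytes template; the ``Landmark`` class is purely hypothetical and only exists to give repr() something non-ASCII to escape.

```python
class Landmark:
    # Hypothetical debugging marker whose repr contains a non-ASCII character.
    def __repr__(self):
        return '<Landmark caf\u00e9>'


obj = Landmark()

text = ascii(obj)                # ascii() hands back a str, not bytes
manual = text.encode('ascii')    # so it still has to be encoded by hand
direct = b'%a' % (obj,)          # %a does the repr and the encoding in one step

print(type(text).__name__)
print(manual)
print(direct)
```

The wrapped-ascii() route works, but every call site needs the extra ``.encode('ascii')``.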
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
2014-03-25 23:37 GMT+01:00 Ethan Furman et...@stoneleaf.us:

``%a`` will call ``ascii()`` on the interpolated value.

I'm not sure that I understood correctly: is the "%a" format supported? The result of ascii() is a Unicode string. Does it mean that ("%a" % obj) should give the same result as ascii(obj).encode('ascii', 'strict')?

Would it be possible to add a table or list to summarize supported format characters? I found:

- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, etc. (can you please complete "etc."?)
- bytes and __bytes__ method: %s
- ascii(): %a

I guess that the implementation of %a can avoid a conversion from ASCII (PyUnicode_DecodeASCII in the following code) and then a conversion to ASCII again (in bytes%args):

    PyObject *
    PyObject_ASCII(PyObject *v)
    {
        PyObject *repr, *ascii, *res;

        repr = PyObject_Repr(v);
        if (repr == NULL)
            return NULL;

        if (PyUnicode_IS_ASCII(repr))
            return repr;

        /* repr is guaranteed to be a PyUnicode object by PyObject_Repr */
        ascii = _PyUnicode_AsASCIIString(repr, "backslashreplace");
        Py_DECREF(repr);
        if (ascii == NULL)
            return NULL;

        res = PyUnicode_DecodeASCII(    /* HERE */
            PyBytes_AS_STRING(ascii),
            PyBytes_GET_SIZE(ascii),
            NULL);

        Py_DECREF(ascii);
        return res;
    }

This is intended as a debugging aid, rather than something that should be used in production.

I don't understand the purpose of this sentence. Does it mean that %a must not be used? IMO this sentence can be removed.

Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn`` representation.

Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'

Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8].

I understand the debug use case. I'm not convinced by the serialization idea :-)

.. note:: If a ``str`` is passed into ``%a``, it will be surrounded by quotes.

And:

- bytes gets a "b" prefix and is surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains quotes ' and "

Can you also please add examples for %a?

    >>> b"%a" % 123
    b'123'
    >>> b"%s" % ascii(b"bytes")
    b"b'bytes'"
    >>> b"%s" % "text"  # hum, it's not easy to see surrounding quotes with this example
    b"'text'"

The following more complex examples are maybe not needed:

    >>> b"%a" % "euro:€"
    b"'euro:\\u20ac'"
    >>> b"%a" % """quotes >'"<"""
    b'\'quotes >\\\'"<\''

Proposed variations
===================

It would be fair to mention also a whole different PEP, Antoine's PEP 460!

Victor
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/26/2014 03:10 AM, Victor Stinner wrote:

2014-03-25 23:37 GMT+01:00 Ethan Furman:

``%a`` will call ``ascii()`` on the interpolated value.

I'm not sure that I understood correctly: is the "%a" format supported? The result of ascii() is a Unicode string. Does it mean that ("%a" % obj) should give the same result as ascii(obj).encode('ascii', 'strict')?

Changed to:

---
``%a`` will give the equivalent of ``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated value. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or any situation where defining ``__bytes__`` would not be appropriate but a readable/informative representation is needed [8].
---

Would it be possible to add a table or list to summarize supported format characters? I found:

- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, etc. (can you please complete "etc."?)
- bytes and __bytes__ method: %s
- ascii(): %a

Changed to:

---
%-interpolation
---------------

All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``, ``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added to Python 3) will be supported, and will work as they do for str, including the padding, justification and other related modifiers (currently ``#``, ``0``, ``-``, `` `` (space), and ``+`` (plus any added to Python 3)). The only non-numeric codes allowed are ``c``, ``s``, and ``a``.

For the numeric codes, the only difference between ``str`` and ``bytes`` (or ``bytearray``) interpolation is that the results from these codes will be ASCII-encoded text, not unicode. In other words, for any numeric formatting code `%x`::
---

I don't understand the purpose of this sentence. Does it mean that %a must not be used? IMO this sentence can be removed.

The sentence about %a being for debugging has been removed.

Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn`` representation.

Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'

Removed. With the explicit reference to the 'backslashreplace' error handler, anyone who wants to know what it might look like can refer to that.

.. note:: If a ``str`` is passed into ``%a``, it will be surrounded by quotes.

And:

- bytes gets a "b" prefix and is surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains quotes ' and "

Shouldn't be an issue now with the new definition, which no longer references the ascii() function.

Can you also please add examples for %a?

---
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b'abc'

    >>> b'%a' % 'def'
    b"'def'"
---

Proposed variations
===================

It would be fair to mention also a whole different PEP, Antoine's PEP 460!

My apologies for the omission.

---
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9], also exists.

.. [9] http://python.org/dev/peps/pep-0460/
---

Thank you, Victor.
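The rule for the numeric codes in the revised text can be checked mechanically. The following is only a sketch, assuming Python 3.5 or later; the list of templates is an arbitrary sample chosen for illustration, not an exhaustive table of what the PEP supports.

```python
# (template, value) pairs covering a few codes and modifiers; sample only.
SAMPLES = [
    ('%4x', 10),
    ('%#4x', 10),
    ('%04X', 10),
    ('%-6d|', 42),
    ('%o', 8),
    ('%+.2e', 1234.5),
    ('%08.3f', 3.14159),
    ('%G', 1e-10),
]

for template, value in SAMPLES:
    as_bytes = template.encode('ascii') % value       # bytes interpolation
    as_text = (template % value).encode('ascii')      # str interpolation, then encode
    assert as_bytes == as_text, template
    print(template, as_bytes)
```

Every pair passes the comparison, which is the "ASCII-encoded text, not unicode" statement in executable form.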
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On Tue, Mar 25, 2014 at 11:37 PM, Ethan Furman et...@stoneleaf.us wrote: In particular, ``%s`` will not accept numbers (use a numeric format code for that), nor ``str`` (encode it to ``bytes``). I don't understand this restriction, and there isn't a rationale for it in the PEP (other than you can already use numeric formats, which doesn't explain why it's undesirable to have it anyway.) It is extremely common in existing 2.x code to use %s for anything, just like people use {} for anything with str.format. Not supporting this feels like it would be problematic for porting code. Did this come up in the earlier discussions? -- Thomas Wouters tho...@python.org Hi! I'm an email virus! Think twice before sending your email to help me spread!
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/26/2014 08:14 AM, Thomas Wouters wrote: On Tue, Mar 25, 2014 at 11:37 PM, Ethan Furman wrote: In particular, ``%s`` will not accept numbers (use a numeric format code for that), nor ``str`` (encode it to ``bytes``). I don't understand this restriction, and there isn't a rationale for it in the PEP (other than you can already use numeric formats, which doesn't explain why it's undesirable to have it anyway.) It is extremely common in existing 2.x code to use %s for anything And that's the problem -- in 2.x %s works always, but 3.x for bytes and bytearray %s will fail in numerous situations. It seems to me the main reason for using %s instead of %d is that 'some_var' may have a number, or it may have the textual representation of that number; in 3.x the first would succeed, the second would fail. That's the kind of intermittent failure we do not want. The PEP is not designed to make it so 2.x code can be ported as-is, but rather that 2.x code can be cleaned up (if necessary) and then run the same in both 2.x and 3.x (at least as far as byte and bytearray %-formatting is concerned). Did this come up in the earlier discussions? https://mail.python.org/pipermail/python-dev/2014-January/131576.html -- ~Ethan~
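A short sketch of the restriction being discussed, assuming Python 3.5 or later. ``try_format`` is a made-up helper for this example; it reports only the exception type because the exact error text is an implementation detail.

```python
def try_format(value):
    # Return the formatted bytes, or the name of the exception that was raised.
    try:
        return b'%s' % (value,)
    except TypeError as exc:
        return type(exc).__name__


print(try_format(b'42'))                  # bytes go straight in
print(try_format(42))                     # an int is rejected by %s
print(try_format('42'))                   # a str is rejected by %s
print(b'%d' % 42)                         # numbers use a numeric code instead
print(b'%s' % '42'.encode('ascii'))       # text is encoded explicitly first
```

Because ``%s`` refuses both the number and the str every time, the failure does not depend on what a variable happens to hold on a given run.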
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
2014-03-26 15:35 GMT+01:00 Ethan Furman et...@stoneleaf.us:

---
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b'abc'

This one is wrong:

    >>> repr(b'abc').encode('ascii', 'backslashreplace')
    b"b'abc'"

Victor
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
On 03/26/2014 02:41 PM, Victor Stinner wrote:

2014-03-26 15:35 GMT+01:00 Ethan Furman et...@stoneleaf.us:

---
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b'abc'

This one is wrong:

    >>> repr(b'abc').encode('ascii', 'backslashreplace')
    b"b'abc'"

Fixed, thanks.

-- ~Ethan~
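For reference, the corrected example can be reproduced as follows. This is a sketch assuming Python 3.5 or later; it contrasts ``%a`` with ``%s`` for the same bytes argument.

```python
data = b'abc'

via_repr = repr(data).encode('ascii', 'backslashreplace')
via_a = b'%a' % data      # repr-style: keeps the b prefix and the quotes
via_s = b'%s' % data      # raw bytes, inserted unchanged

print(via_repr)
print(via_a)
print(via_s)
```

So ``%a`` on a bytes value yields the bytes of its repr, prefix and quotes included, while ``%s`` yields the value itself.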
[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Okay, I included that last round of comments (from late February). Barring typos, this should be the final version. Final comments?

PEP: 461
Title: Adding % formatting to bytes and bytearray
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman et...@stoneleaf.us
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-01-13
Python-Version: 3.5
Post-History: 2014-01-14, 2014-01-15, 2014-01-17, 2014-02-22, 2014-03-25
Resolution:


Abstract
========

This PEP proposes adding % formatting operations similar to Python 2's ``str`` type to ``bytes`` and ``bytearray`` [1]_ [2]_.


Rationale
=========

While interpolation is usually thought of as a string operation, there are cases where interpolation on ``bytes`` or ``bytearrays`` makes sense, and the work needed to make up for this missing functionality detracts from the overall readability of the code.


Motivation
==========

With Python 3 and the split between ``str`` and ``bytes``, one small but important area of programming became slightly more difficult, and much more painful -- wire format protocols [3]_.

This area of programming is characterized by a mixture of binary data and ASCII compatible segments of text (aka ASCII-encoded text). Bringing back a restricted %-interpolation for ``bytes`` and ``bytearray`` will aid both in writing new wire format code, and in porting Python 2 wire format code.

Common use-cases include ``dbf`` and ``pdf`` file formats, ``email`` formats, and ``FTP`` and ``HTTP`` communications, among many others.


Proposed semantics for ``bytes`` and ``bytearray`` formatting
=============================================================

%-interpolation
---------------

All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``, ``%g``, etc.) will be supported, and will work as they do for str, including the padding, justification and other related modifiers. The only difference will be that the results from these codes will be ASCII-encoded text, not unicode. In other words, for any numeric formatting code `%x`::

    b"%x" % val

is equivalent to::

    ("%x" % val).encode("ascii")

Examples::

    >>> b'%4x' % 10
    b'   a'

    >>> b'%#4x' % 10
    b' 0xa'

    >>> b'%04X' % 10
    b'000A'

``%c`` will insert a single byte, either from an ``int`` in range(256), or from a ``bytes`` argument of length 1, not from a ``str``.

Examples::

    >>> b'%c' % 48
    b'0'

    >>> b'%c' % b'a'
    b'a'

``%s`` is included for two reasons: 1) `b` is already a format code for ``format`` numerics (binary), and 2) it will make 2/3 code easier as Python 2.x code uses ``%s``; however, it is restricted in what it will accept::

  - input type supports ``Py_buffer`` [6]_?
    use it to collect the necessary bytes

  - input type is something else?
    use its ``__bytes__`` method [7]_ ; if there isn't one, raise a ``TypeError``

In particular, ``%s`` will not accept numbers (use a numeric format code for that), nor ``str`` (encode it to ``bytes``).

Examples::

    >>> b'%s' % b'abc'
    b'abc'

    >>> b'%s' % 'some string'.encode('utf8')
    b'some string'

    >>> b'%s' % 3.14
    Traceback (most recent call last):
    ...
    TypeError: b'%s' does not accept numbers, use a numeric code instead

    >>> b'%s' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: b'%s' does not accept 'str', it must be encoded to `bytes`

``%a`` will call ``ascii()`` on the interpolated value. This is intended as a debugging aid, rather than something that should be used in production. Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn`` representation. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or even a rudimentary serialization format when defining ``__bytes__`` would not be appropriate [8]_.

.. note::

    If a ``str`` is passed into ``%a``, it will be surrounded by quotes.


Unsupported codes
-----------------

``%r`` (which calls ``__repr__`` and returns a ``str``) is not supported.
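As an illustration of the ``%c`` and ``%s`` rules described above, here is a minimal sketch assuming Python 3.5 or later. The ``Record`` class is hypothetical and stands in for any type that defines ``__bytes__``; ``memoryview`` is used as one example of an object supporting the buffer interface.

```python
class Record:
    # Hypothetical type that knows its own wire representation.
    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b'<' + self.payload + b'>'


print(b'%c' % 48)                          # int in range(256): one byte
print(b'%c' % b'a')                        # bytes of length 1: that byte
print(b'%s' % (memoryview(b'xyz'),))       # buffer-supporting object
print(b'%s' % (Record(b'abc'),))           # other types go through __bytes__

try:
    b'%c' % 'a'                            # a str is not accepted by %c
except TypeError as exc:
    print('TypeError:', exc)
```

The arguments are wrapped in one-element tuples so that each is treated as a single value to interpolate.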
Proposed variations
===================

It was suggested to let ``%s`` accept numbers, but since numbers have their own format codes this idea was discarded.

It has been suggested to use ``%b`` for bytes as well as ``%s``. This was rejected as not adding any value either in clarity or simplicity.

It has been proposed to automatically use ``.encode('ascii','strict')`` for ``str`` arguments to ``%s``.

  - Rejected as this would lead to intermittent failures. Better to have the operation always fail so the trouble-spot can be correctly fixed.

It has been proposed to have ``%s`` return the ascii-encoded repr when the value is a ``str`` (b'%s' % 'abc' --> b"'abc'").

  - Rejected as this would lead to hard to debug failures far from the problem site. Better to
Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
I love it. On Tue, Mar 25, 2014 at 6:37 PM, Ethan Furman et...@stoneleaf.us wrote: Okay, I included that last round of comments (from late February). Barring typos, this should be the final version. Final comments?