Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).

+1

-eric
___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
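[Archive note] A minimal sketch of what the "exact types only" proposal might look like as a plain function, assuming Python 3. The name format_bytes and its error messages are hypothetical and are not part of any PEP or of the standard library; string.Formatter().parse() is the real stdlib parser for str.format() templates and is used here only to split the template into fields.

```python
import string


def format_bytes(fmt, *args):
    """Hypothetical stand-in for a restricted bytes.format().

    Accepts only arguments whose exact type is int, float, or bytes.
    Only positional fields ("{}" or "{0}") are handled in this sketch.
    """
    pieces = []
    auto_index = 0
    # The template must be ASCII; decode() raises UnicodeDecodeError otherwise.
    for literal, field, spec, conversion in string.Formatter().parse(fmt.decode("ascii")):
        pieces.append(literal.encode("ascii"))
        if field is None:
            continue  # trailing literal text with no replacement field
        if conversion is not None:
            raise ValueError("conversions such as !r are not supported here")
        if field == "":
            index = auto_index
            auto_index += 1
        else:
            index = int(field)
        arg = args[index]
        if type(arg) is bytes:
            if spec:
                raise ValueError("format specs are not supported for bytes")
            pieces.append(arg)  # raw bytes are interpolated untouched
        elif type(arg) in (int, float):
            # Call the type's own __format__, then encode the result to ASCII.
            pieces.append(format(arg, spec).encode("ascii"))
        else:
            raise TypeError("unsupported type: " + type(arg).__name__)
    return b"".join(pieces)


line = format_bytes(b"Content-Length: {}\r\n", 47)
```

Because the check is on the exact type, subclasses such as bool are rejected, which is the "no subclasses" restriction discussed later in the thread.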
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 3:06 PM, Jan Kaliszewski z...@chopin.edu.pl wrote: I'd treat the format()+.__format__()+str.format()-ecosystem as a nice text-data-oriented, *complete* Py3k feature, backported to Python 2 to share the benefits of the feature with it as well as to make the 2-to-3 transition a bit easier. IMHO, the PEP-3101's note cited above just describes a workaround over the flaws of the Py2's obsolete text model. Moving such complications into Py3k would make the feature (and especially the ability to implement your own .__format__()) harder to understand and make use of -- for little profit. Such a move is not needed for compatibility. And, IMHO, the format()/__format__()/str.format()-matter is all about nice and flexible *text* formatting, not about binary data interpolation. [disclaimer: I personally don't have many use cases for any bytes formatting.] Yet there is still a strong symmetry between str and bytes that makes bytes easier to use. I don't always use formatting, but when I do I use .format(). :) never-been-a-fan-of-mod-formatting-ly yours, -eric
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>
> +1

Please don't make me learn the limitations of a new mini language without a really good reason.

For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?

Cheers,
Nick.

> -eric
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/17/2014 6:42 AM, Nick Coghlan wrote:
> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>
>> +1
>
> Please don't make me learn the limitations of a new mini language without a really good reason.
>
> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?

The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.

Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.

IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.

Eric.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 07:34 AM, Eric V. Smith wrote:
> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>>
>>> +1
>>
>> Please don't make me learn the limitations of a new mini language without a really good reason.
>>
>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?
>
> The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.
>
> Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.
>
> IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.

Following up, I think this leaves us with 3 choices:

1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.

2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.

3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.

I'd advocate 1 or 2.

Eric.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 17/01/2014 14:50, Eric V. Smith wrote:
> On 01/17/2014 07:34 AM, Eric V. Smith wrote:
>> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>>>
>>>> +1
>>>
>>> Please don't make me learn the limitations of a new mini language without a really good reason.
>>>
>>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?
>>
>> The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.
>>
>> Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.
>>
>> IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.
>
> Following up, I think this leaves us with 3 choices:
>
> 1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.
>
> 2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.
>
> 3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.
>
> I'd advocate 1 or 2.
>
> Eric.

For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated, they now have to change the code back to the way they may well have written it in the first place?

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 10:15 AM, Mark Lawrence wrote:
> On 17/01/2014 14:50, Eric V. Smith wrote:
>> On 01/17/2014 07:34 AM, Eric V. Smith wrote:
>>> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>>>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>>>>
>>>>> +1
>>>>
>>>> Please don't make me learn the limitations of a new mini language without a really good reason.
>>>>
>>>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?
>>>
>>> The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.
>>>
>>> Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.
>>>
>>> IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.
>>
>> Following up, I think this leaves us with 3 choices:
>>
>> 1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.
>>
>> 2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.
>>
>> 3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.
>>
>> I'd advocate 1 or 2.
>>
>> Eric.
>
> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated, they now have to change the code back to the way they may well have written it in the first place?

That would be part of it, yes. Otherwise you need #3.

This is all assuming we've ruled out an option 4, because of the exceptions raised depending on what __format__ does:

4. Add bytes.format(), have it convert the format specifier to str (unicode), call __format__ and encode the result back to ASCII. Accept that there will be data-driven exceptions depending on the result of the __format__ call.

I'm open to other ideas.

Eric.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 07:15 AM, Mark Lawrence wrote:
> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated

%f formatting is not deprecated, and will not be in 3.x's lifetime.

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 10:24 AM, Eric V. Smith wrote:
> On 01/17/2014 10:15 AM, Mark Lawrence wrote:
>> On 17/01/2014 14:50, Eric V. Smith wrote:
>>> On 01/17/2014 07:34 AM, Eric V. Smith wrote:
>>>> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>>>>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>>>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>>>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>>>>>
>>>>>> +1
>>>>>
>>>>> Please don't make me learn the limitations of a new mini language without a really good reason.
>>>>>
>>>>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?
>>>>
>>>> The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.
>>>>
>>>> Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.
>>>>
>>>> IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.
>>>
>>> Following up, I think this leaves us with 3 choices:
>>>
>>> 1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.
>>>
>>> 2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.
>>>
>>> 3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.

For #3, hopefully this additional work on the 3.x side would just be to add, to each class where you already have a custom __format__ used for b''.format(), code like:

    def __format_ascii__(self, fmt):
        return self.__format__(fmt.decode()).encode('ascii')

That is, we're pushing the possibility of having to deal with an encoding exception off to the type, instead of having it live in bytes.format().

And to agree with Ethan: %-formatting isn't deprecated.

Eric.
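[Archive note] A minimal sketch of the __format_ascii__ idea from the message above, assuming Python 3. Neither __format_ascii__ nor a format_ascii() builtin exists in Python; both are the hypothetical names floated in this thread, and the Temperature class is invented purely for illustration. The sketch decodes the spec as ASCII rather than with the default codec used in the message.

```python
class Temperature:
    """Illustrative type with a text __format__ and the proposed bytes hook."""

    def __init__(self, degrees, unit):
        self.degrees = degrees
        self.unit = unit

    def __format__(self, spec):
        return format(self.degrees, spec) + " " + self.unit

    def __format_ascii__(self, spec):
        # Hypothetical protocol: the spec arrives as bytes, the result is bytes.
        # Any encoding failure is raised here, inside the type.
        return self.__format__(spec.decode("ascii")).encode("ascii")


def format_ascii(obj, spec=b""):
    """Hypothetical stand-in for the proposed format_ascii() builtin."""
    method = getattr(type(obj), "__format_ascii__", None)
    if method is None:
        raise TypeError(type(obj).__name__ + " does not support __format_ascii__")
    result = method(obj, spec)
    if not isinstance(result, bytes):
        raise TypeError("__format_ascii__ must return bytes")
    return result


ok = format_ascii(Temperature(21.5, "C"), b".1f")

try:
    # The degree sign is not ASCII, so the type itself raises.
    format_ascii(Temperature(21.5, "\u00b0C"), b".1f")
    data_driven_error = None
except UnicodeEncodeError as exc:
    data_driven_error = exc
```

The second call illustrates the "data-driven exception" concern: whether the call succeeds depends on the value being formatted, not on the calling code.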
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 17/01/2014 15:41, Ethan Furman wrote:
> On 01/17/2014 07:15 AM, Mark Lawrence wrote:
>> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated
>
> %f formatting is not deprecated, and will not be in 3.x's lifetime.
>
> --
> ~Ethan~

I'm sorry, I got the above wrong, I should have said was to be deprecated :(

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Fri, Jan 17, 2014 at 9:50 AM, Eric V. Smith e...@trueblade.com wrote:
> On 01/17/2014 07:34 AM, Eric V. Smith wrote:
>> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>>> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).
>>>>
>>>> +1
>>>
>>> Please don't make me learn the limitations of a new mini language without a really good reason.
>>>
>>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__ restored roughly as described in PEP 461. *Given* that feature set, what is the rationale for *adding* bytes.format? What new capabilities will it provide that aren't already covered by printf-style interpolation directly to bytes or text formatting followed by encoding the result?
>>
>> The only reason to add any of this, in my mind, is to ease porting of 2.x code. If my proposal covers most of the cases of b''.format() that exist in 2.x code that wants to move to 3.5, then I think it's worth doing.
>>
>> Is there any such code that's blocked from porting by the lack of b''.format() that supports bytes, int, and float? I don't know. I concede that it's unlikely.
>>
>> IF this were a feature that we were going to add to 3.5 on its own merits, I think we add __format_ascii__ and make the whole thing extensible. Is there any new code that's blocked from being written by missing b.format()? I don't know that, either.
>
> Following up, I think this leaves us with 3 choices:
>
> 1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.

+1

I would rephrase it to "switch to %-formatting for bytes usage for their common code base". If they are working with actual text then using str.format() still works (and is actually nicer to use IMO). It actually might make the str/bytes relationship even clearer, especially if we start to promote that str.format() is for text and %-formatting is for bytes.

> 2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.

-1

I am still not comfortable with the special-casing by type for bytes.format().

> 3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.

+0

Would allow for easy porting and it's general enough, but I don't know if working with bytes really requires this much beyond supporting the porting story.

I'm still +1 on PEP 460 for bytes.format() as a nice way to simplify basic bytes usage in Python 3, but if that's not accepted then I say just drop bytes.format() entirely and let %-formatting be the way people do Python 2/3 bytes work (if they are not willing to build it up from scratch like they already can do).

-Brett
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 17 January 2014 15:50, Eric V. Smith e...@trueblade.com wrote:
> For #3, hopefully this additional work on the 3.x side would just be to add, to each class where you already have a custom __format__ used for b''.format(), code like:
>
>     def __format_ascii__(self, fmt):
>         return self.__format__(fmt.decode()).encode('ascii')

For me, the big cost would seem to be in the necessary documentation, explaining the new special method in the language reference, explaining the 2 different forms of format() in the built in types docs. And the conceptual overhead of another special method for people to be aware of. If I implement my own number subclass, do I need to implement __format_ascii__?

My gut feeling is that we simply don't implement format() for bytes. I don't see sufficient benefit, if %-formatting is available.

Paul.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:
> I would rephrase it to "switch to %-formatting for bytes usage for their common code base".

-1. %-formatting is so neanderthal. :)

-Barry
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 18 Jan 2014 02:08, Paul Moore p.f.mo...@gmail.com wrote:
> On 17 January 2014 15:50, Eric V. Smith e...@trueblade.com wrote:
>> For #3, hopefully this additional work on the 3.x side would just be to add, to each class where you already have a custom __format__ used for b''.format(), code like:
>>
>>     def __format_ascii__(self, fmt):
>>         return self.__format__(fmt.decode()).encode('ascii')
>
> For me, the big cost would seem to be in the necessary documentation, explaining the new special method in the language reference, explaining the 2 different forms of format() in the built in types docs. And the conceptual overhead of another special method for people to be aware of. If I implement my own number subclass, do I need to implement __format_ascii__?
>
> My gut feeling is that we simply don't implement format() for bytes. I don't see sufficient benefit, if %-formatting is available.

Exactly, it's the documentation problem to explain "when would I recommend using this over the alternatives?" that turns me off the idea of general purpose bytes formatting. printf style covers the use cases we have identified, and the code bases of immediate interest support 2.5 or earlier and thus *must* be using printf-style formatting.

Add to that the fact that to maintain the Python 3 text model, we either have to gut it to the point where it has very few of the benefits the text version offers over printf-style formatting, or else we introduce a whole new protocol for a feature that we consider so borderline that it took us six Python 3 releases to add it back to the language.

By contrast, the following model is relatively easy to document:

* printf-style is low level and relatively inflexible, but available for both text and for ASCII compatible segments in binary data. The %s formatting code accepts arbitrary objects (using str) in text mode, but only buffer exporters and objects with a __bytes__ method in binary mode.

* format() is high level and very flexible, but available only for text - the result must be explicitly encoded to binary if that is needed.

Cheers,
Nick.

> Paul.
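[Archive note] The two-tier model described in the message above can be illustrated with a short sketch, assuming Python 3.5 or later, where %-formatting on bytes is available as proposed in PEP 461. The Token class is invented for illustration only.

```python
class Token:
    """Illustrative object that opts in to binary interpolation via __bytes__."""

    def __init__(self, raw):
        self.raw = raw

    def __bytes__(self):
        return self.raw


# Tier 1: printf-style interpolation directly into bytes.
header = b"%s: %d\r\n" % (b"Content-Length", 47)
auth = b"Auth: %s" % Token(b"abc123")          # uses Token.__bytes__
view = b"payload=%s" % memoryview(b"xyz")      # buffer exporters are accepted

# In binary mode %s does not fall back to str(): text is rejected outright.
try:
    b"%s" % "text"
    text_rejected = False
except TypeError:
    text_rejected = True

# Tier 2: rich text formatting, followed by an explicit encode.
padded = "{:>8.2f}".format(3.14159).encode("ascii")
```

The explicit .encode() call in the second tier is the point where any encoding error would surface, which keeps the text and binary models separate.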
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/16/2014 11:47 PM, Steven D'Aprano wrote:
> On Thu, Jan 16, 2014 at 08:23:13AM -0800, Ethan Furman wrote:
>> As I understand it, str.format will call the object's __format__. So, for example, if I say:
>>
>>     u'the value is: %d' % myNum(17)
>>
>> then it will be myNum.__format__ that gets called, not int.__format__;
>
> I seem to have missed something, because I am completely confused... Why are you talking about str.format and then show an example using % instead?

Sorry, PEP 46x fatigue. :/

It should have been

    u'the value is {:d}'.format(myNum(17))

and yes I meant the str type.

> %d calls __str__, not __format__. This is in Python 3.3:
>
>     py> class MyNum(int):
>     ...     def __str__(self):
>     ...         print("Calling MyNum.__str__")
>     ...         return super().__str__()
>     ...     def __format__(self):
>     ...         print("Calling MyNum.__format__")
>     ...         return super().__format__()
>     ...
>     py> n = MyNum(17)
>     py> u"%d" % n
>     Calling MyNum.__str__
>     '17'

And that's a bug we fixed in 3.4:

    Python 3.4.0b1 (default:172a6bfdd91b+, Jan 5 2014, 06:39:32)
    [GCC 4.7.3] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    --> class myNum(int):
    ...     def __int__(self):
    ...         return 7
    ...     def __index__(self):
    ...         return 11
    ...     def __float__(self):
    ...         return 13.81727
    ...     def __str__(self):
    ...         print('__str__')
    ...         return '1'
    ...     def __repr__(self):
    ...         print('__repr__')
    ...         return '2'
    ...
    --> '%d' % myNum()
    '0'
    --> '%f' % myNum()
    '13.817270'

After all, consider:

    --> '%d' % True
    '1'
    --> '%s' % True
    'True'

So, in fact, on subclasses __str__ should *not* be called to get the integer representation. First we do a conversion to make sure we have an int (or float, or ...), and then we call __str__ on our tried and trusted genuine core type.

> The *worst* solution would be to completely ignore MyNum.__str__. That's a nasty violation of the Principle Of Least Surprise, and will lead to confusion (why isn't my class' __str__ method being called?)

Because you asked for a numeric representation, not a string representation [1].

--
~Ethan~

[1] for all the gory details, see:
    http://bugs.python.org/issue18780
    http://bugs.python.org/issue18738
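[Archive note] A small sketch of the distinction drawn in the exchange above, assuming Python 3.4 or later (that is, after the fix referenced in the linked issues). The class and its return strings are invented for illustration: %d asks for a numeric representation and bypasses the subclass's __str__, %s uses __str__, and str.format() dispatches to __format__.

```python
class MyNum(int):
    """Illustrative int subclass with customised text representations."""

    def __str__(self):
        return "custom str"

    def __format__(self, spec):
        return "custom format"


n = MyNum(17)

numeric = "%d" % n            # numeric code: uses the underlying int value
as_string = "%s" % n          # string code: calls MyNum.__str__
via_format = "{:d}".format(n) # str.format(): calls MyNum.__format__
```

This is why the thread treats %-interpolation and format() as different mechanisms rather than two spellings of the same protocol.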
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw ba...@python.org wrote:
> On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:
>> I would rephrase it to "switch to %-formatting for bytes usage for their common code base".
>
> -1. %-formatting is so neanderthal. :)

Very much so, which is why I'm willing to let it be bastardized in Python 3.5 for the sake of porting but not bytes.format(). =) I'm keeping format() clean for my nieces and nephew to use; they can just turn their nose up at %-formatting when they are old enough to program.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 11:58 AM, Brett Cannon wrote:
> On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw ba...@python.org wrote:
>> On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:
>>> I would rephrase it to "switch to %-formatting for bytes usage for their common code base".
>>
>> -1. %-formatting is so neanderthal. :)
>
> Very much so, which is why I'm willing to let it be bastardized in Python 3.5 for the sake of porting but not bytes.format(). =) I'm keeping format() clean for my nieces and nephew to use; they can just turn their nose up at %-formatting when they are old enough to program.

Given the problems with implementing it, I'm more than willing to drop bytes.format() from PEP 461 (not that it's my PEP). But if we think that %-formatting is neanderthal and will get dropped in the Python 4000 timeframe (that is, someday in the far future), then I think we should have some advice to give to people who are writing new 3.x code for the non-porting use-cases addressed by the PEP. I'm specifically thinking of new code that wants to format some bytes for an on-the-wire ascii-like protocol.

Is it:

    b'Content-Length: ' + str(47).encode('ascii')

or

    b'Content-Length: {}'.format(str(47).encode('ascii'))

or something better?

I think it will look like the above, or involve something like bytes.format() and __format_ascii__. Or, maybe a library that just supports a few types (say, bytes, int, and float!).

Eric.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 09:13 AM, Eric V. Smith wrote:
> On 01/17/2014 11:58 AM, Brett Cannon wrote:
>> On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw wrote:
>>> On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:
>>>> I would rephrase it to "switch to %-formatting for bytes usage for their common code base".
>>> -1. %-formatting is so neanderthal. :)
>> Very much so, which is why I'm willing to let it be bastardized in Python 3.5 for the sake of porting but not bytes.format(). =) I'm keeping format() clean for my nieces and nephew to use; they can just turn their nose up at %-formatting when they are old enough to program.
> Given the problems with implementing it, I'm more than willing to drop bytes.format() from PEP 461 (not that it's my PEP). But if we think that %-formatting is neanderthal and will get dropped in the Python 4000 timeframe

I hope not!

> (that is, someday in the far future), then I think we should have some advice to give to people who are writing new 3.x code for the non-porting use-cases addressed by the PEP. I'm specifically thinking of new code that wants to format some bytes for an on-the-wire ascii-like protocol.

%-interpolation handles this use case well, format does not.

> Is it:
>
>     b'Content-Length: ' + str(47).encode('ascii')
>
> or
>
>     b'Content-Length: {}'.format(str(47).encode('ascii'))
>
> or something better?

Ew. Neither of those look better than

    b'Content-Length: %d' % 47

-- ~Ethan~
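For comparison, the spellings discussed above can be tried side by side. This is only a minimal sketch, assuming Python 3.5 or later (where the %-interpolation proposed in PEP 461 is available on bytes); the variable names are illustrative, and the bytes.format() spelling is left out because bytes has no such method.

```python
length = 47

# Field-at-a-time encoding: convert each value to text, encode it, concatenate.
concatenated = b'Content-Length: ' + str(length).encode('ascii')

# %-interpolation directly on a bytes object (assumes Python 3.5+).
interpolated = b'Content-Length: %d' % length

print(concatenated)
print(interpolated)
```

Both spellings produce the same bytes; the difference is purely in how much ceremony the call site needs.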
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/17/2014 6:50 AM, Eric V. Smith wrote:
> Following up, I think this leaves us with 3 choices:
>
> 1. Do not implement bytes.format(). We tell any 2.x code that's written to use str.format() to switch to %-formatting for their common code base.
> 2. Add the simplistic version of bytes.format() that I describe above, restricted to accepting bytes, int, and float (and no subclasses). Some 2.x code will work, some will need to change to %-formatting.
> 3. Add bytes.format() and the __format_ascii__ protocol. We might want to also add a format_ascii() builtin, to match __format__ and format(). This would require the least change to 2.x code that uses str.format() and wants to move to bytes.format(), but would require some work on the 3.x side.
>
> I'd advocate 1 or 2.

Nice summary. I'd advocate 1 or 3.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/17/2014 7:15 AM, Mark Lawrence wrote:
> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated, they now have to change the code back to the way they may well have written it in the first place?

If they are committed to format(), another option is to operate in the Unicode domain, and encode at the end.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/17/2014 02:04 PM, Glenn Linderman wrote:
> On 1/17/2014 7:15 AM, Mark Lawrence wrote:
>> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated, they now have to change the code back to the way they may well have written it in the first place?
> If they are committed to format(), another option is to operate in the Unicode domain, and encode at the end.

Maybe that's the best advice to give. It's better than my earlier example of field-at-a-time encoding.

Eric.
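A minimal sketch of the "operate in the Unicode domain, and encode at the end" advice, assuming an ASCII-only wire format. The function name `build_headers` and the header fields are made up for illustration; the point is that str.format() keeps its full format-spec language and the encode happens exactly once.

```python
def build_headers(length, quality):
    """Format everything as text, then encode once at the boundary."""
    text = 'Content-Length: {:d}\r\nX-Quality: {:.2f}\r\n'.format(length, quality)
    # A strict ASCII encode raises UnicodeEncodeError if non-ASCII text slipped in.
    return text.encode('ascii')

print(build_headers(47, 0.5))
```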
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/17/2014 10:15 AM, Mark Lawrence wrote:
> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated,

It will not be for at least a decade.

> they now have to change the code back to the way they may well have written it in the first place?

I would suggest that people simply .encode the result if bytes are needed in 3.x as well as 2.x. Polyglot code will likely have a 'py3' boolean already to make the encoding conditional.

-- Terry Jan Reedy
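The conditional encode might look roughly like this in a 2.x/3.x common code base. It is a sketch only: `PY3` and `status_line` are hypothetical names, and it assumes the 2.x side is content with the byte string that str.format() already returns there.

```python
import sys

PY3 = sys.version_info[0] >= 3  # the 'py3' boolean such code bases tend to carry

def status_line(code, reason):
    line = 'HTTP/1.1 {0} {1}\r\n'.format(code, reason)
    # On 2.x a plain str is already a byte string; on 3.x encode explicitly.
    return line.encode('ascii') if PY3 else line

print(status_line(200, 'OK'))
```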
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Responding to two posts at once, as I consider them

On 1/17/2014 11:00 AM, Brett Cannon wrote:
> I would rephrase it to "switch to %-formatting for bytes usage for their common code base". If they are working with actual text then using str.format() still works (and is actually nicer to use IMO). It actually might make the str/bytes relationship even clearer, especially if we start to promote that str.format() is for text and %-formatting is for bytes.

Good idea, I think: printf % formatting was invented for formatting ascii text in bytestrings as it was being output (although sprintf allowed not-output). In retrospect, I think we should have introduced unicode.format when unicode was introduced in 2.0 and perhaps never have had unicode % formatting. Or we should have dropped str % instead of bytes % in 3.0.

On 1/17/2014 12:13 PM, Eric V. Smith wrote:
> But if we think that %-formatting is neanderthal and will get dropped in the Python 4000 timeframe (that is, someday in the far future),

Some people, such as Martin Loewis, have a different opinion of %-formatting and will fight deprecating it *ever*. (I suspect that %-format opinions are influenced by one's current relation to C.)

> then I think we should have some advice to give to people who are writing new 3.x code for the non-porting use-cases addressed by the PEP. I'm specifically thinking of new code that wants to format some bytes for an on-the-wire ascii-like protocol.

If we add %-formatting back in 3.5 for its original purpose, formatting ascii in bytes for output, I think we should drop the idea of later deprecating it (a few releases later) for that purpose. I think the PEP should even say so, that bytes % will remain indefinitely even if str % were to be dropped in favor of str.format. I would consider dropping unicode(now string).__mod__ in favor of .format to still be an eventual option, especially if someone were to write a converter.

-- Terry Jan Reedy
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 18 Jan 2014 06:19, Terry Reedy tjre...@udel.edu wrote:
> On 1/17/2014 10:15 AM, Mark Lawrence wrote:
>> For both options 1 and 2 surely you cannot be suggesting that after people have written 2.x code to use format() as %f formatting is to be deprecated,
> It will not be for at least a decade.

It will not be deprecated, period. Originally, we thought that the introduction of the new flexible text formatting system made printf-style formatting redundant. After running both in parallel for a while, we learned we were wrong:

- it's far more difficult than we originally anticipated to migrate away from it to the new text formatting system
- in particular, the lazy interpolation support in the logging module (and similar systems) has no reasonable migration path
- two different core interpolation systems make it much easier to interpolate into format strings
- it's a better fit for code which needs to semantically align with C
- it's a useful micro-optimisation
- as the current discussion shows, it's much better suited to the interpolation of ASCII compatible segments in binary data formats

Do many of the core devs strongly prefer the new formatting system? Yes. Were we originally planning to deprecate and remove the printf-style formatting system? Yes. Are there still any plans to do so? No.

That's why we rewrote the relevant docs to always describe it as "mod formatting" or "printf-style formatting", rather than "legacy" or "old-style". If there are any instances (or even implications) of the latter left in the official docs, that's a bug to be fixed.

Perhaps this needs to be a new Q in my Python 3 Q&A, since a lot of people still seem to have the wrong idea...

Regards, Nick.

>> they now have to change the code back to the way they may well have written it in the first place?
> I would suggest that people simply .encode the result if bytes are needed in 3.x as well as 2.x. Polyglot code will likely have a 'py3' boolean already to make the encoding conditional.
> -- Terry Jan Reedy
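The lazy interpolation point in the list above can be seen with a small experiment. This is a sketch, not anything taken from the logging docs: `Expensive` and `ListHandler` are hypothetical helpers, used only to count how often the %-style argument is actually rendered.

```python
import logging

class Expensive:
    """Hypothetical argument that counts how often it is rendered."""
    renders = 0

    def __str__(self):
        Expensive.renders += 1
        return 'expensive'

class ListHandler(logging.Handler):
    """Collects interpolated messages in memory instead of writing them out."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() is where 'msg % args' finally happens.
        self.messages.append(record.getMessage())

handler = ListHandler()
logger = logging.getLogger('pep461.lazy_demo')
logger.setLevel(logging.WARNING)
logger.propagate = False
logger.addHandler(handler)

logger.debug('value: %s', Expensive())    # below the threshold: never interpolated
renders_after_debug = Expensive.renders

logger.warning('value: %s', Expensive())  # emitted: interpolated on demand
```

Because the format string and its arguments are passed separately, the filtered-out call never pays for the interpolation.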
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 16 Jan 2014 17:53, Ethan Furman et...@stoneleaf.us wrote:
> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>> This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free.
> It may work like this under the hood, but it's an implementation detail. Since the numeric format codes will call int, index, or float on the object (to handle subclasses), we could then call __format__ on the resulting int or float to do the heavy lifting; but since __format__ on anything else would never be called I don't want to give that impression.

I have a different proposal: let's *just* add mod formatting to bytes, and leave the extensible formatting system as a text only operation. We don't really care if bytes supports that method for version compatibility purposes, and the deliberate flexibility of the design makes it hard to translate into the binary domain. So let's just not provide that - let's accept that, for the binary domain, printf style formatting is just a better fit for the job :)

Cheers, Nick.

>> It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}.
> This isn't going to happen. If the user wants a string to be in the byte stream, it has to either be a bytes literal or explicitly encoded [1].
> -- ~Ethan~
> [1] Apologies if this has already been answered. I wanted to make sure I responded to all the ideas/objections, and I may have responded more than once to some. It's been a long few threads. ;)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Nick Coghlan wrote:
> I have a different proposal: let's *just* add mod formatting to bytes, and leave the extensible formatting system as a text only operation.

+1

-- Greg
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman et...@stoneleaf.us wrote:
> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>> This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free.
> It may work like this under the hood, but it's an implementation detail.

I'm arguing it's not an implementation detail but a definition of how bytes.format() would work.

> Since the numeric format codes will call int, index, or float on the object (to handle subclasses),

But that's **only** because the numeric types choose to as part of their __format__() implementation; it is not inherent to str.format().

> we could then call __format__ on the resulting int or float to do the heavy lifting;

It's not just the heavy lifting; it does **all** the lifting for format specifications.

> but since __format__ on anything else would never be called I don't want to give that impression.

Fine, if you're worried about bytes.format() overstepping by implicitly calling str.encode() on the return value of __format__() then you will need __bytes__format__() to get equivalent support.

-Brett

>> It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}.
> This isn't going to happen. If the user wants a string to be in the byte stream, it has to either be a bytes literal or explicitly encoded [1].
> -- ~Ethan~
> [1] Apologies if this has already been answered. I wanted to make sure I responded to all the ideas/objections, and I may have responded more than once to some. It's been a long few threads. ;)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 4:56 AM, Nick Coghlan ncogh...@gmail.com wrote:
> On 16 Jan 2014 17:53, Ethan Furman et...@stoneleaf.us wrote:
>> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>>> This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free.
>> It may work like this under the hood, but it's an implementation detail. Since the numeric format codes will call int, index, or float on the object (to handle subclasses), we could then call __format__ on the resulting int or float to do the heavy lifting; but since __format__ on anything else would never be called I don't want to give that impression.
> I have a different proposal: let's *just* add mod formatting to bytes, and leave the extensible formatting system as a text only operation. We don't really care if bytes supports that method for version compatibility purposes, and the deliberate flexibility of the design makes it hard to translate into the binary domain. So let's just not provide that - let's accept that, for the binary domain, printf style formatting is just a better fit for the job :)

Or PEP 460 for bytes.format() and PEP 461 for %.

-Brett

> Cheers, Nick.
>>> It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}.
>> This isn't going to happen. If the user wants a string to be in the byte stream, it has to either be a bytes literal or explicitly encoded [1].
>> -- ~Ethan~
>> [1] Apologies if this has already been answered. I wanted to make sure I responded to all the ideas/objections, and I may have responded more than once to some. It's been a long few threads. ;)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Greg Ewing greg.ew...@canterbury.ac.nz wrote:
> Neil Schemenauer wrote:
>> Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned,
> I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value.

I realize now we can't use __bytes__. Currently, passing an int to bytes() causes it to construct an object with that many null bytes. If we are going to support format() (I'm not convinced it is necessary and could easily be added in a later version), then we need an equivalent to __format__. My vote is either:

    def __formatascii__(self, spec): ...

or

    def __ascii__(self, spec): ...

Previously I was thinking of __bformat__ or __formatb__ but having ascii in the method name is a great reminder. Objects with a natural arbitrary byte representation can implement __bytes__ and %s should use that if it exists.

Neil
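The bytes(int) behaviour mentioned above is easy to check, along with what __bytes__ does for a type that has a natural byte representation. A small sketch; `Packet` is a made-up class used only for illustration.

```python
# bytes(n) with an int allocates n zero bytes; it does not render the number.
zeros = bytes(3)

# Getting the ASCII digits requires going through text explicitly.
digits = str(3).encode('ascii')

class Packet:
    """Hypothetical type with a natural arbitrary byte representation."""
    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return self.payload

raw = bytes(Packet(b'\x01\x02'))

print(zeros, digits, raw)
```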
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote:
> Fine, if you're worried about bytes.format() overstepping by implicitly calling str.encode() on the return value of __format__() then you will need __bytes__format__() to get equivalent support.

Could we just re-use PEP-3101's note (easily updated for Python 3):

    Note for Python 2.x: The 'format_spec' argument will be either a string object or a unicode object, depending on the type of the original format string. The __format__ method should test the type of the specifiers parameter to determine whether to return a string or unicode object. It is the responsibility of the __format__ method to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return bytes. For such cases on objects that cannot support bytes (i.e. for str), it can raise. This appears to avoid the need for additional methods. (As does Nick's proposal of leaving it out for now.)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 11:33 AM, Michael Urman mur...@gmail.com wrote:
> On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote:
>> Fine, if you're worried about bytes.format() overstepping by implicitly calling str.encode() on the return value of __format__() then you will need __bytes__format__() to get equivalent support.
> Could we just re-use PEP-3101's note (easily updated for Python 3):
>
>     Note for Python 2.x: The 'format_spec' argument will be either a string object or a unicode object, depending on the type of the original format string. The __format__ method should test the type of the specifiers parameter to determine whether to return a string or unicode object. It is the responsibility of the __format__ method to return an object of the proper type.
>
> If __format__ receives a format_spec of type bytes, it should return bytes. For such cases on objects that cannot support bytes (i.e. for str), it can raise. This appears to avoid the need for additional methods. (As does Nick's proposal of leaving it out for now.)

That's a very good catch, Michael! I think that makes sense if there is precedent. Unfortunately that bit from the PEP never made it into the documentation so I'm not sure if there is a backwards-compatibility worry.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/16/2014 06:45 AM, Brett Cannon wrote:
> On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:
>> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>>> This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free.
>> Since the numeric format codes will call int, index, or float on the object (to handle subclasses),
> But that's **only** because the numeric types choose to as part of their __format__() implementation; it is not inherent to str.format().

As I understand it, str.format will call the object's __format__. So, for example, if I say:

    u'the value is: %d' % myNum(17)

then it will be myNum.__format__ that gets called, not int.__format__; this is precisely what we don't want, since we can't know that myNum is only going to return ASCII characters.

This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format code; so the above example would have bytes.__format__ calling int() on myNum(17), at which point we either have an int type or an exception was raised because myNum isn't really an integer. Once we have an int, whose format we know and trust, then we can call its __format__ and proceed from there.

On the flip side, if myNum does define its own __format__, it will not be called by bytes.format, and perhaps that is another good reason for bytes to only support %-interpolation and not format?

-- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Michael Urman mur...@gmail.com wrote:
> If __format__ receives a format_spec of type bytes, it should return bytes. For such cases on objects that cannot support bytes (i.e. for str), it can raise. This appears to avoid the need for additional methods. (As does Nick's proposal of leaving it out for now.)

That's an interesting idea. I proposed __ascii__ as an analogous method to __format__ for bytes formatting and to have %-interpolation use it. However, overloading __format__ based on the type of the argument could work. I see with Python 3:

    >>> (1).__format__(b'')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: must be str, not bytes

A TypeError exception is what we want if the object does not support bytes formatting. Some possible problems:

- It could be hard to provide a helpful exception message since it is generated inside the __format__ method rather than inside the bytes.__mod__ method (in the case of a missing __ascii__ method). The most common error will be using a str object and so we could modify the __format__ method of str to provide a nice hint (use encode()).
- Is there some risk that an object will unwittingly implement a __format__ method that unintentionally accepts a bytes argument? That requires some investigation.
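The TypeError above can be reproduced programmatically. In this sketch `try_bytes_spec` is a hypothetical helper, and the exact wording of the exception message differs between Python versions, so only the exception type is relied on.

```python
def try_bytes_spec(value):
    """Call value.__format__ with a bytes spec and report what happens."""
    try:
        return value.__format__(b'')
    except TypeError as exc:
        return exc

for sample in (1, 1.5):
    outcome = try_bytes_spec(sample)
    print(type(sample).__name__, '->', type(outcome).__name__)
```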
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 11:13 AM, Neil Schemenauer n...@arctrix.com wrote:
> A TypeError exception is what we want if the object does not support bytes formatting. Some possible problems:
>
> - It could be hard to provide a helpful exception message since it is generated inside the __format__ method rather than inside the bytes.__mod__ method (in the case of a missing __ascii__ method). The most common error will be using a str object and so we could modify the __format__ method of str to provide a nice hint (use encode()).

The various format functions could certainly intercept and wrap exceptions raised by __format__ methods. Once the core types were modified to expect bytes in format_spec, however, this may not be critical; __format__ methods which delegate would work as expected, str could certainly be clear about why it raised, and custom implementations would be handled per comments I'll make on your second point. Overall I suspect this is no worse than unhandled values in the format_spec are today.

> - Is there some risk that an object will unwittingly implement a __format__ method that unintentionally accepts a bytes argument? That requires some investigation.

Agreed. Some quick armchair calculations suggest to me that there are three likely outcomes:

- Properly handle the type (perhaps written with the 2.x clause in mind)
- Raise an exception internally (perhaps ValueError, such as from format(3, 'q'))
- Mishandle and return a str (perhaps due to if/else defaulting)

The first and second outcome may well reflect what we want, and the third could easily be detected and turned into an exception by the format functions. I'm uncertain whether this reflects all the scenarios we would care about.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/16/2014 8:41 AM, Brett Cannon wrote:
> That's a very good catch, Michael! I think that makes sense if there is precedent. Unfortunately that bit from the PEP never made it into the documentation so I'm not sure if there is a backwards-compatibility worry.

No. If __format__ is called with bytes format, and returns str, there would be an exception generated on the spot. If __format__ is called with bytes format, and tries to use it as str, there would be an exception generated on the spot. Prior to 3.whenever-this-is-implemented, Python 3 only provides str formats to __format__, right? So new code is required to pass bytes to __format__.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/16/2014 11:23 AM, Ethan Furman wrote:
> On 01/16/2014 06:45 AM, Brett Cannon wrote:
>> On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:
>>> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>>>> This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free.
>>> Since the numeric format codes will call int, index, or float on the object (to handle subclasses),
>> But that's **only** because the numeric types choose to as part of their __format__() implementation; it is not inherent to str.format().
> As I understand it, str.format will call the object's __format__. So, for example, if I say:
>
>     u'the value is: %d' % myNum(17)
>
> then it will be myNum.__format__ that gets called, not int.__format__; this is precisely what we don't want, since we can't know that myNum is only going to return ASCII characters.

Magic methods, including __format__, are called on the type, not the instance.

> This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format code; so the above example would have bytes.__format__ calling int() on myNum(17), at which point we either have an int type or an exception was raised because myNum isn't really an integer. Once we have an int, whose format we know and trust, then we can call its __format__ and proceed from there. On the flip side, if myNum does define its own __format__, it will not be called by bytes.format, and perhaps that is another good reason for bytes to only support %-interpolation and not format?

For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).

Eric.
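The exact-type rule proposed above could be sketched as a per-field helper. Everything here is an assumption made for illustration: bytes.format() does not exist, `format_field` is a hypothetical name, and passing bytes through unchanged (and rejecting a format spec for them) is a guess at one possible behaviour that the thread does not settle.

```python
def format_field(value, spec=''):
    """Sketch of formatting one replacement field under the exact-type rule."""
    kind = type(value)
    if kind is bytes:
        if spec:
            raise ValueError('format specs for bytes are not handled in this sketch')
        return value
    if kind is int or kind is float:
        # Call the type's __format__ with the object as self, then encode.
        return kind.__format__(value, spec).encode('ascii')
    # Subclasses (including bool) and all other types are rejected.
    raise TypeError('unsupported type for bytes formatting: ' + kind.__name__)

print(format_field(47, '05d'), format_field(1.5, '.2f'), format_field(b'raw'))
```

Checking `type(value)` rather than using isinstance() is what keeps int and float subclasses, and their possibly non-ASCII __format__ methods, out of the picture.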
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/16/2014 10:30 AM, Eric V. Smith wrote:
> On 01/16/2014 11:23 AM, Ethan Furman wrote:
>> On 01/16/2014 06:45 AM, Brett Cannon wrote:
>>> But that's **only** because the numeric types choose to as part of their __format__() implementation; it is not inherent to str.format().
>> As I understand it, str.format will call the object's __format__. So, for example, if I say:
>>
>>     u'the value is: %d' % myNum(17)
>>
>> then it will be myNum.__format__ that gets called, not int.__format__; this is precisely what we don't want, since we can't know that myNum is only going to return ASCII characters.
> Magic methods, including __format__, are called on the type, not the instance.

Yes, that's why I said `myNum(17)` and not `myNum`.

>> This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format code; so the above example would have bytes.__format__ calling int() on myNum(17), at which point we either have an int type or an exception was raised because myNum isn't really an integer. Once we have an int, whose format we know and trust, then we can call its __format__ and proceed from there. On the flip side, if myNum does define its own __format__, it will not be called by bytes.format, and perhaps that is another good reason for bytes to only support %-interpolation and not format?
> For the first iteration of bytes.format(), I think we should just support the exact types of int, float, and bytes. It will call the type's __format__ (with the object as self) and encode the result to ASCII. For the stated use case of 2.x compatibility, I suspect this will cover 90% of the uses in real code. If we find there are cases where real code needs additional types supported, we can consider adding __format_ascii__ (or whatever name we cook up).

That can certainly be our fallback position if we can't decide now how we want to handle int and float subclasses.

-- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/16/2014 01:55 PM, Ethan Furman wrote:
>> Magic methods, including __format__, are called on the type, not the instance.
> Yes, that's why I said `myNum(17)` and not `myNum`.

Oops, apologies. I misread the code.

Eric.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
16.01.2014 17:33, Michael Urman wrote: On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote: Fine, if you're worried about bytes.format() overstepping by implicitly calling str.encode() on the return value of __format__() then you will need __bytes__format__() to get equivalent support. Could we just re-use PEP-3101's note (easily updated for Python 3): Note for Python 2.x: The 'format_spec' argument will be either a string object or a unicode object, depending on the type of the original format string. The __format__ method should test the type of the specifiers parameter to determine whether to return a string or unicode object. It is the responsibility of the __format__ method to return an object of the proper type. If __format__ receives a format_spec of type bytes, it should return bytes. For such cases on objects that cannot support bytes (i.e. for str), it can raise. This appears to avoid the need for additional methods. (As does Nick's proposal of leaving it out for now.) -1. I'd treat the format()+.__format__()+str.format()-ecosystem as a nice text-data-oriented, *complete* Py3k feature, backported to Python 2 to share the benefits of the feature with it as well as to make the 2-to-3 transition a bit easier. IMHO, the PEP-3101's note cited above just describes a workaround over the flaws of the Py2's obsolete text model. Moving such complications into Py3k would make the feature (and especially the ability to implement your own .__format__()) harder to understand and make use of -- for little profit. Such a move is not needed for compatibility. And, IMHO, the format()/__format__()/str.format()-matter is all about nice and flexible *text* formatting, not about binary data interpolation. 16.01.2014 10:56, Nick Coghlan wrote: I have a different proposal: let's *just* add mod formatting to bytes, and leave the extensible formatting system as a text only operation. 
We don't really care if bytes supports that method for version compatibility purposes, and the deliberate flexibility of the design makes it hard to translate into the binary domain. So let's just not provide that - let's accept that, for the binary domain, printf style formatting is just a better fit for the job :)

+1! However, I am not sure if %s should be limited to bytes-like objects. As practicality beats purity, I would be +0.5 for enabling the following:

- input type supports Py_buffer? use it to collect the necessary bytes
- input type has the __bytes__() method? use it to collect the necessary bytes
- input type has the encode() method? raise TypeError
- otherwise: use something equivalent to ascii(obj).encode('ascii') (note that it would nicely format numbers and format other objects in a more-or-less useful way, without the fear of encountering non-ASCII data).

Another option: use the str() representation of strictly defined types, e.g. int, float, decimal.Decimal, fractions.Fraction...

Cheers.
*j
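The four-step lookup proposed above for %s can be sketched in pure Python. This is only an illustration of the order of the checks, not code from the PEP or from CPython: the helper name `interpolate_percent_s` and the `Packet` class are made up for the sketch, and `memoryview()` is used as a stand-in for "supports Py_buffer".

```python
def interpolate_percent_s(obj):
    """Hypothetical helper: the bytes b'%s' would insert under this proposal."""
    # 1. Buffer protocol: memoryview() raises TypeError for objects
    #    that do not export a buffer.
    try:
        return bytes(memoryview(obj))
    except TypeError:
        pass
    # 2. __bytes__, looked up on the type like other special methods.
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is not None:
        return to_bytes(obj)
    # 3. Text-like objects must be encoded explicitly by the caller.
    if hasattr(obj, 'encode'):
        raise TypeError('%s must be encoded to bytes first' % type(obj).__name__)
    # 4. Fallback: ascii() escapes anything that is not ASCII.
    return ascii(obj).encode('ascii')


class Packet:
    """Illustrative class with a natural bytes form."""
    def __bytes__(self):
        return b'\x01\x02'
```

Under these assumptions numbers fall through to step 4 and come out as their ASCII repr, while a str is rejected at step 3 no matter what characters it contains, so failures depend on the type and never on the value.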
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 08:23:13AM -0800, Ethan Furman wrote:
> As I understand it, str.format will call the object's __format__. So, for example, if I say:
>
>     u'the value is: %d' % myNum(17)
>
> then it will be myNum.__format__ that gets called, not int.__format__;

I seem to have missed something, because I am completely confused... Why are you talking about str.format and then show an example using % instead?

%d calls __str__, not __format__. This is in Python 3.3:

    py> class MyNum(int):
    ...     def __str__(self):
    ...         print("Calling MyNum.__str__")
    ...         return super().__str__()
    ...     def __format__(self, spec):
    ...         print("Calling MyNum.__format__")
    ...         return super().__format__(spec)
    ...
    py> n = MyNum(17)
    py> u"%d" % n
    Calling MyNum.__str__
    '17'

By analogy, if we have a bytes %d formatting, surely it should either:

(1) call type(n).__bytes__(n), which is guaranteed to raise if the result isn't ASCII (i.e. like len() raises if the result isn't an int); or

(2) call type(n).__str__(n).encode("ascii", "strict").

Personally, I lean towards (2), even though that means you can't have a single class provide an ASCII string to b'%d' and a non-ASCII string to u'%d'.

> this is precisely what we don't want, since can't know that myNum is only going to return ASCII characters.

It seems to me that Consenting Adults applies here. If class MyNum returns a non-ASCII string, then you ought to get a runtime exception, exactly the same as happens with just about every other failure in Python. If you don't want that possible exception, then don't use MyNum, or explicitly wrap it in a call to int:

    b'the value is: %d' % int(MyNum(17))

The *worst* solution would be to completely ignore MyNum.__str__. That's a nasty violation of the Principle Of Least Surprise, and will lead to confusion ("why isn't my class' __str__ method being called?") and bugs.

* Explicit is better than implicit -- better to explicitly wrap MyNum in a call to int() than to have bytes %d automagically do it for you;

* Special cases aren't special enough to break the rules -- bytes %d isn't so special that standard Python rules about calling special methods should be ignored;

* Errors should never pass silently -- if MyNum does the wrong thing when used with bytes %d, you should get an exception.

> This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format code; so the above example would have bytes.__format__ calling int() on myNum(17),

The above example you give doesn't have any bytes in it. Can you explain what you meant to say? I'm guessing you intended this:

    b'the value is: %d' % MyNum(17)

rather than using u'' as actually given, but I don't really know.

-- 
Steven
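Option (2) from the message above, sketched as a standalone function so it can be run on any Python 3. The function name `percent_d_option2` and the `OddNum` class are invented for this illustration; nothing here claims to be how bytes %d was eventually implemented.

```python
def percent_d_option2(n):
    """Hypothetical: what b'%d' would insert under option (2)."""
    # Use the type's own __str__, then encode strictly, so a non-ASCII
    # result raises instead of passing silently.
    return type(n).__str__(n).encode('ascii', 'strict')


class MyNum(int):
    """Plain subclass: inherits the usual decimal string form."""


class OddNum(int):
    """Subclass whose __str__ is deliberately not ASCII."""
    def __str__(self):
        return '\u0661\u0667'  # ARABIC-INDIC DIGIT ONE, DIGIT SEVEN


print(percent_d_option2(MyNum(17)))
try:
    percent_d_option2(OddNum(17))
except UnicodeEncodeError as exc:
    print('refused:', exc.reason)
# Explicitly wrapping in int() sidesteps the subclass's __str__.
print(percent_d_option2(int(OddNum(17))))
```

The last call shows the "explicit is better than implicit" escape hatch: the caller, not the formatting machinery, decides to discard the subclass's string form.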
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
bytes.format() below. I'll leave it to you to decide if they warrant using, leaving as an open question, or rejecting. On Tue, Jan 14, 2014 at 2:56 PM, Ethan Furman et...@stoneleaf.us wrote: Duh. Here's the text, as well. ;) PEP: 461 Title: Adding % and {} formatting to bytes Version: $Revision$ Last-Modified: $Date$ Author: Ethan Furman et...@stoneleaf.us Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2014-01-13 Python-Version: 3.5 Post-History: 2014-01-13 Resolution: Abstract This PEP proposes adding the % and {} formatting operations from str to bytes. Proposed semantics for bytes formatting === %-interpolation --- All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.) will be supported, and will work as they do for str, including the padding, justification and other related modifiers. Example:: b'%4x' % 10 b' a' %c will insert a single byte, either from an int in range(256), or from a bytes argument of length 1. Example: b'%c' % 48 b'0' b'%c' % b'a' b'a' %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) - input type is something else? use its __bytes__ method; if there isn't one, raise an exception [3] Examples: b'%s' % b'abc' b'abc' b'%s' % 3.14 b'3.14' b'%s' % 'hello world!' Traceback (most recent call last): ... TypeError: 'hello world' has no __bytes__ method, perhaps you need to encode it? .. note:: Because the str type does not have a __bytes__ method, attempts to directly use 'a string' as a bytes interpolation value will raise an exception. To use 'string' values, they must be encoded or otherwise transformed into a bytes sequence:: 'a string'.encode('latin-1') format -- The format mini language will be used as-is, with the behaviors as listed for %-interpolation. 
That's too vague; % interpolation does not support other format operators in the same way as str.format() does. % interpolation has specific code to support %d, etc. But str.format() gets supported for {:d} not from special code but because e.g. float.__format__('d') works. So you can't say bytes.format() supports {:d} just like %d works with string interpolation since the mechanisms are fundamentally different. This is why I have argued that if you specify it as if there is a format spec specified, then the return value from calling __format__() will have str.decode('ascii', 'strict') called on it you get the support for the various number-specific format specs for free. It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}. I also think that a 'b' conversion be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5 as {} will mean what is the most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want a byte and allows {!s} to mean you only want a string (and thus throw an error otherwise). And all of this means that much like %s only taking bytes, the only way for bytes.format() to accept a non-byte argument is for some format spec to be specified to trigger the .encode('ascii', 'strict') call. -Brett Open Questions == For %s there has been some discussion of trying to use the buffer protocol (Py_buffer) before trying __bytes__. This question should be answered before the PEP is implemented. Proposed variations === It has been suggested to use %b for bytes instead of %s. - Rejected as %b does not exist in Python 2.x %-interpolation, which is why we are using %s. It has been proposed to automatically use .encode('ascii','strict') for str arguments to %s. - Rejected as this would lead to intermittent failures. 
Better to have the operation always fail so the trouble-spot can be correctly fixed. It has been proposed to have %s return the ascii-encoded repr when the value is a str (b'%s' % 'abc' -- b'abc'). - Rejected as this would lead to hard to debug failures far from the problem site. Better to have the operation always fail so the trouble-spot can be easily fixed. Foot notes == .. [1] Not sure if this should be the numeric __str__ or the numeric __repr__, or if there's any difference .. [2] Any proper numeric class would then have to provide an ascii representation of its value, either via __repr__ or __str__ (whichever we choose in [1]). .. [3] TypeError, ValueError, or UnicodeEncodeError? Copyright = This document has been placed in the public domain. .. Local
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
This looks pretty good to me. I don't think we should limit operands based on type, that's anti-Pythonic IMHO. We should use duck-typing and that means a special method, I think. We could introduce a new one but __bytes__ looks like it can work. Otherwise, maybe __ascii__ is a good name. Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value* (we don't want to repeat the hell of Python 2's unicode to str coercion which depends on the value of the unicode object). Objects that already contain encoded bytes or arbitrary bytes can also implement __bytes__. Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: This becomes much simpler: - does the object implement __bytes__? call it and use the value otherwise raise TypeError It has been suggested to use %b for bytes instead of %s. - Rejected as %b does not exist in Python 2.x %-interpolation, which is why we are using %s. +1. %b might be conceptually neater but ease of migration trumps that, IMHO. It has been proposed to automatically use .encode('ascii','strict') for str arguments to %s. - Rejected as this would lead to intermittent failures. Better to have the operation always fail so the trouble-spot can be correctly fixed. Right. That would put us back in Python 2 unicode - str coercion hell. Thanks for writing the PEP. Neil ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/15/2014 9:45 AM, Brett Cannon wrote: That's too vague; % interpolation does not support other format operators in the same way as str.format() does. % interpolation has specific code to support %d, etc. But str.format() gets supported for {:d} not from special code but because e.g. float.__format__('d') works. So you can't say bytes.format() supports {:d} just like %d works with string interpolation since the mechanisms are fundamentally different. This is why I have argued that if you specify it as if there is a format spec specified, then the return value from calling __format__() will have str.decode('ascii', 'strict') called on it you get the support for the various number-specific format specs for free. It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}. I also think that a 'b' conversion be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5 as {} will mean what is the most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want a byte and allows {!s} to mean you only want a string (and thus throw an error otherwise). And all of this means that much like %s only taking bytes, the only way for bytes.format() to accept a non-byte argument is for some format spec to be specified to trigger the .encode('ascii', 'strict') call. Agreed. With %-formatting, you can start with the format strings and then decide what you want to do with the passed in objects. But with .format, it's the other way around: you have to look at the passed in objects being formatted, and then decide what the format specifier means to that type. So, for .format, you could say hey, that object's an int, and I happen to know how to format ints, outside of calling it's .__format__. Or you could even call its __format__ because you know that it will only be ASCII. 
But to take this approach, you're going to have to hard-code the types. And subclasses are probably out, since there you don't know what the subclass's __format__ will return. It could be non-ASCII.

    >>> class Int(int):
    ...     def __format__(self, fmt):
    ...         return u'foo'
    ...
    >>> '{}'.format(Int(3))
    'foo'

So basically I think we'll have to hard-code the types that .format() will support, and never call __format__, or only call __format__ if we know that it's an exact type where we know that __format__ will return (strict ASCII).

Either that, or we're back to encoding the result of __format__ and accepting that sometimes it might throw errors, depending on the values being passed into format().

Eric.
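A minimal sketch of the "hard-code the exact types" approach, assuming the supported set is int and float plus bytes passed through unchanged (the set suggested elsewhere in this thread). The function name `format_field_exact` is hypothetical.

```python
ASCII_SAFE_TYPES = (int, float)  # assumption: exact types known to format as ASCII


def format_field_exact(value, spec=''):
    """Hypothetical per-field logic for bytes.format() with exact-type checks."""
    if type(value) is bytes:
        return value
    # type(...) is used rather than isinstance(): a subclass may override
    # __format__ and return arbitrary text.
    if type(value) in ASCII_SAFE_TYPES:
        return format(value, spec).encode('ascii')
    raise TypeError('unsupported type for bytes formatting: %s'
                    % type(value).__name__)


class Int(int):
    def __format__(self, fmt):
        return 'foo'


print(format_field_exact(3, '04d'))
print(format_field_exact(1.5, '.2f'))
try:
    format_field_exact(Int(3))
except TypeError as exc:
    print('rejected:', exc)
```

The trade-off is visible in the last call: the subclass is rejected on its type alone, even though its __format__ happens to return ASCII.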
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 06:45 AM, Brett Cannon wrote:
> bytes.format() below. I'll leave it to you to decide if they warrant using, leaving as an open question, or rejecting.

Thanks for your comments. I've only barely touched format, so it's not an area of strength for me. :)

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil Schemenauer n...@arctrix.com wrote:
> Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value*

I think that's a slippery slope. __bytes__ should mean that the object has a well-known bytes equivalent or encoding, not that its __str__ happens to be pure ASCII.

(for example, it would be fine for a HTTP message class to define a __bytes__ method)

Also, consider that if e.g. float had a __bytes__ method, then bytes(2.0) would start returning b'2.0', while bytes(2) would still need to return b'\x00\x00'.

Regards

Antoine.
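To make the distinction concrete: below, a made-up `RequestLine` class has an obvious wire encoding and so defines __bytes__, while the built-in numbers do not. The class and its fields are illustrative only and do not come from any library.

```python
class RequestLine:
    """Hypothetical HTTP request line with a well-defined bytes form."""
    def __init__(self, method, target):
        self.method = method
        self.target = target

    def __bytes__(self):
        line = '%s %s HTTP/1.1\r\n' % (self.method, self.target)
        return line.encode('ascii')


wire = bytes(RequestLine('GET', '/index.html'))
zeros = bytes(2)           # an int argument means "this many zero bytes"
try:
    bytes(2.0)             # float defines no __bytes__
    float_ok = True
except TypeError:
    float_ok = False

print(wire, zeros, float_ok)
```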
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Neil Schemenauer n...@arctrix.com wrote:
> We should use duck-typing and that means a special method, I think. We could introduce a new one but __bytes__ looks like it can work. Otherwise, maybe __ascii__ is a good name.

I poked around the Python 3 source. Using __bytes__ has some downsides, e.g. the following would happen:

    >>> bytes(12)
    b'12'

Perhaps that's a little too ASCII-centric. OTOH, UTF-8 seems to be winning the encoding war and so the above could be argued as reasonable behavior. I think forcing people to explicitly choose an encoding for str objects will be sufficient to avoid the bytes/str mess we have in Python 2.

Unfortunately, that change conflicts with the current behavior:

    >>> bytes(12)
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

Would it be too disruptive to change that? It doesn't appear to be too useful and we could do it using a keyword argument, e.g.:

    bytes(size=12)

I notice something else surprising to me:

    >>> class Test(object):
    ...     def __bytes__(self):
    ...         return b'test'
    ...
    >>> with open('test', 'wb') as fp:
    ...     fp.write(Test())
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    TypeError: 'Test' does not support the buffer interface

I'd expect that to write b'test' to the file, not raise an error.

Regards,

  Neil
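The surprise described above can be reproduced without touching the filesystem by writing to an in-memory `io.BytesIO` instead of a real file. A small sketch: binary write() wants an object that exports a buffer and does not consult __bytes__, so the conversion has to be explicit.

```python
import io


class Test:
    def __bytes__(self):
        return b'test'


buf = io.BytesIO()
refused = False
try:
    buf.write(Test())          # no buffer interface, so this is rejected
except TypeError:
    refused = True

buf.write(bytes(Test()))       # explicit conversion goes through __bytes__
print(refused, buf.getvalue())
```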
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/14/2014 02:41 PM, Mark Lawrence wrote:
> On 14/01/2014 19:55, Ethan Furman wrote:
>> This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant.
>>
>> --
>> ~Ethan~
>
> Out of plain old curiosity do we have to consider PEP 292 string templates in any way, shape or form, or regarding this debate have they been safely booted into touch?

Well, I'm not sure what "booted into touch" means, but yes, we can ignore string templates. :)

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 08:04 AM, Antoine Pitrou wrote:
> On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil Schemenauer n...@arctrix.com wrote:
>> Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value*
> I think that's a slippery slope. __bytes__ should mean that the object has a well-known bytes equivalent or encoding, not that its __str__ happens to be pure ASCII.

Agreed.

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 07:47 AM, Neil Schemenauer wrote:
> Thanks for writing the PEP.

Thank you for your comments!

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 14/01/2014 19:56, Ethan Furman wrote:
> Duh. Here's the text, as well. ;)
>
> %s, because it is the most general, has the most convoluted resolution:
>
> - input type is bytes? pass it straight through
> - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly)
> - input type is something else? use its __bytes__ method; if there isn't one, raise an exception [3]
>
> Examples:
>
>     >>> b'%s' % b'abc'
>     b'abc'
>     >>> b'%s' % 3.14
>     b'3.14'
>     >>> b'%s' % 'hello world!'
>     Traceback (most recent call last):
>     ...
>     TypeError: 'hello world' has no __bytes__ method, perhaps you need to encode it?

For completeness I believe %r and %a should be included here as well. FTR %a appears to have been introduced in 3.2, but I couldn't find anything in the What's New and there's no note in the docs http://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting to indicate when it first came into play.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 10:52 AM, Eric V. Smith e...@trueblade.com wrote: On 1/15/2014 9:45 AM, Brett Cannon wrote: That's too vague; % interpolation does not support other format operators in the same way as str.format() does. % interpolation has specific code to support %d, etc. But str.format() gets supported for {:d} not from special code but because e.g. float.__format__('d') works. So you can't say bytes.format() supports {:d} just like %d works with string interpolation since the mechanisms are fundamentally different. This is why I have argued that if you specify it as if there is a format spec specified, then the return value from calling __format__() will have str.decode('ascii', 'strict') called on it you get the support for the various number-specific format specs for free. It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}. I also think that a 'b' conversion be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5 as {} will mean what is the most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want a byte and allows {!s} to mean you only want a string (and thus throw an error otherwise). And all of this means that much like %s only taking bytes, the only way for bytes.format() to accept a non-byte argument is for some format spec to be specified to trigger the .encode('ascii', 'strict') call. Agreed. With %-formatting, you can start with the format strings and then decide what you want to do with the passed in objects. But with .format, it's the other way around: you have to look at the passed in objects being formatted, and then decide what the format specifier means to that type. So, for .format, you could say hey, that object's an int, and I happen to know how to format ints, outside of calling it's .__format__. 
> Or you could even call its __format__ because you know that it will only be ASCII. But to take this approach, you're going to have to hard-code the types. And subclasses are probably out, since there you don't know what the subclass's __format__ will return. It could be non-ASCII.
>
>     >>> class Int(int):
>     ...     def __format__(self, fmt):
>     ...         return u'foo'
>     ...
>     >>> '{}'.format(Int(3))
>     'foo'
>
> So basically I think we'll have to hard-code the types that .format() will support, and never call __format__, or only call __format__ if we know that it's a exact type where we know that __format__ will return (strict ASCII).
>
> Either that, or we're back to encoding the result of __format__ and accepting that sometimes it might throw errors, depending on the values being passed into format().

I say accept that an error might get thrown as there is precedent of specifying a format spec that an object's __format__() method doesn't recognize::

    >>> '{:s}'.format(1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown format code 's' for object of type 'int'

IOW I'm actively trying to avoid type-restricting the semantics for bytes.format() for a consistent, clear mental model. Remembering that "any format spec leads to calling .encode('ascii', 'strict') on the result" is simple compared to "ASCII bytes will be returned for ints and floats when passed in, otherwise all other types follow these rules".

As the zen says:

    Errors should never pass silently.
    Special cases aren't special enough to break the rules.

-Brett
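The "any format spec triggers .encode('ascii', 'strict')" rule argued for above can be sketched as a single function. The name `format_field_any_spec` is made up, and the behaviour for an empty spec (bytes only) is this sketch's reading of "much like %s only taking bytes"; it is not a specification.

```python
def format_field_any_spec(value, spec=''):
    """Hypothetical per-field logic: a spec means format-then-encode."""
    if not spec:
        # No spec: only real bytes are accepted, mirroring %s.
        if isinstance(value, (bytes, bytearray)):
            return bytes(value)
        raise TypeError('a format spec is required for %s'
                        % type(value).__name__)
    # With a spec: defer to the object's __format__ and encode strictly.
    # Both ValueError (unknown spec) and UnicodeEncodeError (non-ASCII
    # result) propagate to the caller.
    return format(value, spec).encode('ascii', 'strict')


print(format_field_any_spec(255, 'x'))
print(format_field_any_spec('abc', 's'))
for bad in (('caf\xe9', 's'), (1, 's')):
    try:
        format_field_any_spec(*bad)
    except (UnicodeEncodeError, ValueError) as exc:
        print(type(exc).__name__)
```

Note that the first failing case depends on the value of the string rather than on its type, which is exactly the property other participants in the thread object to.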
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 10:57 AM, Ethan Furman et...@stoneleaf.us wrote:
> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>> bytes.format() below. I'll leave it to you to decide if they warrant using, leaving as an open question, or rejecting.
> Thanks for your comments. I've only barely touched format, so it's not an area of strength for me. :)

Time to strengthen it if you are proposing a PEP that is going to affect it. =)

> --
> ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 08:51 AM, Brett Cannon wrote:
> On Wed, Jan 15, 2014 at 10:57 AM, Ethan Furman wrote:
>> Thanks for your comments. I've only barely touched format, so it's not an area of strength for me. :)
> Time to strengthen it if you are proposing a PEP that is going to affect it. =)

I am. You're helping. :)

--
~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, 15 Jan 2014, Antoine Pitrou wrote:
> On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil Schemenauer n...@arctrix.com wrote:
>> Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value*
> I think that's a slippery slope. __bytes__ should mean that the object has a well-known bytes equivalent or encoding, not that its __str__ happens to be pure ASCII.

+1

> (for example, it would be fine for a HTTP message class to define a __bytes__ method)
>
> Also, consider that if e.g. float had a __bytes__ method, then bytes(2.0) would start returning b'2.0', while bytes(2) would still need to return b'\x00\x00'.

Not actually suggesting the following for a number of reasons including but not limited to the consistency of floating point formats across different implementations, but it would make more sense for bytes(2.0) to return the 8-byte IEEE representation than for it to return the ASCII encoding of the decimal representation of the number.

Isaac Morland CSCF Web Guru
DC 2619, x36650 WWW Software Specialist
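For comparison, the two candidate byte forms of 2.0 mentioned above can both be produced today with explicit calls. The little-endian byte order chosen here is an assumption of the sketch; the message does not say which order it has in mind.

```python
import struct

ieee = struct.pack('<d', 2.0)       # 8-byte IEEE 754 double, little-endian
text = repr(2.0).encode('ascii')    # ASCII decimal representation

print(len(ieee), ieee)
print(len(text), text)
```

Neither form is an obvious default, which is one more reason for float not to grow a __bytes__ method.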
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Antoine Pitrou solip...@pitrou.net wrote:
> On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil S wrote:
>> Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value*
> I think that's a slippery slope. __bytes__ should mean that the object has a well-known bytes equivalent or encoding, not that its __str__ happens to be pure ASCII.

After poking around some more into the Python 3 source, I agree. It seems too late to change bytes(integer) and bytearray(integer). We should have used a keyword-only argument but it's too late now (tp_new is a mess).

I can also agree that pushing the ASCII-centric behavior into the bytes() constructor goes too far. If we limit the ASCII-centric behavior solely to % and format(), that seems a reasonable trade-off for usability. As others have argued, once you are using format codes, you are pretty clearly dealing with ASCII encoding.

I feel strongly that % and format on bytes need to use duck-typing and not type checking. Also, formatting failures must be due to types and not due to values. If we can get agreement on these two principles, that will help guide the design.

Those principles absolutely rule out calling encode('ascii') automatically. I'm not deeply intimate with format() but I think it also rules out calling __format__.

Could we introduce only __bformat__ and have the % operator call it? That would only require implementing one new special method instead of two.

  Neil
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
All sounds good. A fleeting thought about constructors: you can always add alternative constructors as class methods (like datetime does). On Wed, Jan 15, 2014 at 10:09 AM, Neil Schemenauer n...@arctrix.com wrote: Antoine Pitrou solip...@pitrou.net wrote: On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil S wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, no matter what the *value* I think that's a slippery slope. __bytes__ should mean that the object has a well-known bytes equivalent or encoding, not that its __str__ happens to be pure ASCII. After poking around some more into the Python 3 source, I agree. It seems too late to change bytes(integer) and bytearray(integer). We should have used a keyword only argument but too late now (tp_new is a mess). I can also agree that pushing the ASCII-centric behavior into the bytes() constructor goes too far. If we limit the ASCII-centric behavior solely to % and format(), that seems a reasonable trade-off for usability. As others have argued, once you are using format codes, you are pretty clearly dealing with ASCII encoding. I feel strongly that % and format on bytes needs to use duck-typing and not type checking. Also, formatting falures must be due to types and not due to values. If we can get agreement on these two principles, that will help guide the design. Those principles absolutely rule out call calling encode('ascii') automatically. I'm not deeply intimate with format() but I think it also rules out calling __format__. Could we introduce only __bformat__ and have the % operator call it? That would only require implementing one new special method instead of two. 
Neil

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/15/2014 7:52 AM, Eric V. Smith wrote:
> So basically I think we'll have to hard-code the types that .format() will support, and never call __format__, or only call __format__ if we know that it's a exact type where we know that __format__ will return (strict ASCII).
>
> Either that, or we're back to encoding the result of __format__ and accepting that sometimes it might throw errors, depending on the values being passed into format().

Looks like you need to invent __formatb__ to produce only ASCII. Objects that have __formatb__ can be formatted by bytes.format.

To avoid coding, it could be possible that __formatb__ might be a callable, in which case it is called to get the result, or not a callable, in which case one calls __format__ and converts the result to ASCII, __formatb__ just indicating a guarantee that only ASCII will result.

Or it could be that __formatb__ replaces __format__, and str.__format__, if it finds no __format__, looks for __formatb__, calls that, and converts the result to Unicode.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Glenn Linderman v+pyt...@g.nevcal.com wrote:
> On 1/15/2014 7:52 AM, Eric V. Smith wrote:
>> Either that, or we're back to encoding the result of __format__ and accepting that sometimes it might throw errors, depending on the values being passed into format().

That would take us back to Python 2 hell. Please no. I don't like checking for types either, we should have a special method.

> Looks like you need to invent __formatb__ to produce only ASCII. Objects that have __formatb__ can be formatted by bytes.format. To avoid coding, it could be possible that __formatb__ might be a callable in which case it is called to get the result, or not a callable, in which case one calls __format__ and converts the result to ASCII, __formatb__ just indicating a guarantee that only ASCII will result.

Just do:

    def __formatb__(self, spec):
        return MyClass.__format__(self, spec).encode('ascii')

Note that I think it is better to explicitly use the __format__ method rather than using self.__format__. My reasoning is that a subclass might implement a __format__ that returns non-ASCII characters.

We don't need a special bytes version of __str__ since the %-operator can call __formatb__ with the correct format spec.

  Neil
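A runnable sketch of the __formatb__ idea follows. Everything in it is hypothetical: __formatb__ was only a proposal in this thread, and `bytes_format_field`, `Temperature` and `FancyTemperature` are names invented here to show why the base class's __format__ is called explicitly.

```python
class Temperature:
    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        return format(self.degrees, spec) + ' C'

    def __formatb__(self, spec):
        # Name the class explicitly instead of using self.__format__, so
        # a subclass overriding __format__ cannot leak non-ASCII text.
        return Temperature.__format__(self, spec).encode('ascii')


class FancyTemperature(Temperature):
    def __format__(self, spec):
        return format(self.degrees, spec) + ' \u00b0C'  # degree sign


def bytes_format_field(obj, spec=''):
    """Hypothetical stand-in for what bytes.format() would do per field."""
    formatb = getattr(type(obj), '__formatb__', None)
    if formatb is None:
        raise TypeError('%s does not define __formatb__'
                        % type(obj).__name__)
    return formatb(obj, spec)


print(bytes_format_field(Temperature(21.5), '.1f'))
print(bytes_format_field(FancyTemperature(21.5), '.1f'))
```

With this design, whether formatting succeeds depends only on whether the type opts in, which matches the "failures must be due to types, not values" principle stated earlier in the thread.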
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 06:45 AM, Brett Cannon wrote: I also think that a 'b' conversion be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5 as {} will mean what is the most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want a byte and allows {!s} to mean you only want a string (and thus throw an error otherwise). Given that !b does not exist in Py2, !s (like %s) has to mean bytes when working with a byte stream. Given that, !s and !b would mean the same thing, so is it worth adding !b? -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 06:45 AM, Brett Cannon wrote: The PEP currently says:: format -- The format mini language will be used as-is, with the behaviors as listed for %-interpolation. That's too vague; % interpolation does not support other format operators in the same way as str.format() does. % interpolation has specific code to support %d, etc. But str.format() gets supported for {:d} not from special code but because e.g. float.__format__('d') works. So you can't say bytes.format() supports {:d} just like %d works with string interpolation since the mechanisms are fundamentally different. A question for anyone that has extensive experience in both %-formatting and .format-formatting: Would it be possible, at least for int and float, to take whatever is in the specifier and convert to %? Example: "Weight: {wgt:-07f}".format(wgt=137.23) would take the "-07f" and basically do a "%-07f" % 137.23 to get the ASCII to use? -- ~Ethan~
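One way to probe the question above is to feed the same specifier to both mechanisms and compare. This is only an illustrative sketch: spec_is_portable is a made-up helper, and a width of 12 is used instead of the 7 in the message, because 137.23 rendered with the f code is already wider than 7 characters and would hide the difference.

```python
value = 137.23

# In the format() mini-language '-' means "sign only for negatives" and '0' zero-pads.
new_style = '{wgt:-012f}'.format(wgt=value)
# In %-interpolation '-' means "left-justify", which overrides the '0' flag.
old_style = '%-012f' % value

print(repr(new_style))
print(repr(old_style))


def spec_is_portable(spec, value):
    """Return True if passing spec unchanged to % gives the same text as format()."""
    try:
        return format(value, spec) == ('%' + spec) % value
    except (ValueError, TypeError):
        return False


print(spec_is_portable('07.2f', value))
print(spec_is_portable('-012f', value))
```

So a literal copy of the specifier works for some specs but not all; a real translation layer would have to map flags with different meanings rather than pass them through.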
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Neil Schemenauer wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value. -- Greg
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote: Neil Schemenauer wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value. +1 -- Steven
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/15/2014 4:32 PM, Ethan Furman wrote: A question for anyone that has extensive experience in both %-formatting and .format-formatting: Would it be possible, at least for int and float, to take whatever is in the specifier and convert to %? Example: "Weight: {wgt:-07f}".format(wgt=137.23) would take the "-07f" and basically do a "%-07f" % 137.23 to get the ASCII to use? I think the int.__format__ version might be a superset. Specifically, the n and % types. There may well be others. But I think we could say we're not going to support these in b.format().
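The superset claim can be checked directly on str formatting. The sketch below only uses format() and the % operator; percent_supports is a hypothetical helper name, and the list of codes tried is an arbitrary sample rather than the full mini-language.

```python
# Presentation types accepted by format() that %-interpolation lacks.
print(format(0.25, '.0%'))    # percentage type
print(format(1234567, 'n'))   # locale-aware digits; output depends on the locale


def percent_supports(code, value):
    """Return True if %-interpolation accepts the conversion character `code`."""
    try:
        ('%' + code) % value
        return True
    except ValueError:  # raised as "unsupported format character"
        return False


shared = [code for code in 'dxoeEfgGn' if percent_supports(code, 1234567)]
print(shared)
```

The n code drops out of the shared list, which matches the suggestion that such types simply go unsupported in a bytes version.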
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 4:24 PM, Ethan Furman et...@stoneleaf.us wrote: On 01/15/2014 06:45 AM, Brett Cannon wrote: I also think that a 'b' conversion be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5 as {} will mean what is the most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want a byte and allows {!s} to mean you only want a string (and thus throw an error otherwise). Given that !b does not exist in Py2, !s (like %s) has to mean bytes when working with a byte stream. Given that, !s and !b would mean the same thing, so is it worth adding !b? I disagree with the assertion. %s has to mean bytes for Python 2 compatibility because there is no equivalent to '{}' (no conversion or format spec specified); basically %s represents no conversion for the % operator. But since format() has the concept of a default conversion as well as explicit conversions you can lean on that fact and let the default conversion do what makes sense for that version of Python.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info wrote: On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote: Neil Schemenauer wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value. +1 If we are going the route of a new magic method then __ascii__ or __bytes_format__ get my vote as long as they only return bytes (I see no need to abbreviate to __bformat__ or __formatb__ when we have method names as long as __text_signature__ now).
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 08:33 AM, Mark Lawrence wrote: For completeness I believe %r and %a should be included here as well. Good point. Done. -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 15/01/2014 22:22, Brett Cannon wrote: On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info wrote: On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote: Neil Schemenauer wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value. +1 If we are going the route of a new magic method then __ascii__ or __bytes_format__ get my vote as long as they only return bytes (I see no need to abbreviate to __bformat__ or __formatb__ when we have method names as long as __text_signature__ now). __bytes_format__ gets my vote as it's blatantly obvious what it does. I'm against __ascii__ as I'd automatically associate that with ascii in the same way that I associate str with __str__ and repr with __repr__. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 10:34:48PM +, Mark Lawrence wrote: On 15/01/2014 22:22, Brett Cannon wrote: On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info wrote: On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote: Neil Schemenauer wrote: Objects that implement __str__ can also implement __bytes__ if they can guarantee that ASCII characters are always returned, I think __ascii__ would be a better name. I'd expect a method called __bytes__ on an int to return some version of its binary value. +1 If we are going the route of a new magic method then __ascii__ or __bytes_format__ get my vote as long as they only return bytes (I see no need to abbreviate to __bformat__ or __formatb__ when we have method names as long as __text_signature__ now). __bytes_format__ gets my vote as it's blatantly obvious what it does. What precisely does it do? If it's so obvious, why is this thread so long? I'm against __ascii__ as I'd automatically associate that with ascii in the same way that I associate str with __str__ and repr with __repr__. That's a good point. I forgot about ascii(). -- Steven
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/15/2014 4:03 PM, Steven D'Aprano wrote: What precisely does it do? If it's so obvious, why is this thread so long? It produces a formatted representation of the object in bytes. For numbers, that would probably be expected to be ASCII digits and punctuation. But other items are not as obvious. bytes would probably be expected not to have a __bytes_format__, but if a subclass defined one, it might be HEX or Base64 of the base bytes. Or if the subclass is ASCII text oriented, it might be the ASCII text version of the base bytes (which would be identical to the base bytes, except for the type transformation). str would probably be expected not to have a __bytes_format__, but if a subclass defined one, it might be HEX or Base64, or it might be a specific encoding of the base str. Other objects might generate an ASCII __repr__, if they define the method. It took a lot of talk to reach the conclusion, if it has been reached, that none of the solutions are general enough without defining something like __bytes_format__. And before that, a lot of talk to decide that % interpolation already had an ASCII bias.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Wed, Jan 15, 2014 at 05:46:07PM -0800, Glenn Linderman wrote: On 1/15/2014 4:03 PM, Steven D'Aprano wrote: What precisely does it do? If it's so obvious, why is this thread so long? It produces a formatted representation of the object in bytes. For numbers, that would probably be expected to be ASCII digits and punctuation. But other items are not as obvious. My point exactly. -- Steven
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Ethan Furman wrote: Well, I'm not sure what "booted into touch" means, It's a rugby term, referring to kicking the ball over the touch line. As a metaphor, it seems to mean making a problem go away. -- Greg
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/15/2014 06:45 AM, Brett Cannon wrote: This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.decode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free. It may work like this under the hood, but it's an implementation detail. Since the numeric format codes will call int, index, or float on the object (to handle subclasses), we could then call __format__ on the resulting int or float to do the heavy lifting; but since __format__ on anything else would never be called I don't want to give that impression. It also means if you pass in a string that you just want the strict ASCII bytes of then you can get it with {:s}. This isn't going to happen. If the user wants a string to be in the byte stream, it has to either be a bytes literal or explicitly encoded [1]. -- ~Ethan~ [1] Apologies if this has already been answered. I wanted to make sure I responded to all the ideas/objects, and I may have responded more than once to some. It's been a long few threads. ;)
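The "convert to a plain int or float first, then let __format__ do the heavy lifting" idea might be sketched as below. This is an assumption-laden illustration, not the PEP's implementation: format_number_ascii and the two code sets are invented names, and operator.index is used for the integer codes, which is stricter than int() would be.

```python
import operator
from fractions import Fraction

FLOAT_CODES = set('eEfFgG')
INT_CODES = set('dxXo')


def format_number_ascii(value, spec):
    """Format `value` with a numeric spec and return strict-ASCII bytes."""
    code = spec[-1:]
    if code in FLOAT_CODES:
        plain = float(value)            # goes through __float__
    elif code in INT_CODES:
        plain = operator.index(value)   # goes through __index__
    else:
        raise ValueError('unsupported numeric format code: %r' % code)
    # Only the built-in int or float __format__ ever runs, never the
    # original object's own __format__.
    return format(plain, spec).encode('ascii')


print(format_number_ascii(255, '04x'))
print(format_number_ascii(Fraction(1, 4), '.2f'))
```

Because the object's own __format__ is bypassed, a str argument is rejected by operator.index or float rather than silently encoded.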
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
Duh. Here's the text, as well. ;)

PEP: 461
Title: Adding % and {} formatting to bytes
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman et...@stoneleaf.us
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-01-13
Python-Version: 3.5
Post-History: 2014-01-13
Resolution:


Abstract
========

This PEP proposes adding the % and {} formatting operations from str to bytes.


Proposed semantics for bytes formatting
=======================================

%-interpolation
---------------

All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.) will be supported, and will work as they do for str, including the padding, justification and other related modifiers.

Example::

    >>> b'%4x' % 10
    b'   a'

%c will insert a single byte, either from an int in range(256), or from a bytes argument of length 1.

Example::

    >>> b'%c' % 48
    b'0'

    >>> b'%c' % b'a'
    b'a'

%s, because it is the most general, has the most convoluted resolution:

- input type is bytes? pass it straight through

- input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly)

- input type is something else? use its __bytes__ method; if there isn't one, raise an exception [3]

Examples::

    >>> b'%s' % b'abc'
    b'abc'

    >>> b'%s' % 3.14
    b'3.14'

    >>> b'%s' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: 'hello world' has no __bytes__ method, perhaps you need to encode it?

.. note::

   Because the str type does not have a __bytes__ method, attempts to directly use 'a string' as a bytes interpolation value will raise an exception. To use 'string' values, they must be encoded or otherwise transformed into a bytes sequence::

       'a string'.encode('latin-1')

format
------

The format mini language will be used as-is, with the behaviors as listed for %-interpolation.


Open Questions
==============

For %s there has been some discussion of trying to use the buffer protocol (Py_buffer) before trying __bytes__. This question should be answered before the PEP is implemented.


Proposed variations
===================

It has been suggested to use %b for bytes instead of %s.

- Rejected as %b does not exist in Python 2.x %-interpolation, which is why we are using %s.

It has been proposed to automatically use .encode('ascii','strict') for str arguments to %s.

- Rejected as this would lead to intermittent failures. Better to have the operation always fail so the trouble-spot can be correctly fixed.

It has been proposed to have %s return the ascii-encoded repr when the value is a str (b'%s' % 'abc' --> b"'abc'").

- Rejected as this would lead to hard to debug failures far from the problem site. Better to have the operation always fail so the trouble-spot can be easily fixed.


Footnotes
=========

.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__, or if there's any difference
.. [2] Any proper numeric class would then have to provide an ascii representation of its value, either via __repr__ or __str__ (whichever we choose in [1]).
.. [3] TypeError, ValueError, or UnicodeEncodeError?


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
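The %-interpolation examples in the draft can be tried on an interpreter that ships bytes interpolation; the sketch below assumes Python 3.5 or later. try_interpolate is a helper name made up for this illustration, and the numeric %s case is left out on purpose because its behaviour is still being debated in this thread.

```python
def try_interpolate(fmt, value):
    """Apply bytes %-interpolation; return the result, or the TypeError raised."""
    try:
        return fmt % value
    except TypeError as exc:
        return exc


print(try_interpolate(b'%4x', 10))            # numeric code with padding
print(try_interpolate(b'%c', 48))             # single byte from an int
print(try_interpolate(b'%c', b'a'))           # single byte from bytes of length 1
print(try_interpolate(b'%s', b'abc'))         # bytes pass straight through
print(try_interpolate(b'%s', 'hello world!').__class__.__name__)  # str is refused
```

The str case failing every time, rather than only for non-ASCII input, is the "always fail so the trouble-spot can be fixed" behaviour argued for under Proposed variations.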
[Python-Dev] PEP 461 - Adding % and {} formatting to bytes
This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant. -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? Regards Antoine.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman et...@stoneleaf.us wrote: This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant. Are we going down the PEP route with the various ideas? Guido, do you want one from me as well or should I not bother?
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/14/2014 01:05 PM, Brett Cannon wrote: On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman wrote: This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant. Are we going down the PEP route with the various ideas? Guido, do you want one from me as well or should I not bother? While I can't answer for Guido, I will say I authored this PEP because Antoine didn't want 460 to be any more liberal than it already was. If you collect your ideas together, I'll add them to 461 as questions or discussions or however is appropriate (assuming you're willing to go that route). -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/14/2014 12:57 PM, Antoine Pitrou wrote: On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. Obviously we have int, float, and complex. We also have Decimal. But what about Fraction? Or some user's numeric class that doesn't inherit from a core numeric type? Wherever we draw the line, we need to make sure it's well-documented. -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On January 14, 2014 at 4:36:00 PM, Ethan Furman (et...@stoneleaf.us) wrote: On 01/14/2014 12:57 PM, Antoine Pitrou wrote: On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. isinstance(o, numbers.Number) ? Yury
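For reference, here is how the isinstance(o, numbers.Number) test classifies the types mentioned in this exchange. Money is a hypothetical user class added to show the "doesn't inherit from a core numeric type" case; is_numeric is likewise just an illustrative wrapper.

```python
import numbers
from decimal import Decimal
from fractions import Fraction


class Money:
    """Hypothetical user numeric class that never registers with numbers.Number."""

    def __init__(self, cents):
        self.cents = cents

    def __str__(self):
        return '%d.%02d' % divmod(self.cents, 100)


def is_numeric(obj):
    return isinstance(obj, numbers.Number)


candidates = [1, 1.5, 1j, Decimal('1.5'), Fraction(3, 2), True, Money(150), '1']
for obj in candidates:
    print(type(obj).__name__, is_numeric(obj))
```

The test covers Decimal and Fraction but also bool, and it misses unregistered classes such as Money unless their authors call numbers.Number.register, so the line still has to be documented wherever it is drawn.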
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On Tue, 14 Jan 2014 13:07:57 -0800 Ethan Furman et...@stoneleaf.us wrote: Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? Yes, unless the implementation wants to micro-optimize stuff. - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. Obviously we have int, float, and complex. We also have Decimal. The question is also how do you test for them? Decimal is not a core builtin type. Do we need some kind of __bformat__ protocol? Regards Antoine.
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote: On 01/14/2014 12:57 PM, Antoine Pitrou wrote: On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? Yep. I actually suggest adjusting the %s handling to: - interpolate Py_buffer exporters directly - interpolate __bytes__ if defined - reject anything with an encode method - otherwise interpolate str(obj).encode("ascii") - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. As suggested above, I would flip the question and explicitly *disallow* implicit encoding of any object with its own encode method, while allowing everything else. Cheers, Nick. Obviously we have int, float, and complex. We also have Decimal. But what about Fraction? Or some user's numeric class that doesn't inherit from a core numeric type? Wherever we draw the line, we need to make sure it's well-documented. -- ~Ethan~
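The four-step %s resolution suggested above could be modelled in pure Python roughly as follows. This is a sketch of the proposal only, not of anything CPython does: interpolate_s is an invented name, and memoryview() stands in for "supports Py_buffer".

```python
def interpolate_s(obj):
    """Model of the suggested %s handling for bytes interpolation."""
    # 1. Interpolate buffer exporters directly.
    try:
        view = memoryview(obj)
    except TypeError:
        pass
    else:
        with view:
            return view.tobytes()
    # 2. Interpolate __bytes__ if defined (looked up on the type, like other
    #    special methods).
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is not None:
        return to_bytes(obj)
    # 3. Reject anything with an encode method, e.g. str.
    if hasattr(obj, 'encode'):
        raise TypeError('%s must be encoded explicitly' % type(obj).__name__)
    # 4. Otherwise fall back to the ASCII-encoded str().
    return str(obj).encode('ascii')


print(interpolate_s(bytearray(b'xy')))
print(interpolate_s(42))
print(interpolate_s([1, 2]))   # step 4 also lets containers through
```

The last call shows why the fallback in step 4 is so permissive: any object with a str() form, including a list, ends up in the byte stream.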
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/14/2014 02:17 PM, Nick Coghlan wrote: On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote: On 01/14/2014 12:57 PM, Antoine Pitrou wrote: On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? Yep. I actually suggest adjusting the %s handling to: - interpolate Py_buffer exporters directly - interpolate __bytes__ if defined - reject anything with an encode method - otherwise interpolate str(obj).encode("ascii") - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. As suggested above, I would flip the question and explicitly *disallow* implicit encoding of any object with its own encode method, while allowing everything else. Um, int and floats (for example) don't have an .encode method, don't export Py_buffer, don't have a __bytes__ method... Ah! so it would hit the last case, I see. The danger I see with that route is that any ol' object could then make it into the byte stream, and considering what byte streams are for I think we should make the barrier for entry higher than just relying on a __str__ or __repr__. -- ~Ethan~
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 15 Jan 2014 08:23, Ethan Furman et...@stoneleaf.us wrote: On 01/14/2014 02:17 PM, Nick Coghlan wrote: On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote: On 01/14/2014 12:57 PM, Antoine Pitrou wrote: On Tue, 14 Jan 2014 11:56:25 -0800 Ethan Furman et...@stoneleaf.us wrote: %s, because it is the most general, has the most convoluted resolution: - input type is bytes? pass it straight through It should try to get a Py_buffer instead. Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try? Sounds good. For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)? Yep. I actually suggest adjusting the %s handling to: - interpolate Py_buffer exporters directly - interpolate __bytes__ if defined - reject anything with an encode method - otherwise interpolate str(obj).encode("ascii") - input type is numeric? use its __xxx__ [1] [2] method and ascii-encode it (strictly) What is the definition of numeric? That is a key question. As suggested above, I would flip the question and explicitly *disallow* implicit encoding of any object with its own encode method, while allowing everything else. Um, int and floats (for example) don't have an .encode method, don't export Py_buffer, don't have a __bytes__ method... Ah! so it would hit the last case, I see. The danger I see with that route is that any ol' object could then make it into the byte stream, and considering what byte streams are for I think we should make the barrier for entry higher than just relying on a __str__ or __repr__. Yeah, reading the other thread pointed out the issues with this idea (containers in particular are a problem). I think Brett has the right idea: we shouldn't try to accept numbers for %s in binary interpolation. If we limit it to just buffer exporters and objects with a __bytes__ method then the problem goes away. The numeric codes all exist in Python 2, so the porting requirement to the common 2/3 subset will be to update the cases of binary interpolation of a number with %s to use an appropriate numeric formatting code instead. Cheers, Nick. -- ~Ethan~
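The porting step described above can be illustrated on an interpreter that has bytes interpolation (assumed here to be Python 3.5 or later). Header is a hypothetical class that opts in through __bytes__; the header names and values are made up for the example.

```python
class Header:
    """Hypothetical object that opts in to bytes interpolation via __bytes__."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __bytes__(self):
        return self.name + b': ' + self.value


content_length = 42

# Python 2 code could write  'Content-Length: %s\r\n' % content_length.
# Under the restricted rule the number needs a numeric code instead:
line = b'Content-Length: %d\r\n' % content_length

# Buffer exporters and objects with __bytes__ still work with %s.
host = b'%s\r\n' % Header(b'Host', b'example.org')
body = b'%s' % bytearray(b'payload')

try:
    b'%s' % content_length
except TypeError as exc:
    rejected = exc   # numbers are refused by %s

print(line, host, body)
```

Since %d exists in Python 2 as well, the rewritten line stays valid in the common 2/3 subset.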
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
I think of PEP 460 as the strict version and PEP 461 as the lenient version. I don't think it makes sense to have more variants. So please collaborate with whichever you like best. :-) On Tue, Jan 14, 2014 at 1:11 PM, Ethan Furman et...@stoneleaf.us wrote: On 01/14/2014 01:05 PM, Brett Cannon wrote: On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman wrote: This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant. Are we going down the PEP route with the various ideas? Guido, do you want one from me as well or should I not bother? While I can't answer for Guido, I will say I authored this PEP because Antoine didn't want 460 to be any more liberal than it already was. If you collect your ideas together, I'll add them to 461 as questions or discussions or however is appropriate (assuming you're willing to go that route). -- ~Ethan~ -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 14/01/2014 19:55, Ethan Furman wrote: This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion as to what is meant. -- ~Ethan~ Out of plain old curiosity do we have to consider PEP 292 string templates in any way, shape or form, or regarding this debate have they been safely booted into touch? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
15.01.14 00:40, Guido van Rossum wrote: I think of PEP 460 as the strict version and PEP 461 as the lenient version. I don't think it makes sense to have more variants. So please collaborate with whichever you like best. :-) Perhaps the consensus will be PEP 460.5? Or PEP 460.3, or maybe PEP 460.7?
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 1/14/2014 2:38 PM, Nick Coghlan wrote: I think Brett has the right idea: we shouldn't try to accept numbers for %s in binary interpolation. If we limit it to just buffer exporters and objects with a __bytes__ method then the problem goes away. The numeric codes all exist in Python 2, so the porting requirement to the common 2/3 subset will be to update the cases of binary interpolation of a number with %s to use an appropriate numeric formatting code instead. +1
Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes
On 01/14/2014 05:02 PM, Glenn Linderman wrote: On 1/14/2014 2:38 PM, Nick Coghlan wrote: I think Brett has the right idea: we shouldn't try to accept numbers for %s in binary interpolation. If we limit it to just buffer exporters and objects with a __bytes__ method then the problem goes away. The numeric codes all exist in Python 2, so the porting requirement to the common 2/3 subset will be to update the cases of binary interpolation of a number with %s to use an appropriate numeric formatting code instead. +1 Agreed, PEP updated. -- ~Ethan~