Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric Snow

On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
 For the first iteration of bytes.format(), I think we should just
 support the exact types of int, float, and bytes. It will call the
 type's __format__ (with the object as self) and encode the result to
 ASCII. For the stated use case of 2.x compatibility, I suspect this will
 cover > 90% of the uses in real code. If we find there are cases where
 real code needs additional types supported, we can consider adding
 __format_ascii__ (or whatever name we cook up).

+1

-eric
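
The restricted behaviour proposed in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions: `format_field_ascii` is a hypothetical helper name, not part of any Python release, and the thread does not say how a format spec applied to a bytes argument should behave, so the sketch simply rejects it.

```python
def format_field_ascii(value, spec=""):
    """Format one field the way the restricted proposal describes.

    Hypothetical helper: only the exact types int, float and bytes are
    accepted, so subclasses such as bool are rejected.
    """
    kind = type(value)
    if kind is bytes:
        # Assumption: bytes are inserted unchanged and take no format spec.
        if spec:
            raise ValueError("format spec not supported for bytes in this sketch")
        return value
    if kind is int or kind is float:
        # Call the type's own __format__ with the object as self,
        # then encode the resulting text to ASCII.
        return kind.__format__(value, spec).encode("ascii")
    raise TypeError("unsupported type: " + kind.__name__)


print(format_field_ascii(47, "d"))      # b'47'
print(format_field_ascii(3.5, ".2f"))   # b'3.50'
```

Because the check is on the exact type, `format_field_ascii(True)` raises TypeError even though bool is a subclass of int.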
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric Snow

On Thu, Jan 16, 2014 at 3:06 PM, Jan Kaliszewski z...@chopin.edu.pl wrote:
 I'd treat the format()+.__format__()+str.format()-ecosystem as
 a nice text-data-oriented, *complete* Py3k feature, backported to
 Python 2 to share the benefits of the feature with it as well as
 to make the 2-to-3 transition a bit easier.

 IMHO, the PEP-3101's note cited above just describes a workaround
 over the flaws of the Py2's obsolete text model.  Moving such
 complications into Py3k would make the feature (and especially the
 ability to implement your own .__format__()) harder to understand
 and make use of -- for little profit.

 Such a move is not needed for compatibility.  And, IMHO, the
 format()/__format__()/str.format()-matter is all about nice and
 flexible *text* formatting, not about binary data interpolation.

[disclaimer: I personally don't have many use cases for any bytes formatting.]

Yet there is still a strong symmetry between str and bytes that makes
bytes easier to use.  I don't always use formatting, but when I do I
use .format(). :)

never-been-a-fan-of-mod-formatting-ly yours,

-eric

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Nick Coghlan

On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
  For the first iteration of bytes.format(), I think we should just
  support the exact types of int, float, and bytes. It will call the
  type's __format__ (with the object as self) and encode the result to
  ASCII. For the stated use case of 2.x compatibility, I suspect this will
  cover > 90% of the uses in real code. If we find there are cases where
  real code needs additional types supported, we can consider adding
  __format_ascii__ (or whatever name we cook up).

 +1

Please don't make me learn the limitations of a new mini language without a
really good reason.

For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
restored roughly as described in PEP 461. *Given* that feature set, what is
the rationale for *adding* bytes.format? What new capabilities will it
provide that aren't already covered by printf-style interpolation directly
to bytes or text formatting followed by encoding the result?

Cheers,
Nick.
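
For reference, the two existing spellings mentioned above look like this. The sketch assumes a Python version in which PEP 461's %-interpolation for bytes is implemented (3.5 or later); the header name and value are made up for illustration.

```python
length = 47  # illustrative value

# printf-style interpolation directly into bytes (needs Python 3.5+).
direct = b"Content-Length: %d\r\n" % length

# Text formatting followed by an explicit encode.
via_text = "Content-Length: {:d}\r\n".format(length).encode("ascii")

assert direct == via_text == b"Content-Length: 47\r\n"
```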



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 1/17/2014 6:42 AM, Nick Coghlan wrote:
 
 On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
  For the first iteration of bytes.format(), I think we should just
  support the exact types of int, float, and bytes. It will call the
  type's __format__ (with the object as self) and encode the result to
  ASCII. For the stated use case of 2.x compatibility, I suspect this will
  cover > 90% of the uses in real code. If we find there are cases where
  real code needs additional types supported, we can consider adding
  __format_ascii__ (or whatever name we cook up).

 +1
 
 Please don't make me learn the limitations of a new mini language
 without a really good reason.
 
 For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
 restored roughly as described in PEP 461. *Given* that feature set, what
 is the rationale for *adding* bytes.format? What new capabilities will
 it provide that aren't already covered by printf-style interpolation
 directly to bytes or text formatting followed by encoding the result?

The only reason to add any of this, in my mind, is to ease porting of
2.x code. If my proposal covers most of the cases of b''.format() that
exist in 2.x code that wants to move to 3.5, then I think it's worth
doing. Is there any such code that's blocked from porting by the lack of
b''.format() that supports bytes, int, and float? I don't know. I
concede that it's unlikely.

IF this were a feature that we were going to add to 3.5 on its own
merits, I think we add __format_ascii__ and make the whole thing
extensible. Is there any new code that's blocked from being written by
missing b.format()? I don't know that, either.

Eric.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 01/17/2014 07:34 AM, Eric V. Smith wrote:
 On 1/17/2014 6:42 AM, Nick Coghlan wrote:

 On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
 For the first iteration of bytes.format(), I think we should just
 support the exact types of int, float, and bytes. It will call the
 type's __format__ (with the object as self) and encode the result to
 ASCII. For the stated use case of 2.x compatibility, I suspect this will
 cover > 90% of the uses in real code. If we find there are cases where
 real code needs additional types supported, we can consider adding
 __format_ascii__ (or whatever name we cook up).

 +1

 Please don't make me learn the limitations of a new mini language
 without a really good reason.

 For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
 restored roughly as described in PEP 461. *Given* that feature set, what
 is the rationale for *adding* bytes.format? What new capabilities will
 it provide that aren't already covered by printf-style interpolation
 directly to bytes or text formatting followed by encoding the result?
 
 The only reason to add any of this, in my mind, is to ease porting of
 2.x code. If my proposal covers most of the cases of b''.format() that
 exist in 2.x code that wants to move to 3.5, then I think it's worth
 doing. Is there any such code that's blocked from porting by the lack of
 b''.format() that supports bytes, int, and float? I don't know. I
 concede that it's unlikely.
 
 IF this were a feature that we were going to add to 3.5 on its own
 merits, I think we add __format_ascii__ and make the whole thing
 extensible. Is there any new code that's blocked from being written by
 missing b.format()? I don't know that, either.

Following up, I think this leaves us with 3 choices:

1. Do not implement bytes.format(). We tell any 2.x code that's written
to use str.format() to switch to %-formatting for their common code base.

2. Add the simplistic version of bytes.format() that I describe above,
restricted to accepting bytes, int, and float (and no subclasses). Some
2.x code will work, some will need to change to %-formatting.

3. Add bytes.format() and the __format_ascii__ protocol. We might want
to also add a format_ascii() builtin, to match __format__ and format().
This would require the least change to 2.x code that uses str.format()
and wants to move to bytes.format(), but would require some work on the
3.x side.

I'd advocate 1 or 2.

Eric.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Mark Lawrence


On 17/01/2014 14:50, Eric V. Smith wrote:
> On 01/17/2014 07:34 AM, Eric V. Smith wrote:
>> On 1/17/2014 6:42 AM, Nick Coghlan wrote:
>>> On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
>>>> On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
>>>>> For the first iteration of bytes.format(), I think we should just
>>>>> support the exact types of int, float, and bytes. It will call the
>>>>> type's __format__ (with the object as self) and encode the result to
>>>>> ASCII. For the stated use case of 2.x compatibility, I suspect this will
>>>>> cover > 90% of the uses in real code. If we find there are cases where
>>>>> real code needs additional types supported, we can consider adding
>>>>> __format_ascii__ (or whatever name we cook up).
>>>>
>>>> +1
>>>
>>> Please don't make me learn the limitations of a new mini language
>>> without a really good reason.
>>>
>>> For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
>>> restored roughly as described in PEP 461. *Given* that feature set, what
>>> is the rationale for *adding* bytes.format? What new capabilities will
>>> it provide that aren't already covered by printf-style interpolation
>>> directly to bytes or text formatting followed by encoding the result?
>>
>> The only reason to add any of this, in my mind, is to ease porting of
>> 2.x code. If my proposal covers most of the cases of b''.format() that
>> exist in 2.x code that wants to move to 3.5, then I think it's worth
>> doing. Is there any such code that's blocked from porting by the lack of
>> b''.format() that supports bytes, int, and float? I don't know. I
>> concede that it's unlikely.
>>
>> IF this were a feature that we were going to add to 3.5 on its own
>> merits, I think we add __format_ascii__ and make the whole thing
>> extensible. Is there any new code that's blocked from being written by
>> missing b.format()? I don't know that, either.
>
> Following up, I think this leaves us with 3 choices:
>
> 1. Do not implement bytes.format(). We tell any 2.x code that's written
> to use str.format() to switch to %-formatting for their common code base.
>
> 2. Add the simplistic version of bytes.format() that I describe above,
> restricted to accepting bytes, int, and float (and no subclasses). Some
> 2.x code will work, some will need to change to %-formatting.
>
> 3. Add bytes.format() and the __format_ascii__ protocol. We might want
> to also add a format_ascii() builtin, to match __format__ and format().
> This would require the least change to 2.x code that uses str.format()
> and wants to move to bytes.format(), but would require some work on the
> 3.x side.
>
> I'd advocate 1 or 2.
>
> Eric.

For both options 1 and 2 surely you cannot be suggesting that after
people have written 2.x code to use format() as %f formatting is to be
deprecated, they now have to change the code back to the way they may
well have written it in the first place?


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 01/17/2014 10:15 AM, Mark Lawrence wrote:
 On 17/01/2014 14:50, Eric V. Smith wrote:
 On 01/17/2014 07:34 AM, Eric V. Smith wrote:
 On 1/17/2014 6:42 AM, Nick Coghlan wrote:

 On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
 For the first iteration of bytes.format(), I think we should just
 support the exact types of int, float, and bytes. It will call the
 type's __format__ (with the object as self) and encode the result to
 ASCII. For the stated use case of 2.x compatibility, I suspect this will
 cover > 90% of the uses in real code. If we find there are cases where
 real code needs additional types supported, we can consider adding
 __format_ascii__ (or whatever name we cook up).

 +1

 Please don't make me learn the limitations of a new mini language
 without a really good reason.

 For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
 restored roughly as described in PEP 461. *Given* that feature set, what
 is the rationale for *adding* bytes.format? What new capabilities will
 it provide that aren't already covered by printf-style interpolation
 directly to bytes or text formatting followed by encoding the result?

 The only reason to add any of this, in my mind, is to ease porting of
 2.x code. If my proposal covers most of the cases of b''.format() that
 exist in 2.x code that wants to move to 3.5, then I think it's worth
 doing. Is there any such code that's blocked from porting by the lack of
 b''.format() that supports bytes, int, and float? I don't know. I
 concede that it's unlikely.

 IF this were a feature that we were going to add to 3.5 on its own
 merits, I think we add __format_ascii__ and make the whole thing
 extensible. Is there any new code that's blocked from being written by
 missing b.format()? I don't know that, either.

 Following up, I think this leaves us with 3 choices:

 1. Do not implement bytes.format(). We tell any 2.x code that's written
 to use str.format() to switch to %-formatting for their common code base.

 2. Add the simplistic version of bytes.format() that I describe above,
 restricted to accepting bytes, int, and float (and no subclasses). Some
 2.x code will work, some will need to change to %-formatting.

 3. Add bytes.format() and the __format_ascii__ protocol. We might want
 to also add a format_ascii() builtin, to match __format__ and format().
 This would require the least change to 2.x code that uses str.format()
 and wants to move to bytes.format(), but would require some work on the
 3.x side.

 I'd advocate 1 or 2.

 Eric.

 
 For both options 1 and 2 surely you cannot be suggesting that after
 people have written 2.x code to use format() as %f formatting is to be
 deprecated, they now have to change the code back to the way they may
 well have written it in the first place?
 

That would be part of it, yes. Otherwise you need #3.

This is all assuming we've ruled out an option 4, because of the
exceptions raised depending on what __format__ does:

4. Add bytes.format(), have it convert the format specifier to str
(unicode), call __format__ and encode the result back to ASCII. Accept
that there will be data-driven exceptions depending on the result of the
__format__ call.

I'm open to other ideas.

Eric.
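
The data-driven failure mode of option 4 can be sketched as below. `format_via_str` and the `Temperature` class are hypothetical names used only for illustration, and the trailing `u` in the spec is an invented convention of this example. The point is that whether the call succeeds depends on what `__format__` happens to return at run time.

```python
def format_via_str(value, spec=b""):
    # Option 4 as described: decode the spec, call __format__ via format(),
    # then encode the text result back to ASCII.
    return format(value, spec.decode("ascii")).encode("ascii")


class Temperature:
    """Hypothetical type whose __format__ output is not always ASCII."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # A trailing "u" (invented for this sketch) asks for a degree sign.
        unit = "\u00b0C" if spec.endswith("u") else " C"
        return format(self.degrees, spec.rstrip("u")) + unit


print(format_via_str(21.5, b".1f"))                # plain float
print(format_via_str(Temperature(21.5), b".1f"))   # ASCII-only result

try:
    format_via_str(Temperature(21.5), b".1fu")     # degree sign is not ASCII
except UnicodeEncodeError as exc:
    print("data-driven failure:", exc.reason)
```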


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Ethan Furman


On 01/17/2014 07:15 AM, Mark Lawrence wrote:
>
> For both options 1 and 2 surely you cannot be suggesting that
> after people have written 2.x code to use format() as %f
> formatting is to be deprecated

%f formatting is not deprecated, and will not be in 3.x's lifetime.

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 01/17/2014 10:24 AM, Eric V. Smith wrote:
 On 01/17/2014 10:15 AM, Mark Lawrence wrote:
 On 17/01/2014 14:50, Eric V. Smith wrote:
 On 01/17/2014 07:34 AM, Eric V. Smith wrote:
 On 1/17/2014 6:42 AM, Nick Coghlan wrote:

 On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
 For the first iteration of bytes.format(), I think we should just
 support the exact types of int, float, and bytes. It will call the
 type's __format__ (with the object as self) and encode the result to
 ASCII. For the stated use case of 2.x compatibility, I suspect this will
 cover > 90% of the uses in real code. If we find there are cases where
 real code needs additional types supported, we can consider adding
 __format_ascii__ (or whatever name we cook up).

 +1

 Please don't make me learn the limitations of a new mini language
 without a really good reason.

 For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
 restored roughly as described in PEP 461. *Given* that feature set, what
 is the rationale for *adding* bytes.format? What new capabilities will
 it provide that aren't already covered by printf-style interpolation
 directly to bytes or text formatting followed by encoding the result?

 The only reason to add any of this, in my mind, is to ease porting of
 2.x code. If my proposal covers most of the cases of b''.format() that
 exist in 2.x code that wants to move to 3.5, then I think it's worth
 doing. Is there any such code that's blocked from porting by the lack of
 b''.format() that supports bytes, int, and float? I don't know. I
 concede that it's unlikely.

 IF this were a feature that we were going to add to 3.5 on its own
 merits, I think we add __format_ascii__ and make the whole thing
 extensible. Is there any new code that's blocked from being written by
 missing b.format()? I don't know that, either.

 Following up, I think this leaves us with 3 choices:

 1. Do not implement bytes.format(). We tell any 2.x code that's written
 to use str.format() to switch to %-formatting for their common code base.

 2. Add the simplistic version of bytes.format() that I describe above,
 restricted to accepting bytes, int, and float (and no subclasses). Some
 2.x code will work, some will need to change to %-formatting.

 3. Add bytes.format() and the __format_ascii__ protocol. We might want
 to also add a format_ascii() builtin, to match __format__ and format().
 This would require the least change to 2.x code that uses str.format()
 and wants to move to bytes.format(), but would require some work on the
 3.x side.

For #3, hopefully this additional work on the 3.x side would just be
to add, to each class where you already have a custom __format__ used
for b''.format(), code like:

    def __format_ascii__(self, fmt):
        return self.__format__(fmt.decode()).encode('ascii')

That is, we're pushing the possibility of having to deal with an
encoding exception off to the type, instead of having it live in
bytes.format().

And to agree with Ethan: %-formatting isn't deprecated.

Eric.
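
A slightly fuller sketch of option 3, assuming the protocol would work as outlined in this message. Neither `__format_ascii__` nor a `format_ascii()` builtin exists in any released Python; both are written here as ordinary user code, and `Celsius` is a made-up example type.

```python
class Celsius:
    """Made-up example type with a custom __format__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        return format(self.degrees, spec) + " C"

    def __format_ascii__(self, spec):
        # Same delegation as in the message: any encoding error is raised
        # here, inside the type, rather than in bytes.format().
        return self.__format__(spec.decode()).encode("ascii")


def format_ascii(value, spec=b""):
    """Stand-in for the suggested builtin (hypothetical)."""
    method = getattr(type(value), "__format_ascii__", None)
    if method is None:
        raise TypeError(type(value).__name__ + " does not support __format_ascii__")
    return method(value, spec)


print(format_ascii(Celsius(21.5), b".1f"))
```

Looking the method up on the type rather than the instance mirrors how `format()` finds `__format__`; that choice is an assumption of this sketch.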


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Mark Lawrence


On 17/01/2014 15:41, Ethan Furman wrote:
> On 01/17/2014 07:15 AM, Mark Lawrence wrote:
>>
>> For both options 1 and 2 surely you cannot be suggesting that
>> after people have written 2.x code to use format() as %f
>> formatting is to be deprecated
>
> %f formatting is not deprecated, and will not be in 3.x's lifetime.
>
> --
> ~Ethan~

I'm sorry, I got the above wrong, I should have said "was to be
deprecated" :(


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Brett Cannon

On Fri, Jan 17, 2014 at 9:50 AM, Eric V. Smith e...@trueblade.com wrote:

 On 01/17/2014 07:34 AM, Eric V. Smith wrote:
  On 1/17/2014 6:42 AM, Nick Coghlan wrote:
 
  On 17 Jan 2014 18:03, Eric Snow ericsnowcurren...@gmail.com wrote:
 
  On Thu, Jan 16, 2014 at 11:30 AM, Eric V. Smith e...@trueblade.com wrote:
  For the first iteration of bytes.format(), I think we should just
  support the exact types of int, float, and bytes. It will call the
  type's __format__ (with the object as self) and encode the result to
  ASCII. For the stated use case of 2.x compatibility, I suspect this will
  cover > 90% of the uses in real code. If we find there are cases where
  real code needs additional types supported, we can consider adding
  __format_ascii__ (or whatever name we cook up).
 
  +1
 
  Please don't make me learn the limitations of a new mini language
  without a really good reason.
 
  For the sake of argument, assume we have a Python 3.5 with bytes.__mod__
  restored roughly as described in PEP 461. *Given* that feature set, what
  is the rationale for *adding* bytes.format? What new capabilities will
  it provide that aren't already covered by printf-style interpolation
  directly to bytes or text formatting followed by encoding the result?
 
  The only reason to add any of this, in my mind, is to ease porting of
  2.x code. If my proposal covers most of the cases of b''.format() that
  exist in 2.x code that wants to move to 3.5, then I think it's worth
  doing. Is there any such code that's blocked from porting by the lack of
  b''.format() that supports bytes, int, and float? I don't know. I
  concede that it's unlikely.
 
  IF this were a feature that we were going to add to 3.5 on its own
  merits, I think we add __format_ascii__ and make the whole thing
  extensible. Is there any new code that's blocked from being written by
  missing b.format()? I don't know that, either.

 Following up, I think this leaves us with 3 choices:

 1. Do not implement bytes.format(). We tell any 2.x code that's written
 to use str.format() to switch to %-formatting for their common code base.


+1

I would rephrase it to "switch to %-formatting for bytes usage for their
common code base". If they are working with actual text then using
str.format() still works (and is actually nicer to use IMO). It actually
might make the str/bytes relationship even clearer, especially if we start
to promote that str.format() is for text and %-formatting is for bytes.



 2. Add the simplistic version of bytes.format() that I describe above,
 restricted to accepting bytes, int, and float (and no subclasses). Some
 2.x code will work, some will need to change to %-formatting.


-1

I am still not comfortable with the special-casing by type for
bytes.format().



 3. Add bytes.format() and the __format_ascii__ protocol. We might want
 to also add a format_ascii() builtin, to match __format__ and format().
 This would require the least change to 2.x code that uses str.format()
 and wants to move to bytes.format(), but would require some work on the
 3.x side.


+0

Would allow for easy porting and it's general enough, but I don't know if
working with bytes really requires this much beyond supporting the porting
story.

I'm still +1 on PEP 460 for bytes.format() as a nice way to simplify basic
bytes usage in Python 3, but if that's not accepted then I say just drop
bytes.format() entirely and let %-formatting be the way people do Python
2/3 bytes work (if they are not willing to build it up from scratch like
they already can do).

-Brett

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Paul Moore

On 17 January 2014 15:50, Eric V. Smith e...@trueblade.com wrote:
 For #3, hopefully this additional work on the 3.x side would just be
 to add, to each class where you already have a custom __format__ used
 for b''.format(), code like:

     def __format_ascii__(self, fmt):
         return self.__format__(fmt.decode()).encode('ascii')

For me, the big cost would seem to be in the necessary documentation,
explaining the new special method in the language reference,
explaining the 2 different forms of format() in the built in types
docs. And the conceptual overhead of another special method for people
to be aware of. If I implement my own number subclass, do I need to
implement __format_ascii__?

My gut feeling is that we simply don't implement format() for bytes. I
don't see sufficient benefit, if %-formatting is available.

Paul.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Barry Warsaw

On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:

> I would rephrase it to "switch to %-formatting for bytes usage for their
> common code base".

-1.  %-formatting is so neanderthal. :)

-Barry

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Nick Coghlan

On 18 Jan 2014 02:08, Paul Moore p.f.mo...@gmail.com wrote:

 On 17 January 2014 15:50, Eric V. Smith e...@trueblade.com wrote:
  For #3, hopefully this additional work on the 3.x side would just be
  to add, to each class where you already have a custom __format__ used
  for b''.format(), code like:
 
      def __format_ascii__(self, fmt):
          return self.__format__(fmt.decode()).encode('ascii')

 For me, the big cost would seem to be in the necessary documentation,
 explaining the new special method in the language reference,
 explaining the 2 different forms of format() in the built in types
 docs. And the conceptual overhead of another special method for people
 to be aware of. If I implement my own number subclass, do I need to
 implement __format_ascii__?

 My gut feeling is that we simply don't implement format() for bytes. I
 don't see sufficient benefit, if %-formatting is available.

Exactly, it's the documentation problem to explain "when would I recommend
using this over the alternatives?" that turns me off the idea of general
purpose bytes formatting. printf style covers the use cases we have
identified, and the code bases of immediate interest support 2.5 or earlier
and thus *must* be using printf-style formatting.

Add to that the fact that to maintain the Python 3 text model, we either
have to gut it to the point where it has very few of the benefits the text
version offers over printf-style formatting, or else we introduce a whole new
protocol for a feature that we consider so borderline that it took us six
Python 3 releases to add it back to the language.

By contrast, the following model is relatively easy to document:

* printf-style is low level and relatively inflexible, but available for
both text and for ASCII compatible segments in binary data. The %s
formatting code accepts arbitrary objects (using str) in text mode, but
only buffer exporters and objects with a __bytes__ method in binary mode.

* the format() method is high level and very flexible, but available only
for text - the result must be explicitly encoded to binary if that is needed.

Cheers,
Nick.
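
The binary-mode %s rule in the first bullet can be illustrated as follows. This assumes a Python version in which PEP 461 is implemented (3.5 or later); `Packet` is a hypothetical class.

```python
class Packet:
    """Hypothetical type that knows its own bytes representation."""

    def __bytes__(self):
        return b"\x01\x02"


# Buffer exporters and objects with __bytes__ are accepted by %s.
assert b"[%s]" % b"abc" == b"[abc]"
assert b"[%s]" % (memoryview(b"xy"),) == b"[xy]"
assert b"[%s]" % (Packet(),) == b"[\x01\x02]"

# Arbitrary objects, including str, are rejected in binary mode.
try:
    b"[%s]" % "text"
    rejected = False
except TypeError:
    rejected = True
assert rejected

# In text mode %s still accepts anything, via str().
assert "[%s]" % 3.5 == "[3.5]"
```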



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Ethan Furman


On 01/16/2014 11:47 PM, Steven D'Aprano wrote:
> On Thu, Jan 16, 2014 at 08:23:13AM -0800, Ethan Furman wrote:
>>
>> As I understand it, str.format will call the object's __format__.  So, for
>> example, if I say:
>>
>>    u'the value is: %d' % myNum(17)
>>
>> then it will be myNum.__format__ that gets called, not int.__format__;
>
> I seem to have missed something, because I am completely confused... Why
> are you talking about str.format and then show an example using % instead?

Sorry, PEP 46x fatigue.  :/

It should have been

    u'the value is {:d}'.format(myNum(17))

and yes I meant the str type.

> %d calls __str__, not __format__. This is in Python 3.3:
>
> py> class MyNum(int):
> ...     def __str__(self):
> ...         print("Calling MyNum.__str__")
> ...         return super().__str__()
> ...     def __format__(self, spec):
> ...         print("Calling MyNum.__format__")
> ...         return super().__format__(spec)
> ...
> py> n = MyNum(17)
> py> u"%d" % n
> Calling MyNum.__str__
> '17'

And that's a bug we fixed in 3.4:

Python 3.4.0b1 (default:172a6bfdd91b+, Jan  5 2014, 06:39:32)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
--> class myNum(int):
...   def __int__(self):
...     return 7
...   def __index__(self):
...     return 11
...   def __float__(self):
...     return 13.81727
...   def __str__(self):
...     print('__str__')
...     return '1'
...   def __repr__(self):
...     print('__repr__')
...     return '2'
...
--> '%d' % myNum()
'0'
--> '%f' % myNum()
'13.817270'

After all, consider:

    >>> '%d' % True
    '1'
    >>> '%s' % True
    'True'

So, in fact, on subclasses __str__ should *not* be called to get the
integer representation.  First we do a conversion to make sure we have an
int (or float, or ...), and then we call __str__ on our tried and trusted
genuine core type.

> The *worst* solution would be to completely ignore MyNum.__str__.
> That's a nasty violation of the Principle Of Least Surprise, and will
> lead to confusion (why isn't my class' __str__ method being called?)

Because you asked for a numeric representation, not a string representation [1].

--
~Ethan~


[1] for all the gory details, see:
http://bugs.python.org/issue18780
http://bugs.python.org/issue18738
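
The behaviour described in this message can be checked with a small example. It assumes Python 3.4 or later, where the fix referenced in the footnote applies; `MyNum` is an illustrative subclass.

```python
class MyNum(int):
    def __str__(self):
        return "surprise"


n = MyNum(17)

# %d asks for a numeric representation: the value is treated as an int
# and MyNum.__str__ is not consulted.
assert "%d" % n == "17"

# %s asks for a string representation, so MyNum.__str__ is used.
assert "%s" % n == "surprise"

# str.format with the d presentation type goes through __format__,
# which MyNum inherits unchanged from int.
assert "{:d}".format(n) == "17"
```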

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Brett Cannon

On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw ba...@python.org wrote:

 On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:

 I would rephrase it to "switch to %-formatting for bytes usage for their
 common code base".

 -1.  %-formatting is so neanderthal. :)


Very much so, which is why I'm willing to let it be bastardized in Python
3.5 for the sake of porting but not bytes.format(). =) I'm keeping format()
clean for my nieces and nephew to use; they can just turn their nose up at
%-formatting when they are old enough to program.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 01/17/2014 11:58 AM, Brett Cannon wrote:
 
 
 
 On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw ba...@python.org wrote:
 
 On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:
 
 I would rephrase it to switch to %-formatting for bytes usage for
 their
 common code base.
 
 -1.  %-formatting is so neanderthal. :)
 
 
 Very much so, which is why I'm willing to let it be bastardized in
 Python 3.5 for the sake of porting but not bytes.format(). =) I'm
 keeping format() clean for my nieces and nephew to use; they can just
 turn their nose up at %-formatting when they are old enough to program.

Given the problems with implementing it, I'm more than willing to drop
bytes.format() from PEP 461 (not that it's my PEP). But if we think that
%-formatting is neanderthal and will get dropped in the Python 4000
timeframe (that is, someday in the far future), then I think we should
have some advice to give to people who are writing new 3.x code for the
non-porting use-cases addressed by the PEP. I'm specifically thinking of
new code that wants to format some bytes for an on-the-wire ascii-like
protocol.

Is it:
  b'Content-Length: ' + str(47).encode('ascii')
or
  b'Content-Length: {}'.format(str(47).encode('ascii'))
or something better?

I think it will look like the above, or involve something like
bytes.format() and __format_ascii__. Or, maybe a library that just
supports a few types (say, bytes, int, and float!).

Eric.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Ethan Furman


On 01/17/2014 09:13 AM, Eric V. Smith wrote:

On 01/17/2014 11:58 AM, Brett Cannon wrote:

On Fri, Jan 17, 2014 at 11:16 AM, Barry Warsaw wrote:

On Jan 17, 2014, at 11:00 AM, Brett Cannon wrote:


I would rephrase it to switch to %-formatting for bytes usage for
their common code base.


-1.  %-formatting is so neanderthal. :)


Very much so, which is why I'm willing to let it be bastardized in
Python 3.5 for the sake of porting but not bytes.format(). =) I'm
keeping format() clean for my nieces and nephew to use; they can just
turn their nose up at %-formatting when they are old enough to program.


Given the problems with implementing it, I'm more than willing to drop
bytes.format() from PEP 461 (not that it's my PEP). But if we think that
%-formatting is neanderthal and will get dropped in the Python 4000
timeframe


I hope not!


 (that is, someday in the far future), then I think we should
have some advice to give to people who are writing new 3.x code for the
non-porting use-cases addressed by the PEP. I'm specifically thinking of
new code that wants to format some bytes for an on-the-wire ascii-like
protocol.


%-interpolation handles this use case well, format does not.


Is it:
   b'Content-Length: ' + str(47).encode('ascii')
or
   b'Content-Length: {}'.format(str(47).encode('ascii'))
or something better?


Ew.  Neither of those look better than

b'Content-Length: %d' % 47
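
For comparison, a short sketch of the two spellings that run on a Python where PEP 461 has landed (3.5 or later, which is an assumption about the reader's interpreter and not something available when this message was written). The variable names are illustrative only:

```python
length = 47

# Field-at-a-time encoding: convert the number to text, encode it,
# then concatenate with the bytes literal.
by_concat = b'Content-Length: ' + str(length).encode('ascii')

# %-interpolation directly on bytes (requires Python 3.5+).
by_percent = b'Content-Length: %d' % length
```

Both produce the same header bytes; the second simply avoids the explicit str round trip.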

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Glenn Linderman


On 1/17/2014 6:50 AM, Eric V. Smith wrote:

Following up, I think this leaves us with 3 choices:

1. Do not implement bytes.format(). We tell any 2.x code that's written
to use str.format() to switch to %-formatting for their common code base.

2. Add the simplistic version of bytes.format() that I describe above,
restricted to accepting bytes, int, and float (and no subclasses). Some
2.x code will work, some will need to change to %-formatting.

3. Add bytes.format() and the __format_ascii__ protocol. We might want
to also add a format_ascii() builtin, to match __format__ and format().
This would require the least change to 2.x code that uses str.format()
and wants to move to bytes.format(), but would require some work on the
3.x side.

I'd advocate 1 or 2.


Nice summary.

I'd advocate 1 or 3.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Glenn Linderman


On 1/17/2014 7:15 AM, Mark Lawrence wrote:
For both options 1 and 2 surely you cannot be suggesting that after 
people have written 2.x code to use format() as %f formatting is to be 
deprecated, they now have to change the code back to the way they may 
well have written it in the first place?


If they are committed to format(), another option is to operate in the 
Unicode domain, and encode at the end.
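
One way to read that suggestion, as a sketch: do all formatting on text with str.format() and encode the finished result once. The helper name `build_header_block` and the header values are hypothetical:

```python
def build_header_block(fields):
    """Format header lines as text, then encode the whole block once."""
    lines = ['{}: {}'.format(name, value) for name, value in fields]
    text = '\r\n'.join(lines) + '\r\n\r\n'
    # A single encode at the boundary; non-ASCII content would raise
    # UnicodeEncodeError here rather than slip into the byte stream.
    return text.encode('ascii')


block = build_header_block([
    ('Content-Type', 'text/plain'),
    ('Content-Length', 47),
])
```

This keeps format() entirely in the text domain, at the cost of not being able to interpolate arbitrary binary payloads.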

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Eric V. Smith

On 01/17/2014 02:04 PM, Glenn Linderman wrote:
 On 1/17/2014 7:15 AM, Mark Lawrence wrote:
 For both options 1 and 2 surely you cannot be suggesting that after
 people have written 2.x code to use format() as %f formatting is to be
 deprecated, they now have to change the code back to the way they may
 well have written it in the first place?
 
 If they are committed to format(), another option is to operate in the
 Unicode domain, and encode at the end.

Maybe that's the best advice to give. It's better than my earlier
example of field-at-a-time encoding.

Eric.


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Terry Reedy


On 1/17/2014 10:15 AM, Mark Lawrence wrote:


For both options 1 and 2 surely you cannot be suggesting that after
people have written 2.x code to use format() as %f formatting is to be
deprecated,


It will not be for at least a decade.


they now have to change the code back to the way they may
well have written it in the first place?


I would suggest that people simply .encode the result if bytes are 
needed in 3.x as well as 2.x. Polyglot code will likely have a 'py3' 
boolean already to make the encoding conditional.
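
A sketch of what such polyglot code might look like. The flag name `PY3` and the helper `status_line` are assumptions made for this example:

```python
import sys

PY3 = sys.version_info[0] >= 3  # the kind of 'py3' boolean described above


def status_line(code, reason):
    """Return an HTTP status line as a byte string on both 2.x and 3.x."""
    line = 'HTTP/1.1 {} {}\r\n'.format(code, reason)
    # On 2.x a plain string literal is already a byte string, so only
    # 3.x needs the explicit encode step.
    return line.encode('ascii') if PY3 else line


line = status_line(200, 'OK')
```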


--
Terry Jan Reedy


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Terry Reedy


Responding to two posts at once, as I consider them

On 1/17/2014 11:00 AM, Brett Cannon wrote:


I would rephrase it to switch to %-formatting for bytes usage for their
common code base. If they are working with actual text then using
str.format() still works (and is actually nicer to use IMO). It actually
might make the str/bytes relationship even clearer, especially if we
start to promote that str.format() is for text and %-formatting is for
bytes.


Good idea, I think: printf % formatting was invented for formatting 
ascii text in bytestrings as it was being output (although sprintf 
allowed not-output). In retrospect, I think we should have introduced 
unicode.format when unicode was introduced in 2.0 and perhaps never have 
had unicode % formatting. Or we should have dropped str % instead of 
bytes % in 3.0.


On 1/17/2014 12:13 PM, Eric V. Smith wrote:
 But if we think that %-formatting is neanderthal and will get dropped 
 in the Python 4000 timeframe (that is, someday in the far future),


Some people, such as Martin Loewis, have a different opinion of 
%-formatting and will fight deprecating it *ever*. (I suspect that 
%-format opinions are influenced by one's current relation to C.)


 then I think we should have some advice to give to people who are
 writing new 3.x code for the non-porting use-cases addressed by the
 PEP. I'm specifically thinking of new code that wants to format some 
 bytes for an on-the-wire ascii-like protocol.


If we add %-formatting back in 3.5 for its original purpose, formatting 
ascii in bytes for output, I think we should drop the idea of later 
deprecating it (a few releases later) for that purpose. I think the PEP 
should even say so, that bytes % will remain indefinitely even if str % 
were to be dropped in favor of str.format.


I would consider dropping unicode(now string).__mod__ in favor of 
.format to still be an eventual option, especially if someone were to 
write a converter.


--
Terry Jan Reedy


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-17 Thread Nick Coghlan

On 18 Jan 2014 06:19, Terry Reedy tjre...@udel.edu wrote:

 On 1/17/2014 10:15 AM, Mark Lawrence wrote:

 For both options 1 and 2 surely you cannot be suggesting that after
 people have written 2.x code to use format() as %f formatting is to be
 deprecated,


 It will not be for at least a decade.

It will not be deprecated, period. Originally, we thought that the
introduction of the new flexible text formatting system made printf-style
formatting redundant.

After running both in parallel for a while, we learned we were wrong:

- it's far more difficult than we originally anticipated to migrate away
from it to the new text formatting system
- in particular, the lazy interpolation support in the logging module (and
similar systems) has no reasonable migration path
- two different core interpolation systems make it much easier to
interpolate into format strings
- it's a better fit for code which needs to semantically align with C
- it's a useful micro-optimisation
- as the current discussion shows, it's much better suited to the
interpolation of ASCII compatible segments in binary data formats
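
The lazy interpolation point can be illustrated with the standard logging module: the %-style arguments are only merged into the message if a record is actually emitted. This is a sketch; the logger name and the `Expensive` class are invented for the example:

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('pep461.demo')   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False                    # keep the demo self-contained
logger.addHandler(logging.StreamHandler(stream))


class Expensive:
    """Counts how often its string form is actually computed."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return 'expensive'


# Below the logger's level: no record is built, so __str__ never runs.
logger.debug('value: %s', Expensive())

# At the logger's level: interpolation happens when the handler formats.
logger.info('value: %s', Expensive())
```

Code that pre-formats the message with str.format() before calling the logger pays the formatting cost on every call, which is why this API has no straightforward migration path.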

Do many of the core devs strongly prefer the new formatting system? Yes.
Were we originally planning to deprecate and remove the printf-style
formatting system? Yes. Are there still any plans to do so? No. That's why
we rewrote the relevant docs to always describe it as "mod formatting" or
"printf-style formatting", rather than "legacy" or "old-style". If there
are any instances (or even implications) of the latter left in the official
docs, that's a bug to be fixed.

Perhaps this needs to be a new Q in my Python 3 Q&A, since a lot of people
still seem to have the wrong idea...

Regards,
Nick.



 they now have to change the code back to the way they may
 well have written it in the first place?


 I would suggest that people simply .encode the result if bytes are needed
in 3.x as well as 2.x. Polyglot code will likely have a 'py3' boolean
already to make the encoding conditional.

 --
 Terry Jan Reedy



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Nick Coghlan

On 16 Jan 2014 17:53, Ethan Furman et...@stoneleaf.us wrote:

 On 01/15/2014 06:45 AM, Brett Cannon wrote:


 This is why I have argued that if you specify it as if there is a
format spec specified, then the return value from
 calling __format__() will have str.decode('ascii', 'strict') called on
it you get the support for the various
 number-specific format specs for free.


 It may work like this under the hood, but it's an implementation detail.
 Since the numeric format codes will call int, index, or float on the
object (to handle subclasses), we could then call __format__ on the
resulting int or float to do the heavy lifting; but since __format__ on
anything else would never be called I don't want to give that impression.

I have a different proposal: let's *just* add mod formatting to bytes, and
leave the extensible formatting system as a text only operation.

We don't really care if bytes supports that method for version
compatibility purposes, and the deliberate flexibility of the design makes
it hard to translate into the binary domain.

So let's just not provide that - let's accept that, for the binary domain,
printf style formatting is just a better fit for the job :)
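
For readers on a later interpreter, a sketch of how this division looks in practice, assuming Python 3.5 or newer (where bytes gained %-interpolation). The netstring-style framing is just an illustrative payload:

```python
payload = b'abc'

# %d for numbers and %b for bytes-like values work on a bytes template.
framed = b'%d:%b,' % (len(payload), payload)

# A str is not accepted as-is; it has to be encoded explicitly first.
try:
    b'%b' % 'text'
except TypeError:
    str_rejected = True
else:
    str_rejected = False

# The extensible format() machinery stays text-only.
has_bytes_format = hasattr(bytes, 'format')
```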

Cheers,
Nick.



 It also means if you pass in a string that you just want the strict
ASCII bytes
 of then you can get it with {:s}.


 This isn't going to happen.  If the user wants a string to be in the byte
stream, it has to either be a bytes literal or explicitly encoded [1].

 --
 ~Ethan~

 [1] Apologies if this has already been answered.  I wanted to make sure I
responded to all the ideas/objects, and I may have responded more than once
to some.  It's been a long few threads.  ;)


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Greg Ewing


Nick Coghlan wrote:


I have a different proposal: let's *just* add mod formatting to bytes, 
and leave the extensible formatting system as a text only operation.


+1

--
Greg

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman et...@stoneleaf.us wrote:

 On 01/15/2014 06:45 AM, Brett Cannon wrote:


 This is why I have argued that if you specify it as if there is a format
 spec specified, then the return value from
 calling __format__() will have str.decode('ascii', 'strict') called on
 it you get the support for the various
 number-specific format specs for free.


 It may work like this under the hood, but it's an implementation detail.


I'm arguing it's not an implementation detail but a definition of how
bytes.format() would work.


  Since the numeric format codes will call int, index, or float on the
 object (to handle subclasses),


But that's **only** because the numeric types choose to as part of their
__format__() implementation; it is not inherent to str.format().


 we could then call __format__ on the resulting int or float to do the
 heavy lifting;


It's not just the heavy lifting; it does **all** the lifting for format
specifications.


 but since __format__ on anything else would never be called I don't want
 to give that impression.


Fine, if you're worried about bytes.format() overstepping by implicitly
calling str.encode() on the return value of __format__() then you will need
__bytes__format__() to get equivalent support.

-Brett



  It also means if you pass in a string that you just want the strict ASCII
 bytes
 of then you can get it with {:s}.


 This isn't going to happen.  If the user wants a string to be in the byte
 stream, it has to either be a bytes literal or explicitly encoded [1].

 --
 ~Ethan~

 [1] Apologies if this has already been answered.  I wanted to make sure I
 responded to all the ideas/objects, and I may have responded more than once
 to some.  It's been a long few threads.  ;)


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 4:56 AM, Nick Coghlan ncogh...@gmail.com wrote:


 On 16 Jan 2014 17:53, Ethan Furman et...@stoneleaf.us wrote:
 
  On 01/15/2014 06:45 AM, Brett Cannon wrote:
 
 
  This is why I have argued that if you specify it as if there is a
 format spec specified, then the return value from
  calling __format__() will have str.decode('ascii', 'strict') called on
 it you get the support for the various
  number-specific format specs for free.
 
 
  It may work like this under the hood, but it's an implementation detail.
  Since the numeric format codes will call int, index, or float on the
 object (to handle subclasses), we could then call __format__ on the
 resulting int or float to do the heavy lifting; but since __format__ on
 anything else would never be called I don't want to give that impression.

 I have a different proposal: let's *just* add mod formatting to bytes, and
 leave the extensible formatting system as a text only operation.

 We don't really care if bytes supports that method for version
 compatibility purposes, and the deliberate flexibility of the design makes
 it hard to translate into the binary domain.

 So let's just not provide that - let's accept that, for the binary domain,
 printf style formatting is just a better fit for the job :)


Or PEP 460 for bytes.format() and PEP 461 for %.

-Brett


 Cheers,
 Nick.

 
 
  It also means if you pass in a string that you just want the strict
 ASCII bytes
  of then you can get it with {:s}.
 
 
  This isn't going to happen.  If the user wants a string to be in the
 byte stream, it has to either be a bytes literal or explicitly encoded [1].
 
  --
  ~Ethan~
 
  [1] Apologies if this has already been answered.  I wanted to make sure
 I responded to all the ideas/objects, and I may have responded more than
 once to some.  It's been a long few threads.  ;)
 

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Neil Schemenauer

Greg Ewing greg.ew...@canterbury.ac.nz wrote:
 Neil Schemenauer wrote:
 Objects that implement __str__ can also implement __bytes__ if they
 can guarantee that ASCII characters are always returned,

 I think __ascii__ would be a better name. I'd expect
 a method called __bytes__ on an int to return some
 version of its binary value.

I realize now we can't use __bytes__.  Currently, passing an int
to bytes() causes it to construct an object with that many null
bytes.

If we are going to support format() (I'm not convinced it is necessary
and could easily be added in a later version), then we need an
equivalent to __format__.  My vote is either:

    def __formatascii__(self, spec):
        ...

or

    def __ascii__(self, spec):
        ...

Previously I was thinking of __bformat__ or __formatb__ but having
ascii in the method name is a great reminder.

Objects with a natural arbitrary byte representation can implement
__bytes__ and %s should use that if it exists.
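
A small sketch of both observations, assuming Python 3.5 or later for the last line (bytes %-interpolation did not exist when this was written). The `Packet` class is hypothetical:

```python
class Packet:
    """Hypothetical object with a natural byte representation."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return b'\x02' + self.payload + b'\x03'


# An int passed to bytes() is treated as a length, not as a value.
null_bytes = bytes(3)

# Getting the ASCII digits of a number needs an explicit step.
digits = str(3).encode('ascii')

# %s on a bytes template picks up __bytes__ when the object defines it.
framed = b'<%s>' % Packet(b'hi')
```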

  Neil


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman

On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote:
 Fine, if you're worried about bytes.format() overstepping by implicitly
 calling str.encode() on the return value of __format__() then you will need
 __bytes__format__() to get equivalent support.

Could we just re-use PEP-3101's note (easily updated for Python 3):

Note for Python 2.x: The 'format_spec' argument will be either
a string object or a unicode object, depending on the type of the
original format string.  The __format__ method should test the type
of the specifiers parameter to determine whether to return a string or
unicode object.  It is the responsibility of the __format__ method
to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return
bytes. For such cases on objects that cannot support bytes (i.e. for
str), it can raise. This appears to avoid the need for additional
methods. (As does Nick's proposal of leaving it out for now.)
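
A sketch of what a type following this convention might look like. Nothing here is an existing Python protocol; the `Temperature` class and its "bytes spec in, bytes out" behaviour are assumptions that only illustrate the idea:

```python
class Temperature:
    """Hypothetical type whose __format__ dispatches on the spec's type."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        if isinstance(format_spec, bytes):
            # Assumed convention from the proposal: bytes in, bytes out.
            text = format(self.degrees, format_spec.decode('ascii'))
            return text.encode('ascii') + b' C'
        return format(self.degrees, format_spec) + ' C'


reading = Temperature(21.456)

as_text = format(reading, '.1f')

# The built-in format() only accepts str specs, so the bytes case has to
# call the method directly in this sketch.
as_bytes = type(reading).__format__(reading, b'.1f')
```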

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 11:33 AM, Michael Urman mur...@gmail.com wrote:

 On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote:
  Fine, if you're worried about bytes.format() overstepping by implicitly
  calling str.encode() on the return value of __format__() then you will
 need
  __bytes__format__() to get equivalent support.

 Could we just re-use PEP-3101's note (easily updated for Python 3):

 Note for Python 2.x: The 'format_spec' argument will be either
 a string object or a unicode object, depending on the type of the
 original format string.  The __format__ method should test the type
 of the specifiers parameter to determine whether to return a string or
 unicode object.  It is the responsibility of the __format__ method
 to return an object of the proper type.

 If __format__ receives a format_spec of type bytes, it should return
 bytes. For such cases on objects that cannot support bytes (i.e. for
 str), it can raise. This appears to avoid the need for additional
 methods. (As does Nick's proposal of leaving it out for now.)


That's a very good catch, Michael! I think that makes sense if there is
precedent. Unfortunately that bit from the PEP never made it into the
documentation so I'm not sure if there is a backwards-compatibility worry.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Ethan Furman


On 01/16/2014 06:45 AM, Brett Cannon wrote:

On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:

On 01/15/2014 06:45 AM, Brett Cannon wrote:


This is why I have argued that if you specify it as
 if there is a format spec specified, then the return
value from calling __format__() will have
 str.decode('ascii', 'strict') called on it you get
the support for the various number-specific format
 specs for free.



Since the numeric format codes will call int, index,
 or float on the object (to handle subclasses),


But that's **only** because the numeric types choose
 to as part of their __format__() implementation; it is
not inherent to str.format().


As I understand it, str.format will call the object's __format__.  So, for 
example, if I say:

  u'the value is: %d' % myNum(17)

then it will be myNum.__format__ that gets called, not int.__format__; this is precisely what we don't want, since we can't 
know that myNum is only going to return ASCII characters.


This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format 
code; so the above example would have bytes.__format__ calling int() on myNum(17), at which point we either have an int 
type or an exception was raised because myNum isn't really an integer.  Once we have an int, whose format we know and 
trust, then we can call its __format__ and proceed from there.
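
The convert-first idea can be sketched in plain Python. The function name `interpolate_number`, the `CONVERTERS` table and this `MyNum` class are all hypothetical, and only two format codes are covered:

```python
class MyNum(int):
    """Hypothetical subclass whose own __format__ cannot be trusted."""

    def __format__(self, format_spec):
        return '\u2460\u2466'   # circled digits: not ASCII


# Which core type each numeric format code converts to first.
CONVERTERS = {'d': int, 'f': float}


def interpolate_number(value, code):
    """Convert to a genuine int or float, then format and encode that."""
    core = CONVERTERS[code](value)   # raises if value is not numeric
    # core is now an exact int or float, so its __format__ is the
    # built-in one and the result is known to be ASCII.
    return format(core, code).encode('ascii')


as_int = interpolate_number(MyNum(17), 'd')
as_float = interpolate_number(MyNum(17), 'f')
```

The subclass's own `__format__` is bypassed entirely, which is exactly the trade-off discussed in this message.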


On the flip side, if myNum does define its own __format__, it will not be called by bytes.format, and perhaps that is 
another good reason for bytes to only support %-interpolation and not format?


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Neil Schemenauer

Michael Urman mur...@gmail.com wrote:
 If __format__ receives a format_spec of type bytes, it should return
 bytes. For such cases on objects that cannot support bytes (i.e. for
 str), it can raise. This appears to avoid the need for additional
 methods. (As does Nick's proposal of leaving it out for now.)

That's an interesting idea.  I proposed __ascii__ as an analogous
method to __format__ for bytes formatting and to have
%-interpolation use it.  However, overloading __format__ based on
the type of the argument could work.

I see with Python 3:

>>> (1).__format__(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes

A TypeError exception is what we want if the object does not support
bytes formatting.  Some possible problems:

- It could be hard to provide a helpful exception message since it
  is generated inside the __format__ method rather than inside the
  bytes.__mod__ method (in the case of a missing __ascii__ method).
  The most common error will be using a str object and so we could
  modify the __format__ method of str to provide a nice hint (use
  encode()).

- Is there some risk that an object will unwittingly implement a
  __format__ method that unintentionally accepts a bytes argument?
  That requires some investigation.


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman

On Thu, Jan 16, 2014 at 11:13 AM, Neil Schemenauer n...@arctrix.com wrote:
 A TypeError exception is what we want if the object does not support
 bytes formatting.  Some possible problems:

 - It could be hard to provide a helpful exception message since it
   is generated inside the __format__ method rather than inside the
   bytes.__mod__ method (in the case of a missing __ascii__ method).
   The most common error will be using a str object and so we could
   modify the __format__ method of str to provide a nice hint (use
   encode()).

The various format functions could certainly intercept and wrap
exceptions raised by __format__ methods. Once the core types were
modified to expect bytes in format_spec, however, this may not be
critical; __format__ methods which delegate would work as expected,
str could certainly be clear about why it raised, and custom
implementations would be handled per comments I'll make on your second
point. Overall I suspect this is no worse than unhandled values in the
format_spec are today.

 - Is there some risk that an object will unwittingly implement a
   __format__ method that unintentionally accepts a bytes argument?
   That requires some investigation.

Agreed. Some quick armchair calculations suggest to me that there are
three likely outcomes:
 - Properly handle the type (perhaps written with the 2.x clause in mind)
 - Raise an exception internally (perhaps ValueError, such as from
format(3, 'q'))
 - Mishandle and return a str (perhaps due to if/else defaulting)
The first and second outcome may well reflect what we want, and the
third could easily be detected and turned into an exception by the
format functions.
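
A sketch of how a formatting function could police the third outcome. The wrapper `format_as_bytes` and the `Sloppy` class are hypothetical, and the "bytes spec in, bytes out" rule is the proposal under discussion, not existing behaviour:

```python
def format_as_bytes(obj, format_spec):
    """Hypothetical wrapper enforcing 'bytes spec in, bytes out'."""
    result = type(obj).__format__(obj, format_spec)
    if not isinstance(result, bytes):
        raise TypeError('__format__ returned %s, expected bytes'
                        % type(result).__name__)
    return result


class Sloppy:
    """Hypothetical type that ignores the spec's type (third outcome)."""

    def __format__(self, format_spec):
        return 'sloppy'


def raises_type_error(obj):
    try:
        format_as_bytes(obj, b'')
    except TypeError:
        return True
    return False


# The mishandled str return is caught by the wrapper...
sloppy_caught = raises_type_error(Sloppy())
# ...while int already rejects a bytes spec on its own (second outcome).
int_caught = raises_type_error(1)
```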

I'm uncertain whether this reflects all the scenarios we would care about.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Glenn Linderman


On 1/16/2014 8:41 AM, Brett Cannon wrote:
That's a very good catch, Michael! I think that makes sense if there 
is precedent. Unfortunately that bit from the PEP never made it into 
the documentation so I'm not sure if there is a 
backwards-compatibility worry.


No.  If __format__ is called with bytes format, and returns str, there 
would be an exception generated on the spot.


If __format__ is called with bytes format, and tries to use it as str, 
there would be an exception generated on the spot.


Prior to 3.whenever-this-is-implemented, Python 3 only provides str 
formats to __format__, right? So new code is required to pass bytes to 
__format__.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Eric V. Smith

On 01/16/2014 11:23 AM, Ethan Furman wrote:
 On 01/16/2014 06:45 AM, Brett Cannon wrote:
 On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:
 On 01/15/2014 06:45 AM, Brett Cannon wrote:

 This is why I have argued that if you specify it as
  if there is a format spec specified, then the return
 value from calling __format__() will have
  str.decode('ascii', 'strict') called on it you get
 the support for the various number-specific format
  specs for free.
 
 Since the numeric format codes will call int, index,
  or float on the object (to handle subclasses),

 But that's **only** because the numeric types choose
  to as part of their __format__() implementation; it is
 not inherent to str.format().
 
 As I understand it, str.format will call the object's __format__.  So,
 for example, if I say:
 
   u'the value is: %d' % myNum(17)
 
 then it will be myNum.__format__ that gets called, not int.__format__;
 this is precisely what we don't want, since can't know that myNum is
 only going to return ASCII characters.

Magic methods, including __format__, are called on the type, not the
instance.
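
To illustrate the type-level lookup, a small sketch (the class name `Plain` is invented for this example):

```python
class Plain:
    def __format__(self, format_spec):
        return 'from the type'


obj = Plain()

# An attribute set on the instance does not take part in the lookup
# that format() and str.format() perform.
obj.__format__ = lambda format_spec: 'from the instance'

via_builtin = format(obj, '')
via_method = '{}'.format(obj)
```

Both calls use `Plain.__format__`; the instance attribute is only reachable by calling `obj.__format__(...)` explicitly.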

 This is why I would have bytes.__format__, as part of its parsing, call
 int, index, or float depending on the format code; so the above example
 would have bytes.__format__ calling int() on myNum(17), at which point
 we either have an int type or an exception was raised because myNum
 isn't really an integer.  Once we have an int, whose format we know and
 trust, then we can call its __format__ and proceed from there.
 
 On the flip side, if myNum does define it's own __format__, it will not
 be called by bytes.format, and perhaps that is another good reason for
 bytes to only support %-interpolation and not format?

For the first iteration of bytes.format(), I think we should just
support the exact types of int, float, and bytes. It will call the
type's __format__ (with the object as self) and encode the result to
ASCII. For the stated use case of 2.x compatibility, I suspect this will
cover  90% of the uses in real code. If we find there are cases where
real code needs additional types supported, we can consider adding
__format_ascii__ (or whatever name we cook up).

Eric.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Ethan Furman


On 01/16/2014 10:30 AM, Eric V. Smith wrote:

On 01/16/2014 11:23 AM, Ethan Furman wrote:

On 01/16/2014 06:45 AM, Brett Cannon wrote:


But that's **only** because the numeric types choose
to as part of their __format__() implementation; it is
not inherent to str.format().


As I understand it, str.format will call the object's __format__.  So,
for example, if I say:

   u'the value is: %d' % myNum(17)

then it will be myNum.__format__ that gets called, not int.__format__;
this is precisely what we don't want, since we can't know that myNum is
only going to return ASCII characters.


Magic methods, including __format__, are called on the type, not the
instance.


Yes, that's why I said `myNum(17)` and not `myNum`.



This is why I would have bytes.__format__, as part of its parsing, call
int, index, or float depending on the format code; so the above example
would have bytes.__format__ calling int() on myNum(17), at which point
we either have an int type or an exception was raised because myNum
isn't really an integer.  Once we have an int, whose format we know and
trust, then we can call its __format__ and proceed from there.

On the flip side, if myNum does define its own __format__, it will not
be called by bytes.format, and perhaps that is another good reason for
bytes to only support %-interpolation and not format?


For the first iteration of bytes.format(), I think we should just
support the exact types of int, float, and bytes. It will call the
type's __format__ (with the object as self) and encode the result to
ASCII. For the stated use case of 2.x compatibility, I suspect this will
cover > 90% of the uses in real code. If we find there are cases where
real code needs additional types supported, we can consider adding
__format_ascii__ (or whatever name we cook up).


That can certainly be our fallback position if we can't decide now how we want 
to handle int and float subclasses.

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Eric V. Smith

On 01/16/2014 01:55 PM, Ethan Furman wrote:
 Magic methods, including __format__, are called on the type, not the
 instance.
 
 Yes, that's why I said `myNum(17)` and not `myNum`.

Oops, apologies. I misread the code.

Eric.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Jan Kaliszewski


16.01.2014 17:33, Michael Urman wrote:

On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon br...@python.org wrote:
Fine, if you're worried about bytes.format() overstepping by implicitly
calling str.encode() on the return value of __format__() then you will
need __bytes__format__() to get equivalent support.


Could we just re-use PEP-3101's note (easily updated for Python 3):

Note for Python 2.x: The 'format_spec' argument will be either
a string object or a unicode object, depending on the type of the
original format string.  The __format__ method should test the type
of the specifiers parameter to determine whether to return a string or
unicode object.  It is the responsibility of the __format__ method
to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return
bytes. For such cases on objects that cannot support bytes (i.e. for
str), it can raise. This appears to avoid the need for additional
methods. (As does Nick's proposal of leaving it out for now.)


-1.

I'd treat the format()+.__format__()+str.format()-ecosystem as
a nice text-data-oriented, *complete* Py3k feature, backported to
Python 2 to share the benefits of the feature with it as well as
to make the 2-to-3 transition a bit easier.

IMHO, the PEP-3101's note cited above just describes a workaround
over the flaws of the Py2's obsolete text model.  Moving such
complications into Py3k would make the feature (and especially the
ability to implement your own .__format__()) harder to understand
and make use of -- for little profit.

Such a move is not needed for compatibility.  And, IMHO, the
format()/__format__()/str.format()-matter is all about nice and
flexible *text* formatting, not about binary data interpolation.

16.01.2014 10:56, Nick Coghlan wrote:


I have a different proposal: let's *just* add mod formatting to
bytes, and leave the extensible formatting system as a text only
operation.

We don't really care if bytes supports that method for version
compatibility purposes, and the deliberate flexibility of the design
makes it hard to translate into the binary domain.

So let's just not provide that - let's accept that, for the binary
domain, printf style formatting is just a better fit for the job :)


+1!

However, I am not sure if %s should be limited to bytes-like
objects.  As practicality beats purity, I would be +0.5 for
enabling the following:

- input type supports Py_buffer?
  use it to collect the necessary bytes

- input type has the __bytes__() method?
  use it to collect the necessary bytes

- input type has the encode() method?
  raise TypeError

- otherwise:
  use something equivalent to ascii(obj).encode('ascii')
  (note that it would nicely format numbers + format other
  object in more-or-less useful way without the fear of
  encountering a non-ascii data).

  another option: use str()-representation of strictly
  defined types, e.g.: int, float, decimal.Decimal,
  fractions.Fraction...
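
A rough Python sketch of the lookup order listed above, for illustration only. The name `percent_s_bytes` is hypothetical, and `memoryview()` merely stands in for the C-level Py_buffer check; nothing here is taken from an actual implementation.

```python
def percent_s_bytes(obj):
    """Hypothetical helper: resolve a %s argument in the order listed above."""
    # 1. Buffer protocol: memoryview() raises TypeError for non-buffer objects.
    try:
        return bytes(memoryview(obj))
    except TypeError:
        pass
    # 2. __bytes__, looked up on the type like other special methods.
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is not None:
        return to_bytes(obj)
    # 3. Text-like objects (anything with encode()) are rejected outright.
    if hasattr(type(obj), 'encode'):
        raise TypeError('%s is text; encode it first' % type(obj).__name__)
    # 4. Fallback: ascii() escapes non-ASCII characters, so encoding is safe.
    return ascii(obj).encode('ascii')


print(percent_s_bytes(b'abc'))   # bytes come back unchanged via the buffer branch
print(percent_s_bytes(3.14))     # numbers fall through to the ascii() branch
```

The ordering matters in this sketch: str is caught by the encode() test before it could reach the ascii() fallback, so text never slips through as a repr.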

Cheers.
*j


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Steven D'Aprano

On Thu, Jan 16, 2014 at 08:23:13AM -0800, Ethan Furman wrote:

 As I understand it, str.format will call the object's __format__.  So, for 
 example, if I say:
 
   u'the value is: %d' % myNum(17)
 
 then it will be myNum.__format__ that gets called, not int.__format__; 

I seem to have missed something, because I am completely confused... Why 
are you talking about str.format and then show an example using % instead?

%d calls __str__, not __format__. This is in Python 3.3:

py> class MyNum(int):
...     def __str__(self):
...         print("Calling MyNum.__str__")
...         return super().__str__()
...     def __format__(self, spec):
...         print("Calling MyNum.__format__")
...         return super().__format__(spec)
...
py> n = MyNum(17)
py> u"%d" % n
Calling MyNum.__str__
'17'


By analogy, if we have a bytes %d formatting, surely it should either:

(1) call type(n).__bytes__(n), which is guaranteed to raise if the 
result isn't ASCII (i.e. like len() raises if the result isn't 
an int); or

(2) call type(n).__str__(n).encode('ascii', 'strict').


Personally, I lean towards (2), even though that means you can't have a 
single class provide an ASCII string to b'%d' and a non-ASCII string to 
u'%d'. 
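
Option (2) can be modelled in a few lines. This is only a sketch of the proposed semantics, not of any real bytes implementation; `percent_d_option2`, `PlainNum` and `FancyNum` are made-up names.

```python
def percent_d_option2(n):
    """Hypothetical helper mirroring option (2) above."""
    # Special methods are looked up on the type, not the instance.
    return type(n).__str__(n).encode('ascii', 'strict')


class PlainNum(int):
    pass


class FancyNum(int):
    def __str__(self):
        # Arabic-Indic digits for 17: valid text, but not ASCII.
        return '\u0661\u0667'


print(percent_d_option2(PlainNum(17)))
try:
    percent_d_option2(FancyNum(17))
except UnicodeEncodeError as exc:
    print('rejected:', exc.reason)
```

A subclass that keeps the inherited __str__ works unchanged, while one that returns non-ASCII text fails loudly at the point of use.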


 this 
 is precisely what we don't want, since we can't know that myNum is only going 
 to return ASCII characters.

It seems to me that Consenting Adults applies here. If class MyNum 
returns a non-ASCII string, then you ought to get a runtime exception, 
exactly the same as happens with just about every other failure in 
Python. If you don't want that possible exception, then don't use MyNum, 
or explicitly wrap it in a call to int:

b'the value is: %d' % int(MyNum(17))

The *worst* solution would be to completely ignore MyNum.__str__.
That's a nasty violation of the Principle Of Least Surprise, and will
lead to confusion ("why isn't my class' __str__ method being called?")
and bugs.

* "Explicit is better than implicit" -- better to explicitly
  wrap MyNum in a call to int() than to have bytes %d
  automagically do it for you;

* "Special cases aren't special enough to break the rules" --
  bytes %d isn't so special that standard Python rules about
  calling special methods should be ignored;

* "Errors should never pass silently" -- if MyNum does the wrong
  thing when used with bytes %d, you should get an exception.


 This is why I would have bytes.__format__, as part of its parsing, call 
 int, index, or float depending on the format code; so the above example 
 would have bytes.__format__ calling int() on myNum(17), 

The above example you give doesn't have any bytes in it. Can you explain 
what you meant to say? I'm guessing you intended this:

b'the value is: %d' % MyNum(17)

rather than using u'' as actually given, but I don't really know.



-- 
Steven

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Brett Cannon

bytes.format() below. I'll leave it to you to decide if they warrant using,
leaving as an open question, or rejecting.


On Tue, Jan 14, 2014 at 2:56 PM, Ethan Furman et...@stoneleaf.us wrote:

 Duh.  Here's the text, as well.  ;)


 PEP: 461
 Title: Adding % and {} formatting to bytes
 Version: $Revision$
 Last-Modified: $Date$
 Author: Ethan Furman et...@stoneleaf.us
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 2014-01-13
 Python-Version: 3.5
 Post-History: 2014-01-13
 Resolution:


 Abstract
 ========

 This PEP proposes adding the % and {} formatting operations from str to
 bytes.


 Proposed semantics for bytes formatting
 =======================================

 %-interpolation
 ---------------

 All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
 will be supported, and will work as they do for str, including the
 padding, justification and other related modifiers.

 Example::

     >>> b'%4x' % 10
     b'   a'

 %c will insert a single byte, either from an int in range(256), or from
 a bytes argument of length 1.

 Example:

     >>> b'%c' % 48
     b'0'

     >>> b'%c' % b'a'
     b'a'
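
For readers on Python 3.5 or later, where %-interpolation on bytes did eventually land, the numeric and %c examples above can be checked directly. A small sketch; it says nothing about how the draft stood at the time of this thread.

```python
# Numeric codes keep their padding and justification modifiers.
assert b'%4x' % 10 == b'   a'
assert b'%-4d|' % 7 == b'7   |'
assert b'%.2f' % 3.14159 == b'3.14'

# %c accepts an int in range(256) or a bytes object of length 1.
assert b'%c' % 48 == b'0'
assert b'%c' % b'a' == b'a'

print('all checks passed')
```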

 %s, because it is the most general, has the most convoluted resolution:

   - input type is bytes?
       pass it straight through

   - input type is numeric?
       use its __xxx__ [1] [2] method and ascii-encode it (strictly)

   - input type is something else?
       use its __bytes__ method; if there isn't one, raise an exception [3]

 Examples:

     >>> b'%s' % b'abc'
     b'abc'

     >>> b'%s' % 3.14
     b'3.14'

     >>> b'%s' % 'hello world!'
     Traceback (most recent call last):
     ...
     TypeError: 'hello world' has no __bytes__ method, perhaps you need to encode it?

 .. note::

    Because the str type does not have a __bytes__ method, attempts to
    directly use 'a string' as a bytes interpolation value will raise an
    exception.  To use 'string' values, they must be encoded or otherwise
    transformed into a bytes sequence::

       'a string'.encode('latin-1')
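
The three %s branches in the draft can be modelled as below. This is a sketch of the draft text only; `draft_percent_s` is a hypothetical name, `numbers.Number` is an assumed reading of "numeric", and using str() is an arbitrary choice because the draft leaves __str__ versus __repr__ open in footnote [1].

```python
import numbers


def draft_percent_s(value):
    """Hypothetical helper following the draft's three %s branches."""
    if isinstance(value, bytes):
        return value                      # pass straight through
    if isinstance(value, numbers.Number):
        # Assumption: str() rather than repr(); the draft has not decided.
        return str(value).encode('ascii', 'strict')
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        raise TypeError('%r has no __bytes__ method, perhaps you need to '
                        'encode it?' % (value,))
    return to_bytes(value)


print(draft_percent_s(b'abc'))
print(draft_percent_s(3.14))
print(draft_percent_s('a string'.encode('latin-1')))
```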


 format
 ------

 The format mini language will be used as-is, with the behaviors as listed
 for %-interpolation.


That's too vague; % interpolation does not support other format operators
in the same way as str.format() does. % interpolation has specific code to
support %d, etc. But str.format() gets supported for {:d} not from special
code but because e.g. int.__format__(n, 'd') works. So you can't say
bytes.format() supports {:d} just like %d works with string interpolation
since the mechanisms are fundamentally different.
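
The difference in mechanism is easy to see from Python itself: str.format() and format() hand the format spec to the argument's own __format__. A minimal sketch, using a made-up `Loud` subclass:

```python
class Loud(int):
    def __format__(self, spec):
        # str.format() and format() pass the spec to the type's __format__.
        return 'Loud[%s]' % int.__format__(self, spec)


n = Loud(255)
print('{:x}'.format(n))            # routed through Loud.__format__
print(format(n, 'x'))              # same hook, same result
print(int.__format__(255, 'x'))    # what the built-in int does for itself
```

Because the type decides what a spec means, str.format() itself has no built-in knowledge of 'x' or 'd'.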

This is why I have argued that if you specify it as "if there is a format
spec specified, then the return value from calling __format__() will have
str.encode('ascii', 'strict') called on it" you get the support for the
various number-specific format specs for free. It also means if you pass in
a string that you just want the strict ASCII bytes of then you can get it
with {:s}.

I also think that a 'b' conversion should be added to bytes.format(). This
doesn't have the same issue as %b if you make {} implicitly mean {!b} in
Python 3.5 as {} will mean what is the most accurate for bytes.format() in
either version. It also allows for explicit support where you know you only
want a byte and allows {!s} to mean you only want a string (and thus throw
an error otherwise).

And all of this means that much like %s only taking bytes, the only way for
bytes.format() to accept a non-byte argument is for some format spec to be
specified to trigger the .encode('ascii', 'strict') call.
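
One way to picture the rule proposed here, for a single replacement field. This is a sketch under stated assumptions: `format_field` is a hypothetical name, and bytes.format() does not exist to compare against.

```python
def format_field(value, spec=''):
    """Hypothetical model of one bytes.format() replacement field."""
    if spec:
        # Any format spec: defer to __format__, then encode strictly.
        return format(value, spec).encode('ascii', 'strict')
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)               # a bare {} behaves like {!b}
    raise TypeError('bare {} needs bytes, got ' + type(value).__name__)


print(format_field(255, 'x'))     # numeric spec comes "for free"
print(format_field('abc', 's'))   # explicit {:s} opts in to ASCII encoding
print(format_field(b'raw'))       # no spec: bytes only
```

Under this model a non-bytes argument is only accepted when a spec is present, and a non-ASCII result surfaces as UnicodeEncodeError.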

-Brett




 Open Questions
 ==============

 For %s there has been some discussion of trying to use the buffer protocol
 (Py_buffer) before trying __bytes__.  This question should be answered
 before the PEP is implemented.


 Proposed variations
 ===================

 It has been suggested to use %b for bytes instead of %s.

   - Rejected as %b does not exist in Python 2.x %-interpolation, which is
     why we are using %s.

 It has been proposed to automatically use .encode('ascii','strict') for str
 arguments to %s.

   - Rejected as this would lead to intermittent failures.  Better to have
     the operation always fail so the trouble-spot can be correctly fixed.

 It has been proposed to have %s return the ascii-encoded repr when the
 value is a str  (b'%s' % 'abc'  --> b"'abc'").

   - Rejected as this would lead to hard to debug failures far from the
     problem site.  Better to have the operation always fail so the
     trouble-spot can be easily fixed.


 Foot notes
 ==========

 .. [1] Not sure if this should be the numeric __str__ or the numeric
        __repr__, or if there's any difference
 .. [2] Any proper numeric class would then have to provide an ascii
        representation of its value, either via __repr__ or __str__
        (whichever we choose in [1]).
 .. [3] TypeError, ValueError, or UnicodeEncodeError?


 Copyright
 =========

 This document has been placed in the public domain.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Neil Schemenauer

This looks pretty good to me.  I don't think we should limit
operands based on type, that's anti-Pythonic IMHO.  We should use
duck-typing and that means a special method, I think.  We could
introduce a new one but __bytes__ looks like it can work.
Otherwise, maybe __ascii__ is a good name.

Objects that implement __str__ can also implement __bytes__ if they
can guarantee that ASCII characters are always returned, no matter
what the *value* (we don't want to repeat the hell of Python 2's
unicode to str coercion which depends on the value of the unicode
object).  Objects that already contain encoded bytes or arbitrary
bytes can also implement __bytes__.

Ethan Furman et...@stoneleaf.us wrote:
 %s, because it is the most general, has the most convoluted resolution:

This becomes much simpler:

 - does the object implement __bytes__?
   call it and use the value; otherwise raise TypeError

 It has been suggested to use %b for bytes instead of %s.

- Rejected as %b does not exist in Python 2.x %-interpolation, which is
  why we are using %s.

+1.  %b might be conceptually neater but ease of migration trumps
that, IMHO.

 It has been proposed to automatically use .encode('ascii','strict') for str
 arguments to %s.

- Rejected as this would lead to intermittent failures.  Better to have the
  operation always fail so the trouble-spot can be correctly fixed.

Right.  That would put us back in Python 2 unicode - str coercion
hell.

Thanks for writing the PEP.

  Neil


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Eric V. Smith

On 1/15/2014 9:45 AM, Brett Cannon wrote:

 That's too vague; % interpolation does not support other format
 operators in the same way as str.format() does. % interpolation has
 specific code to support %d, etc. But str.format() gets supported for
 {:d} not from special code but because e.g. int.__format__(n, 'd') works.
 So you can't say bytes.format() supports {:d} just like %d works with
 string interpolation since the mechanisms are fundamentally different.
 
 This is why I have argued that if you specify it as "if there is a
 format spec specified, then the return value from calling __format__()
 will have str.encode('ascii', 'strict') called on it" you get the
 support for the various number-specific format specs for free. It also
 means if you pass in a string that you just want the strict ASCII bytes
 of then you can get it with {:s}.
 
 I also think that a 'b' conversion should be added to bytes.format(). This
 doesn't have the same issue as %b if you make {} implicitly mean {!b} in
 Python 3.5 as {} will mean what is the most accurate for bytes.format()
 in either version. It also allows for explicit support where you know
 you only want a byte and allows {!s} to mean you only want a string (and
 thus throw an error otherwise).
 
 And all of this means that much like %s only taking bytes, the only way
 for bytes.format() to accept a non-byte argument is for some format spec
 to be specified to trigger the .encode('ascii', 'strict') call.

Agreed. With %-formatting, you can start with the format strings and
then decide what you want to do with the passed in objects. But with
.format, it's the other way around: you have to look at the passed in
objects being formatted, and then decide what the format specifier means
to that type.

So, for .format, you could say "hey, that object's an int, and I happen
to know how to format ints", outside of calling its .__format__. Or you
could even call its __format__ because you know that it will only be
ASCII. But to take this approach, you're going to have to hard-code the
types. And subclasses are probably out, since there you don't know what
the subclass's __format__ will return. It could be non-ASCII.

>>> class Int(int):
...   def __format__(self, fmt):
...     return u'foo'
...
>>> '{}'.format(Int(3))
'foo'

So basically I think we'll have to hard-code the types that .format()
will support, and never call __format__, or only call __format__ if we
know that it's an exact type where we know that __format__ will return
(strict ASCII).
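
A sketch of the "exact type" idea, assuming a hard-coded whitelist. The names `ASCII_SAFE_TYPES` and `format_exact` are hypothetical and only illustrate the check; the point is that `type(value) in ...` excludes subclasses, unlike isinstance().

```python
ASCII_SAFE_TYPES = (int, float)   # hypothetical whitelist of trusted types


def format_exact(value, spec=''):
    """Format only exact int/float instances; refuse everything else."""
    if type(value) in ASCII_SAFE_TYPES:       # exact match, no subclasses
        return format(value, spec).encode('ascii')
    raise TypeError('unsupported type: ' + type(value).__name__)


class Int(int):
    def __format__(self, fmt):
        return 'foo'


print(format_exact(3, 'd'))
print(format_exact(2.5, '.1f'))
try:
    format_exact(Int(3), 'd')
except TypeError as exc:
    print('refused:', exc)
```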

Either that, or we're back to encoding the result of __format__ and
accepting that sometimes it might throw errors, depending on the values
being passed into format().

Eric.


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 06:45 AM, Brett Cannon wrote:

bytes.format() below. I'll leave it to you to decide if they warrant using, 
leaving as an open question, or rejecting.


Thanks for your comments.  I've only barely touched format, so it's not an area 
of strength for me.  :)

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Antoine Pitrou

On Wed, 15 Jan 2014 15:47:43 + (UTC)
Neil Schemenauer n...@arctrix.com wrote:
 
 Objects that implement __str__ can also implement __bytes__ if they
 can guarantee that ASCII characters are always returned, no matter
 what the *value*

I think that's a slippery slope. __bytes__ should mean that the object
has a well-known bytes equivalent or encoding, not that its __str__
happens to be pure ASCII.

(for example, it would be fine for a HTTP message class to define a
__bytes__ method)

Also, consider that if e.g. float had a __bytes__ method, then
bytes(2.0) would start returning b'2.0', while bytes(2) would still
need to return b'\x00\x00'.
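
Both halves of this point can be seen in current Python 3. In the sketch below, `HttpRequestLine` is a made-up class standing in for the kind of object that has an obvious wire format.

```python
# An int argument means "this many zero bytes", not the decimal digits.
print(bytes(2))


class HttpRequestLine:
    """Hypothetical class with a well-known bytes equivalent."""

    def __init__(self, method, path):
        self.method = method
        self.path = path

    def __bytes__(self):
        line = '%s %s HTTP/1.1\r\n' % (self.method, self.path)
        return line.encode('ascii')


print(bytes(HttpRequestLine('GET', '/index.html')))
```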

Regards

Antoine.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Neil Schemenauer

Neil Schemenauer n...@arctrix.com wrote:
 We should use duck-typing and that means a special method, I
 think.  We could introduce a new one but __bytes__ looks like it
 can work.  Otherwise, maybe __ascii__ is a good name.

I poked around the Python 3 source.  Using __bytes__ has some
downsides, e.g. the following would happen:

>>> bytes(12)
b'12'

Perhaps that's a little too ASCII-centric.  OTOH, UTF-8 seems to be
winning the encoding war and so the above could be argued as
reasonable behavior.  I think forcing people to explicitly choose an
encoding for str objects will be sufficient to avoid the bytes/str
mess we have in Python 2.

Unfortunately, that change conflicts with the current behavior:

>>> bytes(12)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

Would it be too disruptive to change that?  It doesn't appear to be
too useful and we could do it using a keyword argument, e.g.:

bytes(size=12)

I notice something else surprising to me:

>>> class Test(object):
...     def __bytes__(self):
...         return b'test'
...
>>> with open('test', 'wb') as fp:
...     fp.write(Test())
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: 'Test' does not support the buffer interface

I'd expect that to write b'test' to the file, not raise an error.
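
The same behaviour can be reproduced without touching the filesystem. In this sketch `io.BytesIO` stands in for a file opened with 'wb' (an assumption made for convenience); write() wants a bytes-like object, so __bytes__ only comes into play through an explicit bytes() call.

```python
import io


class Test(object):
    def __bytes__(self):
        return b'test'


fp = io.BytesIO()               # stands in for a file opened with 'wb'
try:
    fp.write(Test())            # write() does not consult __bytes__
except TypeError as exc:
    print('write() refused:', exc)

fp.write(bytes(Test()))         # explicit conversion goes through __bytes__
print(fp.getvalue())
```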

Regards,

  Neil


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/14/2014 02:41 PM, Mark Lawrence wrote:

On 14/01/2014 19:55, Ethan Furman wrote:

This PEP goes a bit further than PEP 460 does, and hopefully spells
things out in enough detail so there is no confusion as to what is meant.

--
~Ethan~


Out of plain old curiosity do we have to consider PEP 292 string templates in
any way, shape or form, or regarding this debate have they been safely booted
into touch?


Well, I'm not sure what "booted into touch" means, but yes, we can ignore
string templates.  :)

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 08:04 AM, Antoine Pitrou wrote:

On Wed, 15 Jan 2014 15:47:43 + (UTC)
Neil Schemenauer n...@arctrix.com wrote:


Objects that implement __str__ can also implement __bytes__ if they
can guarantee that ASCII characters are always returned, no matter
what the *value*


I think that's a slippery slope. __bytes__ should mean that the object
has a well-known bytes equivalent or encoding, not that its __str__
happens to be pure ASCII.


Agreed.

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 07:47 AM, Neil Schemenauer wrote:


Thanks for writing the PEP.


Thank you for your comments!

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Mark Lawrence


On 14/01/2014 19:56, Ethan Furman wrote:

Duh.  Here's the text, as well.  ;)

%s, because it is the most general, has the most convoluted resolution:

   - input type is bytes?
       pass it straight through

   - input type is numeric?
       use its __xxx__ [1] [2] method and ascii-encode it (strictly)

   - input type is something else?
       use its __bytes__ method; if there isn't one, raise an exception [3]

Examples:

     >>> b'%s' % b'abc'
     b'abc'

     >>> b'%s' % 3.14
     b'3.14'

     >>> b'%s' % 'hello world!'
     Traceback (most recent call last):
     ...
     TypeError: 'hello world' has no __bytes__ method, perhaps you need to encode it?



For completeness I believe %r and %a should be included here as well. 
FTR %a appears to have been introduced in 3.2, but I couldn't find 
anything in the What's New and there's no note in the docs 
http://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting 
to indicate when it first came into play.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Brett Cannon

On Wed, Jan 15, 2014 at 10:52 AM, Eric V. Smith e...@trueblade.com wrote:

 On 1/15/2014 9:45 AM, Brett Cannon wrote:

  That's too vague; % interpolation does not support other format
  operators in the same way as str.format() does. % interpolation has
  specific code to support %d, etc. But str.format() gets supported for
  {:d} not from special code but because e.g. int.__format__(n, 'd') works.
  So you can't say bytes.format() supports {:d} just like %d works with
  string interpolation since the mechanisms are fundamentally different.
 
  This is why I have argued that if you specify it as "if there is a
  format spec specified, then the return value from calling __format__()
  will have str.encode('ascii', 'strict') called on it" you get the
  support for the various number-specific format specs for free. It also
  means if you pass in a string that you just want the strict ASCII bytes
  of then you can get it with {:s}.
 
  I also think that a 'b' conversion should be added to bytes.format(). This
  doesn't have the same issue as %b if you make {} implicitly mean {!b} in
  Python 3.5 as {} will mean what is the most accurate for bytes.format()
  in either version. It also allows for explicit support where you know
  you only want a byte and allows {!s} to mean you only want a string (and
  thus throw an error otherwise).
 
  And all of this means that much like %s only taking bytes, the only way
  for bytes.format() to accept a non-byte argument is for some format spec
  to be specified to trigger the .encode('ascii', 'strict') call.

 Agreed. With %-formatting, you can start with the format strings and
 then decide what you want to do with the passed in objects. But with
 .format, it's the other way around: you have to look at the passed in
 objects being formatted, and then decide what the format specifier means
 to that type.

 So, for .format, you could say "hey, that object's an int, and I happen
 to know how to format ints", outside of calling its .__format__. Or you
 could even call its __format__ because you know that it will only be
 ASCII. But to take this approach, you're going to have to hard-code the
 types. And subclasses are probably out, since there you don't know what
 the subclass's __format__ will return. It could be non-ASCII.

 >>> class Int(int):
 ...   def __format__(self, fmt):
 ...     return u'foo'
 ...
 >>> '{}'.format(Int(3))
 'foo'

 So basically I think we'll have to hard-code the types that .format()
 will support, and never call __format__, or only call __format__ if we
 know that it's an exact type where we know that __format__ will return
 (strict ASCII).

 Either that, or we're back to encoding the result of __format__ and
 accepting that sometimes it might throw errors, depending on the values
 being passed into format().


I say accept that an error might get thrown as there is precedent of
specifying a format spec that an object's __format__() method doesn't
recognize::

   >>> '{:s}'.format(1)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: Unknown format code 's' for object of type 'int'

IOW I'm actively trying to avoid type-restricting the semantics for
bytes.format() for a consistent, clear mental model. Remembering that "any
format spec leads to calling .encode('ascii', 'strict') on the result" is
simple compared to "ASCII bytes will be returned for ints and floats when
passed in, otherwise all other types follow these rules".

As the zen says:

  Errors should never pass silently.
  Special cases aren't special enough to break the rules.


-Brett

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Brett Cannon

On Wed, Jan 15, 2014 at 10:57 AM, Ethan Furman et...@stoneleaf.us wrote:

 On 01/15/2014 06:45 AM, Brett Cannon wrote:

 bytes.format() below. I'll leave it to you to decide if they warrant
 using, leaving as an open question, or rejecting.


 Thanks for your comments.  I've only barely touched format, so it's not an
 area of strength for me.  :)


Time to strengthen it if you are proposing a PEP that is going to affect
it. =)



 --
 ~Ethan~



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 08:51 AM, Brett Cannon wrote:

On Wed, Jan 15, 2014 at 10:57 AM, Ethan Furman wrote:


Thanks for your comments.  I've only barely touched format, so it's not an area 
of strength for me.  :)


Time to strengthen it if you are proposing a PEP that is going to affect it. =)


I am.  You're helping.  :)

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Isaac Morland


On Wed, 15 Jan 2014, Antoine Pitrou wrote:


On Wed, 15 Jan 2014 15:47:43 + (UTC)
Neil Schemenauer n...@arctrix.com wrote:


Objects that implement __str__ can also implement __bytes__ if they
can guarantee that ASCII characters are always returned, no matter
what the *value*


I think that's a slippery slope. __bytes__ should mean that the object
has a well-known bytes equivalent or encoding, not that its __str__
happens to be pure ASCII.


+1


(for example, it would be fine for a HTTP message class to define a
__bytes__ method)

Also, consider that if e.g. float had a __bytes__ method, then
bytes(2.0) would start returning b'2.0', while bytes(2) would still
need to return b'\x00\x00'.


Not actually suggesting the following for a number of reasons including
but not limited to the consistency of floating point formats across
different implementations, but it would make more sense for bytes (2.0) to
return the 8-byte IEEE representation than for it to return the ASCII
encoding of the decimal representation of the number.
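
For illustration only, the three candidate meanings discussed here can be put side by side. This is a minimal sketch using the standard struct module; the big-endian byte order is an arbitrary assumption, and nothing in it was proposed in the thread.

```python
import struct

# bytes(int) allocates that many zero bytes; it does not render the number.
zeros = bytes(2)

# The 8-byte IEEE 754 double for 2.0 (big-endian chosen arbitrarily here).
ieee = struct.pack('>d', 2.0)

# The ASCII encoding of the decimal representation, for comparison.
text = repr(2.0).encode('ascii')

print(zeros, ieee, text)
```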

Isaac Morland   CSCF Web Guru
DC 2619, x36650 WWW Software Specialist

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Neil Schemenauer

Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil S wrote:
 
 Objects that implement __str__ can also implement __bytes__ if they
 can guarantee that ASCII characters are always returned, no matter
 what the *value*

 I think that's a slippery slope. __bytes__ should mean that the object
 has a well-known bytes equivalent or encoding, not that its __str__
 happens to be pure ASCII.

After poking around some more into the Python 3 source, I agree.  It
seems too late to change bytes(integer) and bytearray(integer).
We should have used a keyword only argument but too late now (tp_new
is a mess).

I can also agree that pushing the ASCII-centric behavior into the
bytes() constructor goes too far.  If we limit the ASCII-centric
behavior solely to % and format(), that seems a reasonable trade-off
for usability.  As others have argued, once you are using format
codes, you are pretty clearly dealing with ASCII encoding.

I feel strongly that % and format on bytes need to use duck-typing
and not type checking.  Also, formatting failures must be due to
types and not due to values.  If we can get agreement on these two
principles, that will help guide the design.

Those principles absolutely rule out calling encode('ascii')
automatically.  I'm not deeply intimate with format() but I think it
also rules out calling __format__.
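
A small sketch of the value-dependent failure these principles are meant to exclude. The helper name encode_implicitly is hypothetical and only stands in for what an automatic encode('ascii') would amount to.

```python
def encode_implicitly(value):
    # Hypothetical helper: an automatic ASCII encoding step.
    return str(value).encode('ascii')

# Same type, different values: one succeeds ...
ok = encode_implicitly('abc')

# ... and one fails, so the error depends on the data, not on the type.
try:
    encode_implicitly('caf\xe9')
    failed_on_value = False
except UnicodeEncodeError:
    failed_on_value = True

print(ok, failed_on_value)
```

Code that passes its tests with ASCII-only data would then break later on the first non-ASCII value.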

Could we introduce only __bformat__ and have the % operator call it?
That would only require implementing one new special method instead
of two.

  Neil


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Guido van Rossum

All sounds good.

A fleeting thought about constructors: you can always add alternative
constructors as class methods (like datetime does).
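
As a rough sketch of that idea, an alternative constructor can live next to the existing one without changing what bytes(integer) means. The AsciiBytes class and its from_number method are hypothetical names made up for this example; int.from_bytes is an existing standard-library constructor of the same kind.

```python
class AsciiBytes(bytes):
    """Hypothetical bytes subclass; not part of any proposal in this thread."""

    @classmethod
    def from_number(cls, number):
        # Alternative constructor: render the number as ASCII text,
        # leaving the meaning of bytes(integer) untouched.
        return cls(str(number).encode('ascii'))


value = AsciiBytes.from_number(2)

# An existing class-method constructor in the standard library.
decoded = int.from_bytes(b'\x00\x02', 'big')

print(value, bytes(2), decoded)
```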

On Wed, Jan 15, 2014 at 10:09 AM, Neil Schemenauer n...@arctrix.com wrote:
 Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 15 Jan 2014 15:47:43 + (UTC) Neil S wrote:

 Objects that implement __str__ can also implement __bytes__ if they
 can guarantee that ASCII characters are always returned, no matter
 what the *value*

 I think that's a slippery slope. __bytes__ should mean that the object
 has a well-known bytes equivalent or encoding, not that its __str__
 happens to be pure ASCII.

 After poking around some more into the Python 3 source, I agree.  It
 seems too late to change bytes(integer) and bytearray(integer).
 We should have used a keyword only argument but too late now (tp_new
 is a mess).

 I can also agree that pushing the ASCII-centric behavior into the
 bytes() constructor goes too far.  If we limit the ASCII-centric
 behavior solely to % and format(), that seems a reasonable trade-off
 for usability.  As others have argued, once you are using format
 codes, you are pretty clearly dealing with ASCII encoding.

 I feel strongly that % and format on bytes need to use duck-typing
 and not type checking.  Also, formatting failures must be due to
 types and not due to values.  If we can get agreement on these two
 principles, that will help guide the design.

 Those principles absolutely rule out calling encode('ascii')
 automatically.  I'm not deeply intimate with format() but I think it
 also rules out calling __format__.

 Could we introduce only __bformat__ and have the % operator call it?
 That would only require implementing one new special method instead
 of two.

   Neil




-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Glenn Linderman


On 1/15/2014 7:52 AM, Eric V. Smith wrote:

So basically I think we'll have to hard-code the types that .format()
will support, and never call __format__, or only call __format__ if we
know that it's an exact type where we know that __format__ will return
(strict ASCII).

Either that, or we're back to encoding the result of __format__ and
accepting that sometimes it might throw errors, depending on the values
being passed into format().


Looks like you need to invent  __formatb__ to produce only ASCII. 
Objects that have __formatb__ can be formatted by bytes.format.  To 
avoid coding, it could be possible that __formatb__ might be a callable, 
in which case it is called to get the result, or not a callable, in 
which case one calls __format__ and converts the result to ASCII, 
__formatb__ just indicating a guarantee that only ASCII will result.


Or it could be that __formatb__ replaces __format__ and str.__format__, 
if it finds no __format__ looks for __formatb__, calls that, and 
converts the result to Unicode.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Neil Schemenauer

Glenn Linderman v+pyt...@g.nevcal.com wrote:
 On 1/15/2014 7:52 AM, Eric V. Smith wrote:
 Either that, or we're back to encoding the result of __format__ and
 accepting that sometimes it might throw errors, depending on the values
 being passed into format().

That would take us back to Python 2 hell.  Please no.  I don't like
checking for types either, we should have a special method.

 Looks like you need to invent  __formatb__ to produce only ASCII. 
 Objects that have __formatb__ can be formatted by bytes.format.  To 
 avoid coding, it could be possible that __formatb__ might be a callable
 in which case it is called to get the result, or not a callable, in 
 which case one calls __format__ and converts the result to ASCII, 
 __formatb__ just indicating a guarantee that only ASCII will result.

Just do:

    def __formatb__(self, spec):
        return MyClass.__format__(self, spec).encode('ascii')

Note that I think it is better to explicitly use the __format__
method rather than using self.__format__.  My reasoning is that a
subclass might implement a __format__ that returns non-ASCII
characters.

We don't need a special bytes version of __str__ since the
%-operator can call __formatb__ with the correct format spec.
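
To make the suggestion concrete, here is a minimal sketch. __formatb__ is only the name floated in this thread, and Python itself never calls it; Temperature and bformat are hypothetical names, with bformat standing in for what bytes formatting might do.

```python
class Temperature:
    """Hypothetical example class."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        return format(self.degrees, spec)

    def __formatb__(self, spec):
        # Proposed-only special method; Python does not call this itself.
        # Naming the class explicitly avoids picking up a subclass override
        # of __format__ that might return non-ASCII text.
        return Temperature.__format__(self, spec).encode('ascii')


def bformat(obj, spec=''):
    # Hypothetical stand-in for the lookup bytes formatting might perform:
    # a missing method is an error about the type, independent of the value.
    method = getattr(type(obj), '__formatb__', None)
    if method is None:
        raise TypeError('%s does not support bytes formatting'
                        % type(obj).__name__)
    return method(obj, spec)


print(bformat(Temperature(21.5), '.1f'))
```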

   Neil


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 06:45 AM, Brett Cannon wrote:


I also think that a 'b' conversion should be added to bytes.format(). This
doesn't have the same issue as %b if you make {} implicitly mean {!b} in
Python 3.5, as {} will mean what is the most accurate for bytes.format() in
either version. It also allows for explicit support where you know you only
want a byte and allows {!s} to mean you only want a string (and thus throw
an error otherwise).


Given that !b does not exist in Py2, !s (like %s) has to mean bytes when working with a byte stream.  Given that, !s and 
!b would mean the same thing, so is it worth adding !b?


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 06:45 AM, Brett Cannon wrote:
The PEP currently says::


format
--

The format mini language will be used as-is, with the behaviors as listed
for %-interpolation.


That's too vague; % interpolation does not support other format operators in 
the same way as str.format() does. %
interpolation has specific code to support %d, etc. But str.format() gets 
supported for {:d} not from special code but
because e.g. float.__format__('d') works. So you can't say bytes.format() 
supports {:d} just like %d works with string
interpolation since the mechanisms are fundamentally different.
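
The difference in mechanism can be seen with str today. This is a small sketch; the Loud class is a hypothetical example, and int is used because it is a type that accepts the d code.

```python
# str.format() hands the format spec to the argument's own __format__ ...
via_format = '{:5d}'.format(42)
via_dunder = int.__format__(42, '5d')

# ... whereas %-interpolation interprets the code itself.
via_percent = '%5d' % 42


class Loud:
    # Hypothetical class: any object can decide what a spec means.
    def __format__(self, spec):
        return spec.upper()


custom = '{:abc}'.format(Loud())

print(repr(via_format), repr(via_dunder), repr(via_percent), custom)
```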


A question for anyone that has extensive experience in both %-formatting and .format-formatting:  Would it be possible, 
at least for int and float, to take whatever is in the specifier and convert to %?  Example:


  'Weight: {wgt:-07f}'.format(wgt=137.23)

would take the '-07f' and basically do a '%-07f' % 137.23 to get the ASCII to
use?

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Greg Ewing


Neil Schemenauer wrote:

Objects that implement __str__ can also implement __bytes__ if they
can guarantee that ASCII characters are always returned,


I think __ascii__ would be a better name. I'd expect
a method called __bytes__ on an int to return some
version of its binary value.

--
Greg

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Steven D'Aprano

On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote:
 Neil Schemenauer wrote:
 Objects that implement __str__ can also implement __bytes__ if they
 can guarantee that ASCII characters are always returned,
 
 I think __ascii__ would be a better name. I'd expect
 a method called __bytes__ on an int to return some
 version of its binary value.

+1


-- 
Steven

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Eric V. Smith

On 1/15/2014 4:32 PM, Ethan Furman wrote:
 A question for anyone that has extensive experience in both %-formatting
 and .format-formatting:  Would it be possible, at least for int and
 float, to take whatever is in the specifier and convert to %?  Example:
 
    'Weight: {wgt:-07f}'.format(wgt=137.23)
 
 would take the '-07f' and basically do a '%-07f' % 137.23 to get the
 ASCII to use?

I think the int.__format__ version might be a superset. Specifically,
the n and % types. There may well be others.

But I think we could say we're not going to support these in b.format().
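
A short sketch of the overlap and the gap, using str formatting as it exists today. The ',' thousands separator is an extra example chosen here, not one named above; the 'n' type is left out of the output because its result depends on the locale.

```python
value = 137.23

# Codes shared by both mini-languages give the same text.
same_new = '{:-07f}'.format(value)
same_old = '%-07f' % value

# Presentation types that exist only in the format() mini-language:
# '%' multiplies by 100 and appends a percent sign,
# ',' inserts thousands separators.
percent_type = format(0.5, '%')
thousands = format(1234567, ',')

print(same_new, same_old, percent_type, thousands)
```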


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Brett Cannon

On Wed, Jan 15, 2014 at 4:24 PM, Ethan Furman et...@stoneleaf.us wrote:

 On 01/15/2014 06:45 AM, Brett Cannon wrote:


 I also think that a 'b' conversion be added to bytes.format(). This
 doesn't have the same issue as %b if you make {}
 implicitly mean {!b} in Python 3.5 as {} will mean what is the most
 accurate for bytes.format() in either version. It
 also allows for explicit support where you know you only want a byte and
 allows {!s} to mean you only want a string (and
 thus throw an error otherwise).


 Given that !b does not exist in Py2, !s (like %s) has to mean bytes when
 working with a byte stream.  Given that, !s and !b would mean the same
 thing, so is it worth adding !b?


I disagree with the assertion. %s has to mean bytes for Python 2
compatibility because there is no equivalent to '{}' (no conversion or
format spec specified); basically %s represents no conversion for the %
operator. But since format() has the concept of a default conversion as
well as explicit conversions you can lean on that fact and let the default
conversion do what makes sense for that version of Python.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Brett Cannon

On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info wrote:

 On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote:
  Neil Schemenauer wrote:
  Objects that implement __str__ can also implement __bytes__ if they
  can guarantee that ASCII characters are always returned,
 
  I think __ascii__ would be a better name. I'd expect
  a method called __bytes__ on an int to return some
  version of its binary value.

 +1


If we are going the route of a new magic method then __ascii__ or
__bytes_format__ get my vote as long as they only return bytes (I see no
need to abbreviate to __bformat__ or __formatb__ when we have method names
as long as __text_signature__ now).

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 08:33 AM, Mark Lawrence wrote:


For completeness I believe %r and %a should be included here as well.


Good point.  Done.

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Mark Lawrence


On 15/01/2014 22:22, Brett Cannon wrote:




On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info
mailto:st...@pearwood.info wrote:

On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote:
  Neil Schemenauer wrote:
  Objects that implement __str__ can also implement __bytes__ if they
  can guarantee that ASCII characters are always returned,
 
  I think __ascii__ would be a better name. I'd expect
  a method called __bytes__ on an int to return some
  version of its binary value.

+1


If we are going the route of a new magic method then __ascii__ or
__bytes_format__ get my vote as long as they only return bytes (I see no
need to abbreviate to __bformat__ or __formatb__ when we have method
names as long as __text_signature__ now).



__bytes_format__ gets my vote as it's blatantly obvious what it does. 
I'm against __ascii__ as I'd automatically associate that with ascii in 
the same way that I associate str with __str__ and repr with __repr__.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Steven D'Aprano

On Wed, Jan 15, 2014 at 10:34:48PM +, Mark Lawrence wrote:
 On 15/01/2014 22:22, Brett Cannon wrote:
 
 
 
 On Wed, Jan 15, 2014 at 5:00 PM, Steven D'Aprano st...@pearwood.info
 mailto:st...@pearwood.info wrote:
 
 On Thu, Jan 16, 2014 at 10:55:31AM +1300, Greg Ewing wrote:
   Neil Schemenauer wrote:
   Objects that implement __str__ can also implement __bytes__ if they
   can guarantee that ASCII characters are always returned,
  
   I think __ascii__ would be a better name. I'd expect
   a method called __bytes__ on an int to return some
   version of its binary value.
 
 +1
 
 
 If we are going the route of a new magic method then __ascii__ or
 __bytes_format__ get my vote as long as they only return bytes (I see no
 need to abbreviate to __bformat__ or __formatb__ when we have method
 names as long as __text_signature__ now).
 
 
 __bytes_format__ gets my vote as it's blatantly obvious what it does. 

What precisely does it do? If it's so obvious, why is this thread so 
long?


 I'm against __ascii__ as I'd automatically associate that with ascii in 
 the same way that I associate str with __str__ and repr with __repr__.

That's a good point. I forgot about ascii().
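
For readers who also forgot it, a brief sketch of what the existing ascii() builtin does. The Tagged class is a hypothetical example.

```python
text = 'caf\xe9'

# ascii() is repr() with non-ASCII characters escaped; the result is a str.
escaped = ascii(text)


class Tagged:
    # ascii() has no dedicated special method: it builds on __repr__.
    def __repr__(self):
        return '<Tagged \xe9>'


escaped_obj = ascii(Tagged())

print(escaped, escaped_obj)
```

A method named __ascii__ would therefore suggest a hook for this builtin, by analogy with str/__str__ and repr/__repr__.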


-- 
Steven

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Glenn Linderman


On 1/15/2014 4:03 PM, Steven D'Aprano wrote:

What precisely does it do? If it's so obvious, why is this thread so
long?


It produces a formatted representation of the object in bytes.  For 
numbers, that would probably be expected to be ASCII digits and punctuation.


But other items are not as obvious.

bytes would probably be expected not to have a __bytes_format__, but if 
a subclass defined one, it might be HEX or Base64 of the base bytes. Or 
if the subclass is ASCII text oriented, it might be the ASCII text 
version of the base bytes (which would be identical to the base bytes, 
except for the type transformation).


str  would probably be expected not to have a __bytes_format__,  but if 
a subclass defined one, it might be HEX or Base64, or it might be a 
specific encoding of the base str.


Other objects might generate an ASCII __repr__, if they define the method.


It took a lot of talk to reach the conclusion, if it has been reached, 
that none of the solutions are general enough without defining something 
like __bytes_format__. And before that, a lot of talk to decide that % 
interpolation already had an ASCII bias.




Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Steven D'Aprano

On Wed, Jan 15, 2014 at 05:46:07PM -0800, Glenn Linderman wrote:
 On 1/15/2014 4:03 PM, Steven D'Aprano wrote:
 What precisely does it do? If it's so obvious, why is this thread so
 long?
 
 It produces a formatted representation of the object in bytes.  For 
 numbers, that would probably be expected to be ASCII digits and punctuation.
 
 But other items are not as obvious.

My point exactly.


-- 
Steven

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Greg Ewing


Ethan Furman wrote:

Well, I'm not sure what booted into touch means,


It's a rugby term, referring to kicking the ball
over the touch line.

As a metaphor, it seems to mean making a problem
go away.

--
Greg

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-15 Thread Ethan Furman


On 01/15/2014 06:45 AM, Brett Cannon wrote:


This is why I have argued that if you specify it as "if there is a format spec
specified, then the return value from calling __format__() will have
str.encode('ascii', 'strict') called on it", you get the support for the
various number-specific format specs for free.


It may work like this under the hood, but it's an implementation detail.  Since the numeric format codes will call int, 
index, or float on the object (to handle subclasses), we could then call __format__ on the resulting int or float to do 
the heavy lifting; but since __format__ on anything else would never be called I don't want to give that impression.



It also means if you pass in a string that you just want the strict ASCII bytes
of then you can get it with {:s}.


This isn't going to happen.  If the user wants a string to be in the byte stream, it has to either be a bytes literal or 
explicitly encoded [1].
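
A minimal sketch of that rule, assuming the behavior of bytes %-formatting as shipped in Python 3.5 and later (a fact about the eventual implementation, not something settled at this point in the thread):

```python
name = 'spam'

# Explicitly encoded: allowed.
ok = b'Name: %s' % name.encode('ascii')

# A str passed directly is rejected, whatever its contents.
try:
    b'Name: %s' % name
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)
```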


--
~Ethan~

[1] Apologies if this has already been answered.  I wanted to make sure I responded to all the ideas/objections, and I may 
have responded more than once to some.  It's been a long few threads.  ;)


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman


Duh.  Here's the text, as well.  ;)


PEP: 461
Title: Adding % and {} formatting to bytes
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman et...@stoneleaf.us
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-01-13
Python-Version: 3.5
Post-History: 2014-01-13
Resolution:


Abstract
========

This PEP proposes adding the % and {} formatting operations from str to bytes.


Proposed semantics for bytes formatting
=======================================

%-interpolation
---------------

All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
will be supported, and will work as they do for str, including the
padding, justification and other related modifiers.

Example::

   >>> b'%4x' % 10
   b'   a'

%c will insert a single byte, either from an int in range(256), or from
a bytes argument of length 1.

Example::

   >>> b'%c' % 48
   b'0'

   >>> b'%c' % b'a'
   b'a'

%s, because it is the most general, has the most convoluted resolution:

  - input type is bytes?
pass it straight through

  - input type is numeric?
use its __xxx__ [1] [2] method and ascii-encode it (strictly)

  - input type is something else?
use its __bytes__ method; if there isn't one, raise an exception [3]

Examples::

   >>> b'%s' % b'abc'
   b'abc'

   >>> b'%s' % 3.14
   b'3.14'

   >>> b'%s' % 'hello world!'
   Traceback (most recent call last):
   ...
   TypeError: 'hello world!' has no __bytes__ method, perhaps you need to encode it?

.. note::

   Because the str type does not have a __bytes__ method, attempts to
   directly use 'a string' as a bytes interpolation value will raise an
   exception.  To use 'string' values, they must be encoded or otherwise
   transformed into a bytes sequence::

  'a string'.encode('latin-1')
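
The %s rules in this draft can be approximated in pure Python as follows. This is a sketch only: resolve_percent_s is a hypothetical name, and the draft leaves both the definition of "numeric" and the choice of __str__ versus __repr__ open, so numbers.Number and str() are assumptions made here.

```python
import numbers


def resolve_percent_s(value):
    # Sketch of the draft rules; not the PEP's actual implementation.
    if isinstance(value, bytes):
        return value                                  # pass straight through
    if isinstance(value, numbers.Number):
        return str(value).encode('ascii', 'strict')   # assumed: __str__
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        raise TypeError('%r has no __bytes__ method, perhaps you need to '
                        'encode it?' % (value,))
    return to_bytes(value)


print(resolve_percent_s(b'abc'), resolve_percent_s(3.14))
```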


format
------

The format mini language will be used as-is, with the behaviors as listed
for %-interpolation.


Open Questions
==============

For %s there has been some discussion of trying to use the buffer protocol
(Py_buffer) before trying __bytes__.  This question should be answered before
the PEP is implemented.


Proposed variations
===================

It has been suggested to use %b for bytes instead of %s.

  - Rejected as %b does not exist in Python 2.x %-interpolation, which is
why we are using %s.

It has been proposed to automatically use .encode('ascii','strict') for str
arguments to %s.

  - Rejected as this would lead to intermittent failures.  Better to have the
operation always fail so the trouble-spot can be correctly fixed.

It has been proposed to have %s return the ascii-encoded repr when the value
is a str  (b'%s' % 'abc'  --> b"'abc'").

  - Rejected as this would lead to hard to debug failures far from the problem
site.  Better to have the operation always fail so the trouble-spot can be
easily fixed.


Footnotes
=========

.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__,
   or if there's any difference
.. [2] Any proper numeric class would then have to provide an ascii
   representation of its value, either via __repr__ or __str__ (whichever
   we choose in [1]).
.. [3] TypeError, ValueError, or UnicodeEncodeError?


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


[Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman

This PEP goes a bit further than PEP 460 does, and hopefully spells things out in enough detail so there is no confusion 
as to what is meant.


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Antoine Pitrou

On Tue, 14 Jan 2014 11:56:25 -0800
Ethan Furman et...@stoneleaf.us wrote:
 
 %s, because it is the most general, has the most convoluted resolution:
 
- input type is bytes?
  pass it straight through

It should try to get a Py_buffer instead.

- input type is numeric?
  use its __xxx__ [1] [2] method and ascii-encode it (strictly)

What is the definition of numeric?

Regards

Antoine.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Brett Cannon

On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman et...@stoneleaf.us wrote:

 This PEP goes a bit further than PEP 460 does, and hopefully spells things
 out in enough detail so there is no confusion as to what is meant.


Are we going down the PEP route with the various ideas? Guido, do you want
one from me as well or should I not bother?

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman


On 01/14/2014 01:05 PM, Brett Cannon wrote:

On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman wrote:


This PEP goes a bit further than PEP 460 does, and hopefully spells
things out in enough detail so there is no confusion as to what is
 meant.


Are we going down the PEP route with the various ideas? Guido, do
 you want one from me as well or should I not bother?


While I can't answer for Guido, I will say I authored this PEP because Antoine didn't want 460 to be any more liberal 
than it already was.


If you collect your ideas together, I'll add them to 461 as questions or discussions or however is appropriate (assuming 
you're willing to go that route).


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman


On 01/14/2014 12:57 PM, Antoine Pitrou wrote:

On Tue, 14 Jan 2014 11:56:25 -0800
Ethan Furman et...@stoneleaf.us wrote:


%s, because it is the most general, has the most convoluted resolution:

- input type is bytes?
  pass it straight through


It should try to get a Py_buffer instead.


Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and 
this should be the first thing we try?

Sounds good.

For that matter, should the first test be does this object support Py_buffer and not worry about it being 
isinstance(obj, bytes)?




- input type is numeric?
  use its __xxx__ [1] [2] method and ascii-encode it (strictly)


What is the definition of numeric?


That is a key question.

Obviously we have int, float, and complex.  We also have Decimal.

But what about Fraction?  Or some user's numeric class that doesn't inherit from a core numeric type?  Wherever we draw 
the line, we need to make sure it's well-documented.


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Yury Selivanov

On January 14, 2014 at 4:36:00 PM, Ethan Furman (et...@stoneleaf.us) wrote:
 
 On 01/14/2014 12:57 PM, Antoine Pitrou wrote:
  On Tue, 14 Jan 2014 11:56:25 -0800
  Ethan Furman wrote:
 
  %s, because it is the most general, has the most convoluted 
 resolution:
 
  - input type is bytes?
  pass it straight through
 
  It should try to get a Py_buffer instead.
 
 Meaning any bytes or bytes-subtype will support the Py_buffer 
 protocol, and this should be the first thing we try?
 
 Sounds good.
 
 For that matter, should the first test be does this object support 
 Py_buffer and not worry about it being
 isinstance(obj, bytes)?
 
 
  - input type is numeric?
  use its __xxx__ [1] [2] method and ascii-encode it (strictly) 
 
  What is the definition of numeric?
 
 That is a key question.

isinstance(o, numbers.Number) ?
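
A quick sketch of where that test would draw the line. The Money class is a hypothetical user type that neither inherits from nor registers with the numbers ABCs.

```python
import numbers
from decimal import Decimal
from fractions import Fraction


class Money:
    # Hypothetical user class, unknown to the numbers ABCs.
    def __init__(self, cents):
        self.cents = cents


candidates = [1, 1.5, 1j, Decimal('1.5'), Fraction(3, 2), Money(150)]
results = {type(c).__name__: isinstance(c, numbers.Number)
           for c in candidates}

print(results)
```

The built-in types, Decimal, and Fraction all pass; an unregistered user class does not unless its author calls numbers.Number.register() or subclasses one of the ABCs.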

Yury

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Antoine Pitrou

On Tue, 14 Jan 2014 13:07:57 -0800
Ethan Furman et...@stoneleaf.us wrote:
 
 Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and 
 this should be the first thing we try?
 
 Sounds good.
 
 For that matter, should the first test be does this object support 
 Py_buffer and not worry about it being 
 isinstance(obj, bytes)?

Yes, unless the implementation wants to micro-optimize stuff.
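
At the Python level, "supports Py_buffer" can be approximated by asking for a memoryview. This is a sketch under that assumption; as_buffer is a hypothetical helper, and the real check would be done in C.

```python
import array


def as_buffer(obj):
    # memoryview() raises TypeError for objects that do not export a buffer.
    try:
        return memoryview(obj)
    except TypeError:
        return None


exporters = [b'abc', bytearray(b'abc'), array.array('B', [97, 98, 99])]
non_exporters = ['abc', 3, 3.14]

print([as_buffer(o) is not None for o in exporters + non_exporters])
```

Testing for the buffer first means bytes needs no special case, since it is itself a buffer exporter.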

  - input type is numeric?
use its __xxx__ [1] [2] method and ascii-encode it (strictly)
 
  What is the definition of numeric?
 
 That is a key question.
 
 Obviously we have int, float, and complex.  We also have Decimal.

The question is also how do you test for them? Decimal is not a core
builtin type. Do we need some kind of __bformat__ protocol?

Regards

Antoine.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Nick Coghlan

On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote:

 On 01/14/2014 12:57 PM, Antoine Pitrou wrote:

 On Tue, 14 Jan 2014 11:56:25 -0800
 Ethan Furman et...@stoneleaf.us wrote:


 %s, because it is the most general, has the most convoluted resolution:

 - input type is bytes?
   pass it straight through


 It should try to get a Py_buffer instead.


 Meaning any bytes or bytes-subtype will support the Py_buffer protocol,
and this should be the first thing we try?

 Sounds good.

 For that matter, should the first test be does this object support
Py_buffer and not worry about it being isinstance(obj, bytes)?

Yep. I actually suggest adjusting the %s handling to:

- interpolate Py_buffer exporters directly
- interpolate __bytes__ if defined
- reject anything with an encode method
- otherwise interpolate str(obj).encode('ascii')
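
The four steps above can be sketched in pure Python as follows. interpolate_percent_s is a hypothetical name, and memoryview() is used as an approximation of the C-level buffer check.

```python
def interpolate_percent_s(obj):
    # 1. Buffer exporters (bytes, bytearray, memoryview, ...) go in directly.
    try:
        return bytes(memoryview(obj))
    except TypeError:
        pass
    # 2. Objects with a __bytes__ method.
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is not None:
        return to_bytes(obj)
    # 3. Anything with an encode method (notably str) is rejected.
    if hasattr(obj, 'encode'):
        raise TypeError('%s must be encoded explicitly'
                        % type(obj).__name__)
    # 4. Everything else: ASCII-encode its str().
    return str(obj).encode('ascii')


print(interpolate_percent_s(b'abc'), interpolate_percent_s(3.14))
```

Note that step 4 can still raise UnicodeEncodeError for an object whose str() contains non-ASCII characters, which is the value-dependent failure raised as a concern elsewhere in the thread.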

 - input type is numeric?
   use its __xxx__ [1] [2] method and ascii-encode it (strictly)


 What is the definition of numeric?


 That is a key question.

As suggested above, I would flip the question and explicitly *disallow*
implicit encoding of any object with its own encode method, while
allowing everything else.

Cheers,
Nick.


 Obviously we have int, float, and complex.  We also have Decimal.

 But what about Fraction?  Or some user's numeric class that doesn't
inherit from a core numeric type?  Wherever we draw the line, we need to
make sure it's well-documented.

 --
 ~Ethan~


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman


On 01/14/2014 02:17 PM, Nick Coghlan wrote:


On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote:


On 01/14/2014 12:57 PM, Antoine Pitrou wrote:


On Tue, 14 Jan 2014 11:56:25 -0800
Ethan Furman et...@stoneleaf.us wrote:



%s, because it is the most general, has the most convoluted resolution:

- input type is bytes?
  pass it straight through



It should try to get a Py_buffer instead.



Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and 
this should be the first thing we try?

Sounds good.

For that matter, should the first test be "does this object support Py_buffer"
and not worry about it being isinstance(obj, bytes)?


Yep. I actually suggest adjusting the %s handling to:

- interpolate Py_buffer exporters directly
- interpolate __bytes__ if defined
- reject anything with an encode method
- otherwise interpolate str(obj).encode('ascii')


- input type is numeric?
  use its __xxx__ [1] [2] method and ascii-encode it (strictly)



What is the definition of numeric?



That is a key question.


As suggested above, I would flip the question and explicitly *disallow* 
implicit encoding of any object with its own
encode method, while allowing everything else.


Um, int and floats (for example) don't have an .encode method, don't export Py_buffer, don't have a __bytes__ method... 
Ah! so it would hit the last case, I see.


The danger I see with that route is that any ol' object could then make it into the byte stream, and considering what 
byte streams are for I think we should make the barrier for entry higher than just relying on a __str__ or __repr__.
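
To make the concern concrete, here is what the proposed last step would let through. fallback_encode is a hypothetical stand-in for "otherwise interpolate str(obj).encode('ascii')", not part of any proposal text.

```python
def fallback_encode(obj):
    # Hypothetical stand-in for the final fallback step of the %s proposal.
    return str(obj).encode("ascii")


# None and containers all have a str() form, so nothing stops them from
# being written into the byte stream.
leaked = [fallback_encode(value) for value in (None, [1, 2], {"a": 1})]
```

Each of these succeeds silently, so a stray None or list would end up in protocol data as its Python repr.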


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Nick Coghlan

On 15 Jan 2014 08:23, Ethan Furman et...@stoneleaf.us wrote:

 On 01/14/2014 02:17 PM, Nick Coghlan wrote:


 On 15 Jan 2014 07:36, Ethan Furman et...@stoneleaf.us wrote:


 On 01/14/2014 12:57 PM, Antoine Pitrou wrote:


 On Tue, 14 Jan 2014 11:56:25 -0800
 Ethan Furman et...@stoneleaf.us wrote:



 %s, because it is the most general, has the most convoluted
resolution:

 - input type is bytes?
   pass it straight through



 It should try to get a Py_buffer instead.



 Meaning any bytes or bytes-subtype will support the Py_buffer protocol,
and this should be the first thing we try?

 Sounds good.

 For that matter, should the first test be "does this object support
Py_buffer" and not worry about it being isinstance(obj, bytes)?


 Yep. I actually suggest adjusting the %s handling to:

 - interpolate Py_buffer exporters directly
 - interpolate __bytes__ if defined
 - reject anything with an encode method
 - otherwise interpolate str(obj).encode('ascii')

 - input type is numeric?
   use its __xxx__ [1] [2] method and ascii-encode it (strictly)



 What is the definition of numeric?



 That is a key question.


 As suggested above, I would flip the question and explicitly *disallow*
implicit encoding of any object with its own
 encode method, while allowing everything else.


 Um, int and floats (for example) don't have an .encode method, don't
export Py_buffer, don't have a __bytes__ method... Ah! so it would hit the
last case, I see.

 The danger I see with that route is that any ol' object could then make
it into the byte stream, and considering what byte streams are for I think
we should make the barrier for entry higher than just relying on a __str__
or __repr__.

Yeah, reading the other thread pointed out the issues with this idea
(containers in particular are a problem).

I think Brett has the right idea: we shouldn't try to accept numbers for %s
in binary interpolation. If we limit it to just buffer exporters and
objects with a __bytes__ method then the problem goes away.

The numeric codes all exist in Python 2, so the porting requirement to the
common 2/3 subset will be to update the cases of binary interpolation of a
number with %s to use an appropriate numeric formatting code instead.
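
On Python 3.5 and later, where %-formatting on bytes is available, the rule described here can be checked directly. The Frame class and the variable names below are hypothetical and exist only for this illustration.

```python
class Frame:
    """Hypothetical payload type that opts in through __bytes__."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        return self.payload


length = 5

# Numbers go through a numeric code; the same spelling works on Python 2 str.
header = b"Content-Length: %d\r\n" % length

# Buffer exporters and objects with __bytes__ are accepted by %s.
body = b"%s%s" % (bytearray(b"he"), Frame(b"llo"))

# A number passed to %s is rejected instead of being implicitly converted.
try:
    b"Content-Length: %s\r\n" % length
except TypeError:
    number_rejected = True
else:
    number_rejected = False
```

For code targeting the common 2/3 subset, the porting step is therefore a one-character change from %s to %d (or another numeric code) wherever a number is interpolated into binary data.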

Cheers,
Nick.



 --
 ~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Guido van Rossum

I think of PEP 460 as the strict version and PEP 461 as the lenient
version. I don't think it makes sense to have more variants. So please
collaborate with whichever you like best. :-)

On Tue, Jan 14, 2014 at 1:11 PM, Ethan Furman et...@stoneleaf.us wrote:
 On 01/14/2014 01:05 PM, Brett Cannon wrote:

 On Tue, Jan 14, 2014 at 2:55 PM, Ethan Furman wrote:

 This PEP goes a bit further than PEP 460 does, and hopefully spells
 things out in enough detail so there is no confusion as to what is
  meant.


 Are we going down the PEP route with the various ideas? Guido, do
  you want one from me as well or should I not bother?


 While I can't answer for Guido, I will say I authored this PEP because
 Antoine didn't want 460 to be any more liberal than it already was.

 If you collect your ideas together, I'll add them to 461 as questions or
 discussions or however is appropriate (assuming you're willing to go that
 route).

 --
 ~Ethan~



-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Mark Lawrence


On 14/01/2014 19:55, Ethan Furman wrote:

This PEP goes a bit further than PEP 460 does, and hopefully spells
things out in enough detail so there is no confusion as to what is meant.

--
~Ethan~


Out of plain old curiosity do we have to consider PEP 292 string 
templates in any way, shape or form, or regarding this debate have they 
been safely booted into touch?


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Serhiy Storchaka


15.01.14 00:40, Guido van Rossum wrote:

I think of PEP 460 as the strict version and PEP 461 as the lenient
version. I don't think it makes sense to have more variants. So please
collaborate with whichever you like best. :-)


Perhaps the consensus will be PEP 460.5? Or PEP 460.3, or maybe PEP 460.7?



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Glenn Linderman


On 1/14/2014 2:38 PM, Nick Coghlan wrote:


I think Brett has the right idea: we shouldn't try to accept numbers 
for %s in binary interpolation. If we limit it to just buffer 
exporters and objects with a __bytes__ method then the problem goes away.


The numeric codes all exist in Python 2, so the porting requirement to 
the common 2/3 subset will be to update the cases of binary 
interpolation of a number with %s to use an appropriate numeric 
formatting code instead.



+1

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-14 Thread Ethan Furman


On 01/14/2014 05:02 PM, Glenn Linderman wrote:

On 1/14/2014 2:38 PM, Nick Coghlan wrote:


I think Brett has the right idea: we shouldn't try to accept numbers
for %s in binary interpolation. If we limit it to just buffer
exporters and objects with a __bytes__ method then the problem goes away.

The numeric codes all exist in Python 2, so the porting requirement to
the common 2/3 subset will be to update the cases of binary
interpolation of a number with %s to use an appropriate numeric
formatting code instead.


+1


Agreed, PEP updated.

--
~Ethan~

