On 1/13/2014 1:40 PM, Brett Cannon wrote:

> So bytes formatting really needn't (and shouldn't, IMO) mirror str
> formatting.

This was my presumption in writing byteformat().

I think one of the things about Guido's proposal that bugs me is that it
breaks the mental model of the .format() method from str in terms of how
the mini-language works. For str.format() you have the conversion and
the format spec (e.g. "{!r}" and "{:d}", respectively). You apply the
conversion by calling the appropriate built-in, e.g. 'r' calls repr().
The format spec semantically gets passed with the object to format()
which calls the object's __format__() method: ``format(number, 'd')``.
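
To make the conversion / format-spec split concrete, here is a small sketch. The ``Temperature`` class is a hypothetical type invented for illustration; only ``repr()``, ``format()`` and ``str.format()`` are real built-ins.

```python
class Temperature:
    """Hypothetical type, used only to show where the format spec ends up."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        return "Temperature({!r})".format(self.degrees)

    def __format__(self, format_spec):
        # The text after ':' in a replacement field arrives here unchanged.
        return format(self.degrees, format_spec) + " C"


t = Temperature(21.5)

# '!r' is a conversion: repr() is applied before any formatting happens.
converted = "{!r}".format(t)

# ':.2f' is a format spec: it is handed to format(), which calls __format__().
formatted = "{:.2f}".format(t)
```

Here ``converted`` is the same string as ``repr(t)`` and ``formatted`` is the same string as ``format(t, '.2f')``.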

Now Guido's suggestion has two parts that affect the mini-language for
.format(). One is that for bytes.format() the default conversion is
bytes() instead of str(), which is fine (probably want to add 'b' as a
conversion value as well to be consistent). But the other bit is that
the format spec goes from semantically meaning ``format(thing,
format_spec)`` to ``format(thing, format_spec).encode('ascii',
'strict')`` for at least numbers. That implicitness bugs me as I have
always thought of format specs just leading to a call to format(). I
think I can live with it, though, as long as it is **consistently**
applied across the board for bytes.format(): every use of a format spec
leads to calling ``format(thing, format_spec).encode('ascii',
'strict')`` no matter what type 'thing' is, and it is clearly
documented that this is done to ease porting and handle the common
case.

This is how my byteformat function works, except that when no
format_spec is given, bytes and bytearray objects are left unchanged
rather than being decoded and encoded again.
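
The rule described above can be sketched roughly as follows. This is not the actual byteformat() code; ``format_field_as_bytes`` is a hypothetical helper name, and the sketch handles a single replacement field rather than a whole format string.

```python
def format_field_as_bytes(thing, format_spec=""):
    """Hypothetical helper: render one replacement field to bytes."""
    if not format_spec and isinstance(thing, (bytes, bytearray)):
        # No format spec: bytes-like objects pass through without a
        # decode/encode round trip.
        return bytes(thing)
    # Otherwise format as text, then encode strictly as ASCII.
    return format(thing, format_spec).encode("ascii", "strict")


# Text formatted with an explicit 's' spec comes out as ASCII bytes.
header = b"Content-Type: " + format_field_as_bytes("image/jpeg", "s")
```

Under this rule, non-ASCII text raises UnicodeEncodeError instead of being encoded silently with some other codec.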

This even gives people in-place ASCII encoding for strings by always
using '{:s}' with text which they can do when they port their code to
run under both Python 2 and 3. So you should be able to do
``b'Content-Type: {:s}'.format('image/jpeg')`` and have it give ASCII.
If you want more explicit encoding to latin-1 then you need to do it
explicitly and not rely on the mini-language to do tricks for you.

IOW I want to treat the format mini-language as a language and thus not
have any special-casing or massive shifts in meaning between
str.format() and bytes.format() so my mental model doesn't have to
contort based on whether it's str or bytes. My preference is not to have
any, but if Guido is going to say PBP here then I want absolute
consistency across the board in how bytes.format() tweaks things.

As for %s for the % operator calling ascii(), I think that will be a
porting nightmare of finding out why your bytes suddenly stopped being
formatted properly and then having to crawl through all of your code for
that one use of %s which is getting bytes in. By raising a TypeError you
will very easily detect where your screw-up occurred thanks to the
traceback; doing otherwise feels too much like implicit type conversion,
and ask any JavaScript developer how that can be a bad thing.
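
A rough illustration of the difference, using two hypothetical helpers (neither is a proposed API): one falls back to ascii() silently, the other raises TypeError at the point of the mistake.

```python
def interpolate_lenient(value):
    """Hypothetical: fall back to ascii() for anything that is not bytes."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    return ascii(value).encode("ascii")


def interpolate_strict(value):
    """Hypothetical: refuse anything that is not bytes."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    raise TypeError(
        "expected a bytes-like object, got {}".format(type(value).__name__)
    )


# The lenient version succeeds, but the repr-style quotes from ascii()
# end up inside the output, and nothing points at the call that caused it.
lenient = b"Content-Type: " + interpolate_lenient("image/jpeg")
```

With the lenient helper the stray quotes only show up later, in whatever consumes the bytes; with the strict helper the traceback names the offending call directly.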

I personally would not add 'bytes % whatever'.

--
Terry Jan Reedy
