On Mon, Jan 13, 2014 at 12:42 PM, R. David Murray <rdmur...@bitdance.com> wrote:
> On Mon, 13 Jan 2014 12:41:18 +0100, Antoine Pitrou <solip...@pitrou.net> wrote:
>> On Sun, 12 Jan 2014 18:11:47 -0800
>> Guido van Rossum <gu...@python.org> wrote:
>> > On Sun, Jan 12, 2014 at 5:27 PM, Ethan Furman <et...@stoneleaf.us> wrote:
>> > > On 01/12/2014 04:47 PM, Guido van Rossum wrote:
>> > >> %s seems the trickiest: I think with a bytes argument it should just
>> > >> insert those bytes (and the padding modifiers should work too), and
>> > >> for other types it should probably work like %a, so that it works as
>> > >> expected for numeric values, and with a string argument it will return
>> > >> the ascii()-variant of its repr(). Examples:
>> > >>
>> > >> b'%s' % 42 == b'42'
>> > >> b'%s' % 'x' == b"'x'" (i.e. the three-byte string containing an 'x'
>> > >> enclosed in single quotes)
>> > >
>> > > I'm not sure about the quotes. Would anyone ever actually want those in the
>> > > byte stream?
>> >
>> > Perhaps not, but it's a hint that you should probably think about an
>> > encoding. It's symmetric with how '%s' % b'x' returns "b'x'". Think of
>> > it as payback time. :-)
>>
>> What is the use case for embedding a quoted ASCII-encoded representation
>> in a byte stream?
>
> There is no use case in the sense you are asking, just like there is no
> real use case for '%s' % b'x' producing "b'x'". But the real use case
> is exactly the same: to let you know your code is screwed up without
> actually blowing up with a encoding Exception.
>
> For the record, I like Guido's logic and proposal. I don't understand
> Nick's objection, since I don't see the difference between the situation
> here where a string gets interpolated into bytes as 'xxx' and the
> corresponding situation where bytes gets interpolated into a string
> as b'xxx'. Why struggle to keep bytes interpolation "pure" if string
> interpolation isn't?
>
> Guido's proposal makes the language more symmetric, and thus more
> consistent and less surprising. Exactly the hallmarks of Python's design
> sense, IMO. (Big surprise, right? :)
>
> Of course, this point of view *is* based on the idea that when you are
> doing interpolation using %/.format, you are in fact primarily concerned
> with ASCII compatible byte streams. This is a Practicality sort of
> argument. It is, after all, by far the most common use case when
> doing interpolation[*].
>
> If you wanted to do a purist version of this symmetry, you'd have bytes(x)
> calling __bytes__ if it was defined and falling back to calling a
> __brepr__ otherwise.
>
> But what would __brepr__ implement? The variety of format codes in
> the struct module argues that there is no "one obvious" binary
> repr for most types. (Those that have one would implement __bytes__).
> And what would be the __brepr__ of an arbitrary 'object'?
>
> Faced with the impracticality of defining __brepr__ usefully in any "pure
> bytes" form, it seems sensible to admit that the most useful __brepr__
> is the ascii() encoding of the __repr__. Which naturally produces 'xxx'
> as the __brepr__ of a string.
>
> This does cause things to get a little un-pretty when you are operating
> at the python prompt:
>
>     >>> b'%s' % object
>     b'"<class \\\'object\\\'>"'
>
> But then again that is most likely really not what you mean to do, so
> it becomes a big red flag...just like b'xxx' is a small red flag when
> you accidentally interpolate unencoded bytes into a string.
>
> --David
>
> PS: When I first read Guido's remark that the result of interpolating a
> string should be 'xxx', I went Wah? I had to reason my way through to
> it as above, but to him it was just the natural answer. Guido isn't
> always right, but this kind of automatic language design consistency
> is one reason he's the BDFL.
>
> [*] I still think that you mostly want to design your library so that
> you are handling the text parts as text and the bytes parts as bytes,
> and encoding/gluing them as appropriate at the IO boundary. But if Guido
> says his real code would benefit by being able to interpolate ASCII into
> bytes at certain points, I'll believe him.

<elided rant/>

If you think corrupted data is easier or more pleasant to track down
than encoding exceptions then I think you are strange. It makes porting
really difficult while you are still trying to figure out where the
bytes/str boundaries are. I am now deeply suspicious of all % formatting.
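
For anyone following the thread, here is a minimal Python sketch of the %s rule as Guido describes it in the quoted text: bytes are inserted unchanged, and anything else is rendered like %a, i.e. the ASCII-encoded ascii() of the value. The helper name proposed_percent_s is hypothetical and is not part of any Python release; it only emulates the proposal so that the quoting behaviour under discussion can be compared with the existing str-side behaviour.

```python
def proposed_percent_s(value):
    """Hypothetical helper: emulate the %s rule proposed in this thread.

    This is an illustration of the proposal only, not an existing API.
    """
    # A bytes (or bytearray) argument is inserted as-is.
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    # Anything else behaves like %a: take ascii(value) and encode it.
    return ascii(value).encode("ascii")


# Bytes-side behaviour under the proposal.
print(proposed_percent_s(42))    # numeric values look as expected
print(proposed_percent_s("x"))   # a str argument keeps its quotes
print(proposed_percent_s(b"x"))  # bytes pass through unchanged

# Existing str-side behaviour that the proposal mirrors.
print("%s" % b"x")               # the b prefix and quotes end up in the text
```

Neither direction raises an exception. Both leave a visible marker in the output (the quotes, or the b prefix), which is the trade-off the reply above objects to.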