On 2/13/06, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> Guido van Rossum <[EMAIL PROTECTED]> wrote:
> >> In py3k, when the str object is eliminated, then what do you have?
> >> Perhaps
> >> - bytes("\x80"), you get an error, encoding is required. There is no
> >> such thing as "default encoding" anymore, as there's no str object.
> >> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
> >> single byte of value 0x80.
> >
> > Yes to both again.
>
> I haven't been following this discussion about bytes() real closely
> but I don't think that bytes() should do the encoding. We already
> have a way to spell that:
>
>     "\x80".encode('latin-1')

But in 2.5 we can't change that to return a bytes object without
creating HUGE incompatibilities.

In general I've come to appreciate that there are two ways of
converting an object of type A to an object of type B: ask an A
instance to convert itself to a B, or ask the type B to create a new
instance from an A. Depending on what A and B are, both APIs make
sense; sometimes reasons of decoupling require that A can't know about
B, in which case you have to use the latter approach; sometimes B
can't know about A, in which case you have to use the former.

Even when A == B we sometimes support both APIs: to create a new list
from a list a, you can write a[:] or list(a); to create a new dict
from a dict d, you can write d.copy() or dict(d).

An advantage of the latter API is that there's no confusion about the
resulting type -- dict(d) is definitely a dict, and list(a) is
definitely a list. Not so for d.copy() or a[:] -- if the input type is
another mapping or sequence, it'll probably return an object of that
same type. Again, it depends on the application which is better.

I think that bytes(s, <encoding>) is fine, especially for expressing a
new type, since it is unambiguous about the result type, and has no
backwards compatibility issues.

> Also, I think it would be useful to introduce byte array literals at
> the same time as the bytes object. That would allow people to use
> byte arrays without having to get involved with all the silly string
> encoding confusion.

You missed the part where I said that introducing the bytes type
*without* a literal seems to be a good first step. A new type, even
built-in, is much less drastic than a new literal (which requires
lexer and parser support in addition to everything else).

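
The difference between the two conversion APIs discussed above can be
sketched as follows. This is only an illustration, assuming a current
Python 3 interpreter rather than the 2.5 under discussion; the variable
names are made up for the example, and OrderedDict and a tuple stand in
for "another mapping or sequence".

```python
from collections import OrderedDict

# Sequence case: slicing keeps the input's type, list() always gives a list.
t = (1, 2, 3)
sliced = t[:]        # a tuple, same type as the input
listed = list(t)     # a list, whatever the input was

# Mapping case: .copy() keeps the input's type, dict() always gives a dict.
od = OrderedDict(a=1, b=2)
copied = od.copy()   # an OrderedDict, same type as the input
as_dict = dict(od)   # a plain dict, whatever the input was

# The same two spellings for text -> bytes, as they behave in Python 3:
via_type = bytes("\x80", encoding="latin-1")   # ask the target type
via_method = "\x80".encode("latin-1")          # ask the source object

# Without an encoding, constructing bytes from text is rejected.
try:
    bytes("\x80")
    needs_encoding = False
except TypeError:
    needs_encoding = True

print(type(sliced).__name__, type(listed).__name__)
print(type(copied).__name__, type(as_dict).__name__)
print(via_type == via_method, needs_encoding)
```

In each pair the constructor spelling fixes the result type at the call
site, while the method or slice spelling leaves it up to the input
object.
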
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev