On 2/14/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 2/13/06, Adam Olsen <[EMAIL PROTECTED]> wrote:
> > If I understand correctly there's three main candidates:
> > 1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x
>
> I'm not sure what you mean, but I'm guessing you're thinking that the
> repr() of a bytes object created from bytes('abc\xf0') would be
>
>   bytes('abc\xf0')
>
> under this rule. What's so bad about that?

See below.

> > 2. Direct copying to str/unicode if it's only ascii values, switching
> > to a list of hex literals if there's any non-ascii values
>
> That works for me too. But why hex literals? As MvL stated, a list of
> decimals would be just as useful.

PEBKAC. Yeah, decimals are simpler and shorter even.

> > 3. b"foo" literal with ascii for all ascii characters (other than \
> > and "), \xFF for individual characters that aren't ascii
> >
> > Given the choice I prefer the third option, with the second option as
> > my runner up. The first option just screams "silent errors" to me.
>
> The 3rd is out of the running for many reasons.
>
> I'm not sure I understand your "silent errors" fear; can you elaborate?

I think it's that someone will create a unicode object with real
latin-1 characters and it'll get passed through without errors, the
code assuming it's 8bit-as-latin-1. If they had put other unicode
characters in they would have gotten an exception instead.

However, at this point all the posts on latin-1 encoding/decoding have
become so muddled in my mind that I don't know what they're
suggesting. I think I'll wait for the pep to clear that up.

--
Adam Olsen, aka Rhamphoryncus
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
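
The two repr() candidates still in the running, and the "silent errors" worry, may be easier to follow with something concrete. What follows is only a rough sketch in present-day Python 3, not anything proposed in the thread: the helper names (repr_latin1, repr_ascii_or_list, to_bytes_latin1) are made up for illustration, and the bytes('...') spelling simply mirrors the example quoted above.

```python
# Hypothetical helpers for illustration only; none of these names come
# from the thread or from Python itself.

def repr_latin1(data: bytes) -> str:
    """Candidate 1: copy each byte to the code point of the same value,
    i.e. pretend the data is latin-1 text."""
    text = data.decode("latin-1")        # never fails: every byte maps to U+0000..U+00FF
    return "bytes(" + ascii(text) + ")"  # ascii() escapes non-ASCII characters as \xNN


def repr_ascii_or_list(data: bytes) -> str:
    """Candidate 2: show text if every byte is ASCII, otherwise fall back
    to a list of decimal values."""
    if all(b < 128 for b in data):
        return "bytes(" + ascii(data.decode("ascii")) + ")"
    return "bytes(" + repr(list(data)) + ")"


def to_bytes_latin1(text: str) -> bytes:
    """The '8bit-as-latin-1' assumption: only code points above U+00FF raise."""
    return text.encode("latin-1")


print(repr_latin1(b"abc\xf0"))         # candidate 1
print(repr_ascii_or_list(b"abc"))      # candidate 2, pure ASCII input
print(repr_ascii_or_list(b"abc\xf0"))  # candidate 2, non-ASCII input

# The "silent errors" case: real latin-1 text passes through unnoticed...
print(to_bytes_latin1("caf\u00e9"))

# ...while a character outside latin-1 fails loudly.
try:
    to_bytes_latin1("\u20ac")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

The point of the last two calls is that text containing genuine latin-1 characters such as U+00E9 is indistinguishable from 8-bit data under candidate 1, so a mix-up goes unreported, whereas a character such as U+20AC cannot be encoded and raises an exception.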