On 8/27/07, Barry Warsaw <[EMAIL PROTECTED]> wrote:
> On Aug 27, 2007, at 4:38 PM, Guido van Rossum wrote:
>
> > I'm still working on stricter enforcement of the "don't mix str and
> > bytes" rule. I'm finding a lot of trivial problems, which are
> > relatively easy to fix but time-consuming.
> >
> > While doing this, I realize there are two idioms for converting a str
> > to bytes: s.encode(e) or bytes(s, e). These have identical results. I
> > think we can't really drop s.encode(), for symmetry with b.decode().
> > So is bytes(s, e) redundant?
>
> I think it might be. I've hit this several times while working on the
> email package and it's certainly confusing. I've also run into
> situations where I did not like the default e=utf-8 argument for
> bytes(). Sometimes I am able to work around failures by doing this:
> "bytes(ord(c) for c in s)" until I found "bytes(s, 'raw-unicode-escape')"
>
> I'm probably doing something really dumb to need that, but it does
> get me farther along. I do intend to go back and look at those
> (there are only a few) when I get the rest of the package working again.
>
> Getting back to the original question, I'd like to see "bytes(s, e)"
> dropped in favor of "s.encode(e)" and maayyybeee (he says bracing for
> the shout down) "bytes(s)" to be defined as
> "bytes(s, 'raw-unicode-escape')".

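To make the idioms quoted above concrete, here is a minimal sketch. It
assumes a current Python 3 interpreter rather than the 2007 py3k branch
under discussion, and the sample strings are invented for illustration.

```python
# Illustrative sketch only; the sample strings are made up for this example.
s = "h\u00e9llo"  # every code point is below 256

# The two idioms under discussion give the same bytes.
via_method = s.encode("utf-8")
via_constructor = bytes(s, "utf-8")
assert via_method == via_constructor

# The generator workaround maps each code point to one byte, so it
# matches Latin-1 and raw-unicode-escape here, but not UTF-8.
via_ord = bytes(ord(c) for c in s)
assert via_ord == s.encode("latin-1") == s.encode("raw-unicode-escape")
assert via_ord != via_method

# Above U+00FF the two diverge: ord() no longer fits in a byte,
# while raw-unicode-escape falls back to a \uXXXX escape.
euro = "\u20ac"
try:
    bytes(ord(c) for c in euro)
    ord_workaround_failed = False
except ValueError:
    ord_workaround_failed = True
escaped = euro.encode("raw-unicode-escape")
```

For strings limited to code points below 256, the ord() workaround and
'raw-unicode-escape' agree, which may be why either one got the email
code "farther along"; they stop agreeing as soon as a character
outside that range appears.
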
I see a consensus developing for dropping bytes(s, e). Start avoiding
it like the plague now to help reduce the work needed once it's
actually gone.

But I don't see the point of defaulting to raw-unicode-escape --
what's the use case for that? I think you should just explicitly say
s.encode('raw-unicode-escape') where you need that. Any reason you
can't?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000