On Sun, Oct 3, 2010 at 9:00 AM, R. David Murray <rdmur...@bitdance.com> wrote:
> I do not propose that this is a *good* API, since it has the classic
> problem that if there are coding bugs in the email module strings may
> "escape" that have surrogates in them and we end up with programs that
> work most of the time....except when they fail with mysterious errors
> because of unusual bytes input data. On the other hand you always
> *know* when you have bytes data in an unknown encoding (because they
> are surrogate escaped), so it is ever so much better than the Python2
> situation.

It's a similar concept to one Antoine and I (and some others) have been considering in the tracker for making urllib.parse able to handle ASCII-compatible bytes-encodings. I've already implemented a version of that patch which has parallel bytes and str versions of all the ASCII constants, and the result is pretty ugly. My next goal is to implement a version that uses the same trick you have here for email and see how the code complexity compares.

We do need to tread carefully to make sure the pseudo strings don't escape, but the other approach requires similar care all the way through the internal algorithms to make sure they aren't assuming bytes or str instances anywhere.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia