M.-A. Lemburg writes: > That would be a possibility as well... but I doubt that many users > are going to bother, since slicing surrogates is just as bad as > slicing combining code points and the latter are much more common in > real life and they do happen to mostly live in the BMP.
That's only if you require 100% fidelity in the data, which may not be true in some use cases. Where 99.99% fidelity is good enough, an unexpected sliced surrogate pair is a show-stopper, while a sliced combining character sequence not only doesn't stop the show (at least in Python, and I doubt any correct Unicode process can signal a fatal error there either, I can put a tilde on a Cyrillic character if I want to, no?), it's probably readable enough that readers will assume a keypunch error. Personally, if available I would always use some such dodge in server software (I don't care enough about 24x7 availability to write it myself, though). And never in a script for interactive use; something needs fixing, may as well take the fatal error and fix it on the spot. (Again, "on the spot" for me can mean "tomorrow".) _______________________________________________ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
