On 9/7/06, David Hopwood <[EMAIL PROTECTED]> wrote:
> Yes. However, this is not a good idea for precisely the reason described
> on that page (false detection of Unicode), and so any Unicode detection
> algorithm in Python should only be based on detecting a BOM, IMHO.
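
For concreteness, the BOM-only detection described in the quote might look something like the sketch below. The name detect_bom is hypothetical and is not an existing or proposed Python API. The sketch only assumes the BOM constants from the standard codecs module and makes no attempt at heuristics.

```python
import codecs

# Longest BOMs first: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so the order of this table matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present.

    Hypothetical helper for illustration: it inspects only the leading
    bytes and never guesses an encoding from the content itself.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

Anything written without a BOM comes back as (None, 0), so the caller still has to fall back to a default or an explicitly supplied encoding.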

Right, except BOMs break tons of Unix applications (and even the occasional
Windows one) that do not expect them. That leaves Python nearly unable to
detect Unicode on Unix, which is quite unfortunate for those of us rooting
for UTF-8.

Perhaps there are better heuristics that are worth considering. Perhaps
not. Either way, heuristic detection certainly shouldn't be the default
behaviour of a TextFile constructor.

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000