On 6/4/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > It seems to me the simplest thing to do is to require that Python
> > source files be normalized. Then the ambiguity just goes away.
> > Everyone knows what form their files should be in, and if you really
> > need to construct a non-normalized string, you can do that explicitly
> > using "\u" notation.

> However, what would that mean wrt. non-Unicode source encodings.
> Say you have a Latin-1-encoded source code. Is that in NFC or not?

Doesn't that depend on whether they happened to ever write some of the
combined characters (such as ö) using a two-character form like o¨?

FWIW, I would prefer "the parser will normalize" to "the parser will
reject unnormalized", to support even the dumbest of editors.

-jJ
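To make the NFC question concrete, here is a minimal sketch using the standard-library unicodedata module. The helper name is_nfc and the sample strings are illustrative assumptions, not anything from the thread. It also assumes the "two-character form" might mean either o followed by COMBINING DIAERESIS (U+0308, which Latin-1 cannot encode) or o followed by the spacing DIAERESIS (U+00A8, which it can).

```python
import unicodedata

def is_nfc(text):
    """Return True if text is unchanged by NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", text) == text

composed = "\u00f6"      # ö as a single precomposed code point
decomposed = "o\u0308"   # o + COMBINING DIAERESIS (not representable in Latin-1)
spacing = "o\u00a8"      # o + spacing DIAERESIS (representable in Latin-1)

# The combining form is not NFC; a normalizing parser would fold it to ö.
normalized = unicodedata.normalize("NFC", decomposed)

# The spacing diaeresis does not compose under NFC, so "o¨" stays two characters.
spacing_is_nfc = is_nfc(spacing)

# Decode every possible Latin-1 byte and check the result as a whole.
latin1_all = bytes(range(256)).decode("latin-1")
latin1_is_nfc = is_nfc(latin1_all)
```

Under these assumptions, text decoded from Latin-1 is already in NFC, because Latin-1 has no combining marks. The question only becomes live for source encodings that can carry combining characters.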