On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> > I'd love to get rid of full-width ASCII and halfwidth kana (via
> > compatibility decomposition).  Native Japanese speakers often use
> > them interchangeably with the "proper" versions when correcting
> > typos and updating numbers in a series.  Ugly, to say the least.
> > I don't think that native Japanese would care, as long as the
> > decomposition is done internally to Python.
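For concreteness, the "compatibility decomposition" being asked for
above is exactly what NFKC applies; a quick illustration with the
stdlib (the sample strings are just made-up full-width / halfwidth
text):

    import unicodedata

    # Full-width ASCII letters/digits (U+FF01..U+FF5E) and halfwidth
    # katakana are "compatibility" characters; NFKC folds them onto
    # the ordinary code points.
    fullwidth = "Ｐｙｔｈｏｎ３"
    halfwidth = "ﾊﾟｲｿﾝ"          # halfwidth kana plus a voicing mark

    print(unicodedata.normalize("NFKC", fullwidth))   # Python3
    print(unicodedata.normalize("NFKC", halfwidth))   # パイソン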
> Not sure what the proposal is here. If people say "we want the PEP to
> do NFKC", I understand that as "instead of saying NFC, it should say
> NFKC", which in turn means "all identifiers are converted into the
> normal form NFKC while parsing".

I would prefer that.

> With that change, the full-width ASCII characters would still be
> allowed in source - they just wouldn't be different from the regular
> ones anymore when comparing identifiers.

I *think* that would be OK; so long as they mean the same thing, it is
just a quirk like using a different font.

I am slightly concerned that it might mean "string as string" and
"string as identifier" would have different tests for equality (see
the sketch at the end of this mail).

> Another option would be to require that the source is in NFKC already,
> where I then ask again what precisely that means in presence of
> non-UTF source encodings.

My own opinion is that it would be reasonable to put those in NFKC
form as part of the parser's internal translation to Unicode.  (But I
agree that it makes sense to do that for all encodings, if it is done
for any.)
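To make that equality concern concrete, here is a minimal sketch; the
as_identifier helper is hypothetical, and just simulates the NFKC step
the tokenizer would apply to identifier tokens under the proposal:

    import unicodedata

    def as_identifier(token):
        # Hypothetical helper: simulate the NFKC normalization the
        # tokenizer would apply to identifiers under this proposal.
        return unicodedata.normalize("NFKC", token)

    a, b = "Ａ", "A"    # full-width vs. ordinary capital A

    print(a == b)                                # False ("string as string")
    print(as_identifier(a) == as_identifier(b))  # True ("string as identifier")

-jJ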