On Wed, Dec 14, 2011 at 3:27 AM, Aaron Bentley <aa...@canonical.com> wrote:
> On 11-12-13 01:26 AM, Robert Collins wrote:
>> We could say 'utf8' and leave it at that. Or we could say 'the
>> printable subset of ascii' or some such. I'd just say non
>> whitespace utf8, as strings are easier to deal with, and avoiding
>> whitespace avoids most likely encoding issues.
>
> If we're using Unicode identifiers, do we need to specify a
> normalization form (NFC or NFD), or would we impose that on the data
> after receiving it?

http://www.w3.org/TR/charmod-norm/ - NFC. Remember that they are
opaque, so no service ever has a reason to renormalise an identifier.

-Rob
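
To make the point above concrete, here is a minimal Python sketch of how a service might handle such identifiers. The function names (mint_identifier, is_acceptable_identifier) are hypothetical and not part of any Launchpad API. It also assumes that whichever service issues an identifier normalises it to NFC once, at creation time, and that every other service only validates and compares; the thread does not spell out who does the normalising.

```python
import unicodedata


def mint_identifier(raw: str) -> str:
    """Hypothetical issuer-side helper: normalise once, at creation time."""
    ident = unicodedata.normalize("NFC", raw)
    if not ident or any(ch.isspace() for ch in ident):
        raise ValueError("identifier must be non-empty and contain no whitespace")
    return ident


def is_acceptable_identifier(ident: str) -> bool:
    """Hypothetical receiver-side check: validate only, never rewrite.

    Receivers treat the identifier as opaque, so they reject anything
    that is not already in NFC instead of renormalising it.
    """
    if not ident or any(ch.isspace() for ch in ident):
        return False
    return unicodedata.normalize("NFC", ident) == ident


# Composed (NFC) and decomposed (NFD) spellings of the same visible text.
composed = "caf\u00e9"      # 'e' with acute as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# Opaque comparison is plain string equality, so these are different
# identifiers unless the issuer normalised before handing them out.
same_without_normalising = composed == decomposed
same_after_minting = mint_identifier(decomposed) == composed
```

Because comparison is plain equality on the stored string, a receiver that silently renormalised could end up with a key that differs from the one the issuer handed out, which is one reason to validate and reject instead.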