Hello Peter, others,

On 2012/09/21 4:31, Peter Saint-Andre wrote:
> Section 3.1 of draft-yoneya-precis-mappings states:
>
>    Width mapping will increase backward compatibility with Stringprep [RFC3454] and precis framework [I-D.ietf-precis-framework]. Because in a Stringprep profile which specifies Unicode normalization form KC (NFKC) for normalization method, fullwidth/halfwidth characters are mapped into its compatible form. If a precis framework profile specified NFKC (which is not recommended), width mapping might not be useful.
>
> Is backward compatibility the only reason to specify width mapping?
I very much don't think so. The full-width/half-width distinction that came to Unicode from East Asian encodings is an artifact from the age of fixed character cell widths. In high-quality typesetting, there are various conventions for when to use one or the other form (and the "half-width" form is set in proportional type). But in everyday use, it's mostly just happenstance that decides which form is used.
> If so, then it would be good to say that width mapping is appropriate for technologies that are migrating from stringprep (with NFKC) to precis, but not for technologies that never had a stringprep profile.
>
> If not, then it would be good to explain why it can be appropriate for a technology that uses precis (but never used stringprep) to specify width mapping. For example, perhaps width mapping helps to prevent violations of the "principle of least user surprise" because users in language communities with fullwidth and halfwidth characters might not be able to tell the difference between various widths on common devices.

"Might not be able to tell the difference" is the wrong expression. It's not too difficult to spot and identify the difference. It's much more a matter of not being motivated to spot the difference. An average American reader would be perfectly able to tell the difference between serif and sans-serif type. But when they read or write something, they (in most cases) just don't care. Why should they?
So (I wish it were not so, but) width mapping is indeed important for far more than just backwards compatibility.
NFKC is in stringprep simply because the design team wanted to have a width mapping for IDNA 2003, and specifying NFKC was the easiest way to get there. In practice, width mapping is way more important than many of the mappings in NFC. And it's definitely the most important part of NFKC.
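To make the relation between NFKC and width mapping concrete, here is a rough Python sketch. It assumes that "width mapping" means replacing exactly those characters whose Unicode compatibility decomposition is tagged `<wide>` or `<narrow>`; the function name `width_map` is made up for illustration and is not taken from the draft.

```python
import unicodedata

def width_map(text: str) -> str:
    """Map fullwidth/halfwidth characters to their base forms.

    Only characters whose compatibility decomposition is tagged <wide>
    or <narrow> are changed; everything else is left untouched.
    """
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0049" for U+FF29
        if decomp.startswith(("<wide>", "<narrow>")):
            code_points = decomp.split()[1:]
            out.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            out.append(ch)
    return "".join(out)

fullwidth_ibm = "\uFF29\uFF22\uFF2D"  # FULLWIDTH LATIN CAPITAL LETTERS I, B, M
halfwidth_ka = "\uFF76"               # HALFWIDTH KATAKANA LETTER KA
circled_one = "\u2460"                # CIRCLED DIGIT ONE (not a width variant)

# Both NFKC and the width-only mapping fold the width variants.
print(unicodedata.normalize("NFKC", fullwidth_ibm) == width_map(fullwidth_ibm))

# NFKC also rewrites other compatibility characters; width_map does not.
print(unicodedata.normalize("NFKC", circled_one) == width_map(circled_one))
```

The second comparison shows why a width-only mapping is the narrower tool: NFKC folds many other compatibility characters along the way, while the sketch above touches only the width variants.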
> In my opinion, some discussion of the security implications of width mapping (e.g., the possibility of more "false positives") would also be helpful.
I don't think there will be false positives. Saying that e.g. IBM and ＩＢＭ should be two different things is simply crazy. So it should be either width mapping, or the less basic forms (i.e. full-width for Latin, half-width for Kana, ...) should be prohibited.
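As a rough sketch of the "prohibit" alternative, the following Python snippet rejects any string containing a width variant. It assumes that a width variant is a character whose Unicode compatibility decomposition is tagged `<wide>` or `<narrow>`; the names `is_width_variant` and `reject_width_variants` are hypothetical and only for illustration.

```python
import unicodedata

def is_width_variant(ch: str) -> bool:
    """True if ch has a <wide> or <narrow> compatibility decomposition."""
    return unicodedata.decomposition(ch).startswith(("<wide>", "<narrow>"))

def reject_width_variants(text: str) -> str:
    """Return text unchanged, or raise if it contains a width variant."""
    for ch in text:
        if is_width_variant(ch):
            raise ValueError(f"width variant not allowed: U+{ord(ch):04X}")
    return text

ascii_ibm = "IBM"
fullwidth_ibm = "\uFF29\uFF22\uFF2D"  # fullwidth I, B, M

# Without mapping or prohibition, the two spellings are simply different strings.
print(ascii_ibm == fullwidth_ibm)

# With prohibition, only the basic form is accepted.
print(reject_width_variants(ascii_ibm))
try:
    reject_width_variants(fullwidth_ibm)
except ValueError as err:
    print(err)
```

Either approach leaves at most one acceptable spelling of a name like IBM, which is the point: the two forms never coexist as distinct identifiers.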
Regards, Martin.
> Thanks!
>
> Peter
>
> --
> Peter Saint-Andre
> https://stpeter.im/

_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
