Hello Peter,
On 2014/11/20 12:45, Peter Saint-Andre - &yet wrote:
On 11/17/14, 2:41 AM, "Martin J. Dürst" wrote:
Hello Peter,
On 2014/11/15 23:09, Peter Saint-Andre - &yet wrote:
On 11/14/14, 11:45 AM, Takahiro Nemoto wrote:
Hi, Peter-san
Thank you for your quick reply.
I think that this profile is similar to the nickname profile too.
However, even though NFKC has the benefit of increasing the
probability of comparison matching, the aggressive normalization may
create a problem. Using NFKC on some characters may cause changes on
the character’s form and byte length that would confuse users.
For this reason, I chose NFC and width mapping.
I don't see a compelling difference between "NFKC without width mapping"
and "NFC with width mapping", and I would strongly prefer to avoid
having two profiles here.
You got me confused here. Surely "NFKC without width mapping" and "NFC
with width mapping" differ quite a bit, namely for all compatibility
formatting tags (for a list, see e.g.
http://www.sw.it.aoyama.ac.jp/2013/pub/NormEquivTut/#kompabitibily_tags).
If we don't care about compatibility mappings at all, then just doing
NFC would be the simplest thing to do.
The reason why Takahiro is proposing "NFC with width mapping" is that
width differences are very well known to be ignored by users entering
text, especially here in Japan. On the other hand, other compatibility
formatting categories such as <sub>, <super>, <fraction>, <circle>,
<square> and so on are visually clearly different and therefore are
extremely rarely selected by users, if they can be input at all. For
other cases, such as <isolated>,..., these may exist in legacy
documents, but aren't produced by current input methods (read
keyboards,...).
So "NFC with width mapping" makes ample sense to me, whereas "NFKC
without width mapping" does not make any sense whatsoever in the first
place, and any profile (or whatever it's called) that used it would be
seriously flawed in my opinion and should be fixed.
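
To make the difference between the two options concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper names `compat_tag`, `width_map` and `nfc_with_width_mapping` are made up for this illustration and do not come from any PRECIS draft; the sketch assumes that "width mapping" means replacing exactly those characters whose decomposition is tagged <wide> or <narrow>, and then applying NFC.

```python
import unicodedata


def compat_tag(ch):
    """Return the compatibility formatting tag of ch (e.g. '<wide>'), or ''."""
    decomp = unicodedata.decomposition(ch)
    # Canonical decompositions carry no tag; compatibility ones start with '<'.
    return decomp.split()[0] if decomp.startswith("<") else ""


def width_map(text):
    """Map fullwidth/halfwidth characters to their decomposition mappings."""
    out = []
    for ch in text:
        if compat_tag(ch) in ("<wide>", "<narrow>"):
            code_points = unicodedata.decomposition(ch).split()[1:]
            out.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            out.append(ch)
    return "".join(out)


def nfc_with_width_mapping(text):
    """Hypothetical 'NFC with width mapping': width-map first, then NFC."""
    return unicodedata.normalize("NFC", width_map(text))


if __name__ == "__main__":
    sample = "\uFF21\uFF22\uFF23\u2460\u00B2"  # fullwidth ABC, circled 1, superscript 2
    print([compat_tag(ch) for ch in sample])
    print(nfc_with_width_mapping(sample))          # only the width variants change
    print(unicodedata.normalize("NFKC", sample))   # all compatibility tags are mapped
```

With this input, the width-mapped result keeps the circled and superscript characters, while NFKC folds them to plain digits as well.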
Nemo is proposing a profile for nicknames (with IoT or device use cases
in mind). We already have a profile for nicknames (with chatroom or chat
use cases in mind). I am suggesting that we combine the two because
profiles must not be multiplied beyond necessity.
The point of using NFKC in draft-ietf-precis-nickname was to more
aggressively find matches and therefore reduce the possibility of
confusion. This might not apply to petnames for devices, but it
certainly applies in chatrooms (where spoofing attempts are common).
Okay. I agree that having one profile with "NFC with width mapping" and
another with NFKC is not needed, because it doesn't hurt to map other
compatibility formatting categories such as <sub>, <super>, <fraction>,
<circle>, <square> and so on, even where such mapping isn't really needed.
What I still don't understand is why you wrote "NFKC without width
mapping". Width mapping is an integral part of NFKC, and I don't see any
reason why it would or should be excluded.
Also, we should hear again from Takahiro, because he says that he didn't
want to use full NFKC because "using NFKC on some characters may cause
changes on the character’s form and byte length that would confuse users".
I personally think that "byte length" doesn't have anything to do with
user confusion, but e.g. mapping "①" (circled 1) or "❶" (black circled
1) to "1" could well cause user confusion. On the other hand, mapping
"⑴" or "⒈" should be preferred because otherwise they'd be easily
confused with "(1)" and "1.". So the question is mainly whether stuff
like "①" or "❶" are in actual use.
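
For anyone who wants to check individual characters, the following small Python sketch (standard `unicodedata` module only) reports what NFKC does to a character and how its UTF-8 byte length changes. The function name `nfkc_report` is hypothetical and used only for this example; the same call can be used to check whether any other character, such as "❶", has a compatibility mapping at all.

```python
import unicodedata


def nfkc_report(ch):
    """Return (NFKC form, UTF-8 bytes before, UTF-8 bytes after) for ch."""
    mapped = unicodedata.normalize("NFKC", ch)
    return mapped, len(ch.encode("utf-8")), len(mapped.encode("utf-8"))


if __name__ == "__main__":
    # Circled 1, parenthesized 1, digit one full stop
    for ch in ("\u2460", "\u2474", "\u2488"):
        mapped, before, after = nfkc_report(ch)
        print(f"U+{ord(ch):04X} {ch!r} -> {mapped!r} ({before} -> {after} bytes)")
```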
Regards, Martin.