I understand what you are saying.  However, rather than bend the outcome to
fit technical difficulty or complexity, I prefer to spend whatever technical
effort it takes to produce the desired outcome.

⍴,'A̲'  and  ⍴,'ä'  should each produce exactly 1 regardless of the
underlying technicalities.
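
To make the distinction concrete, here is a minimal sketch in Python (not
APL, and not a proposal for how GNU APL should implement it).  It assumes the
underlined A is U+0041 followed by U+0332 (combining low line), and it
approximates "one visible character" by counting code points whose canonical
combining class is zero, which is cruder than full grapheme cluster
segmentation.  The helper name visible_length is hypothetical.

```python
import unicodedata


def visible_length(text):
    """Hypothetical helper: count code points that are not combining marks.

    This is only an approximation of "user-perceived characters"; it ignores
    cases such as emoji sequences that real grapheme segmentation handles.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# Assumed encodings of the strings under discussion.
a_umlaut_precomposed = "\u00e4"   # single code point
a_umlaut_combining = "a\u0308"    # a + combining diaeresis
a_underlined = "A\u0332"          # A + combining low line (no precomposed form)

samples = (a_umlaut_precomposed, a_umlaut_combining, a_underlined)

# Length in code points after NFC normalisation.
nfc_lengths = [len(unicodedata.normalize("NFC", s)) for s in samples]

# Length when combining marks are attached to their base character.
visible_lengths = [visible_length(s) for s in samples]

print(nfc_lengths)
print(visible_lengths)
```

Under these assumptions NFC alone gives 1 for both forms of ä but leaves the
underlined A at 2, while the combining-mark count gives 1 for all three.  The
sketch says nothing about what ⎕UCS should return for such a character.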



On Sun, Aug 16, 2015 at 5:49 AM, Elias Mårtenson <[email protected]> wrote:

> On 16 August 2015 at 18:35, Blake McBride <[email protected]> wrote:
>
>> My own opinion:
>>
>> 1.  Very strongly -  *⍴,'A̲'*  has got to equal 1 no matter what !!
>>
>
> You may think so, but if you want to be consistent on that, you would have
> to implement a completely new character set and abandon Unicode.
>
> I'll give you an example. What would you want ⍴,'ä' to be?
>
> Right now, that could return either 1 or 2 depending on whether the ä was
> using the precomposed character (U+00E4) or the combining mark (U+0061,
> U+0308). Visually, these are identical, and generally you'd expect them to
> compare equal.
>
> In Unicode, comparison of equivalent strings (made up of different
> characters) is done by performing a normalisation step prior to comparison.
> There are 4 different types of normalisation
> <http://unicode.org/reports/tr15/>, with different behaviour.
>
> Now, the ä character has a precomposed form in Unicode, and if you couple
> that with the NFC normalisation form, you'd get the above expression to
> return 1.
>
> *However,* ä works only because there is a
> precomposed form available. The combining underline does not have that. So
> if you want to suggest that the expression applied on an underlined
> character should return 1, you *also* have to provide a suggestion as to
> what ⎕UCS X should return. Remember that ⎕UCS has to satisfy (X=⎕UCS ⎕UCS
> X).
>
> Regards,
> Elias
>
