On Fri, Nov 2, 2018 at 10:27 PM, Ken Whistler wrote:
>
> On 11/2/2018 10:02 AM, Philippe Verdy via Unicode wrote:
>
> I was replying not about the notational representation of the DUCET data
> table (using [....] unnecessarily) but about the text of UTR#10 itself.
> Which remains highly
Possible new thread titles include:
Re: NFKD vs. NFLD (was Re: ...)
Re: Man's inhumanity to humane scripts (was Re: ...)
Re: Mayan and Egyptian hieroglyphs prove emoji pollute the character
encoding model (was Re: ...)
Re: Polynomials and the decline of western civilization (was Re: ...)
It should be noted that the algorithmic complexity of this NFLD
normalization ("legacy") is exactly the same as that of NFKD
("compatibility"). However, NFLD is versioned (as is NFLC), so NFLD can
take a second parameter: the maximum Unicode version, which can be used to
filter which decomposition
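The version parameter described above can be sketched in a few lines. NFLD is only a proposal in this thread, not an implemented normalization form, so the decomposition table and the version numbers below are invented purely for illustration:

```python
import unicodedata

# Hypothetical "legacy" decomposition table for the proposed NFLD.
# Each entry: character -> (replacement, Unicode version it was added in).
# These mappings and version numbers are illustrative, not real UCD data.
LEGACY_DECOMP = {
    '\u00AA': ('a', 1.1),  # FEMININE ORDINAL INDICATOR
    '\u00BA': ('o', 1.1),  # MASCULINE ORDINAL INDICATOR
}

def nfld(text: str, max_version: float = float('inf')) -> str:
    """Sketch of the proposed NFLD: canonical decomposition, then apply
    legacy mappings no newer than max_version, then re-decompose."""
    out = []
    for ch in unicodedata.normalize('NFD', text):
        mapped, since = LEGACY_DECOMP.get(ch, (ch, 0.0))
        out.append(mapped if since <= max_version else ch)
    return unicodedata.normalize('NFD', ''.join(out))
```

With `max_version` below the version in which a mapping was registered, the character passes through unchanged; otherwise the legacy mapping applies. The per-character work is the same as NFKD's table lookup, which is the complexity claim made above.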
When the topic being discussed no longer matches the thread title,
somebody should start a new thread with an appropriate thread title.
On Sat, Nov 3, 2018 at 11:36 PM, Philippe Verdy wrote:
>> - this new decomposition mapping file for NFLC and NFLD, where NFLC is
>> defined to be NFC(NFLD), has some stability requirements and it must be
>> guaranteed that NFD(NFLD) = NFD
>>
> Oops! fix my typo: it must be guaranteed that
>
> Unlike NFKC and NFKD, NFLC and NFLD would be an extensible superset
> based on MUTABLE character properties (these can also be "decomposition
> mappings", except that once a character is added to the new property file,
> it won't be removed, and can have some stability as well, where the
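The stability requirement being discussed has an analogue that already holds for the standard forms: NFKD output is fully canonically decomposed, so applying NFD to it is a no-op. A minimal check of that existing invariant (NFLD itself is only a proposal in this thread, so it cannot be tested directly):

```python
import unicodedata

# NFKD output is already in canonical decomposed order, so NFD(NFKD(s))
# must equal NFKD(s) for every string s. Spot-check a few cases:
samples = ['\uFB01re',   # "fire" with the fi ligature U+FB01
           '\u00C5',     # Å, which canonically decomposes to A + U+030A
           '\uAC01']     # the Hangul syllable 각
for s in samples:
    k = unicodedata.normalize('NFKD', s)
    assert unicodedata.normalize('NFD', k) == k
```

Any newly proposed decomposition form would presumably need to preserve the same kind of closure property to stay compatible with canonical equivalence.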
I can give other interesting examples of why the Unicode "character
encoding model" is the best option.
Just consider how the Hangul alphabet is (now) encoded: its consonant
letters are encoded "twice" (as leading and as trailing jamos) because they
carry semantic distinctions for efficient
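The double encoding of Hangul consonants is easy to see from the canonical decomposition of a precomposed syllable. For example, 각 (U+AC01) decomposes into three jamo, and the same consonant shape ㄱ appears under two distinct code points, one leading (CHOSEONG) and one trailing (JONGSEONG):

```python
import unicodedata

# Decompose the precomposed syllable 각 (U+AC01) into its conjoining jamo.
syllable = '\uAC01'
jamo = unicodedata.normalize('NFD', syllable)
for ch in jamo:
    print(f'U+{ord(ch):04X} {unicodedata.name(ch)}')
# U+1100 HANGUL CHOSEONG KIYEOK   (leading consonant)
# U+1161 HANGUL JUNGSEONG A       (vowel)
# U+11A8 HANGUL JONGSEONG KIYEOK  (trailing consonant)
```

U+1100 and U+11A8 render the same basic letter, but encoding them separately records where the consonant sits in the syllable, which is exactly the semantic distinction the message refers to.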
As an additional remark, I find that Unicode is slowly abandoning its
initial goal of encoding texts logically and semantically. This contrasts
with the initial ISO 10646, which aimed to produce a giant visual
encoding based only on code charts (without any character properties
except glyph
Likewise, the separate encoding of mathematical variants could have been
completely avoided (we know that this encoding is not sufficient, so much
so that even LaTeX renderers simply don't need or use it!).
We could have just encoded a single to use
after any base cluster, and the whole set was
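For context on the "mathematical variants" mentioned above: the Mathematical Alphanumeric Symbols block (U+1D400 and up) encodes styled copies of ordinary letters as separate characters, and NFKD folds them back to their plain counterparts, which is the sense in which the distinction is only a compatibility one:

```python
import unicodedata

# U+1D400 is a separately encoded bold variant of the plain letter A.
bold_A = '\U0001D400'
print(unicodedata.name(bold_A))               # MATHEMATICAL BOLD CAPITAL A
print(unicodedata.normalize('NFKD', bold_A))  # 'A' (compatibility-equivalent)
```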
On Fri, Nov 2, 2018 at 8:01 PM, Marcel Schneider via Unicode <
unicode@unicode.org> wrote:
> On 02/11/2018 17:45, Philippe Verdy via Unicode wrote:
> [quoted mail]
> >
> > Using variation selectors is only appropriate for these existing
> > (pre-encoded) superscript letters ª and º so that they