When a standard-conforming SQL-implementation concatenates two normalized
UCS strings, the result is required to be normalized (see Unicode Standard
Annex #15, Unicode Normalization Forms, section on Concatenation).
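
As a small illustration of why the requirement matters, the following Python
sketch shows two strings that are each in NFC while their plain
concatenation is not. It uses only the standard unicodedata module; the
helper is_nfc is a hypothetical name chosen for this example and has
nothing to do with any SQL-implementation.

```python
import unicodedata

def is_nfc(s: str) -> bool:
    """Return True if s is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", s) == s

left = "a"        # U+0061 LATIN SMALL LETTER A, in NFC
right = "\u0301"  # U+0301 COMBINING ACUTE ACCENT, in NFC when on its own

joined = left + right                         # raw concatenation
fixed = unicodedata.normalize("NFC", joined)  # renormalized result

print(is_nfc(left), is_nfc(right), is_nfc(joined))
print([hex(ord(c)) for c in fixed])
```

The raw concatenation has to be renormalized before it can be called NFC,
which is why the form of the result needs to be specified at all.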

My question is: if the two operands are in different normalization forms
(NFs), what should the NF of the result be?

In its present state, our proposal specifies the result by referring to the
following table:

Table A
=======
                |Operand 2
 Operand 1      |NFKD     NFKC      NFD   NFC
 ---------------+------------------------
    NFKD        |NFKD     NFKC      NFD   NFC
    NFKC        |NFKC     NFKC      NFD   NFC
    NFD         |NFD      NFD       NFD   NFC
    NFC         |NFC      NFC       NFC   NFC

It has been suggested that the following would be preferable:


Table B
=======
                |Operand 2
 Operand 1      |NFKD     NFKC      NFD   NFC
 ---------------+------------------------
    NFKD        |NFKD     NFKC      NFKD  NFKC
    NFKC        |NFKC     NFKC      NFKD  NFKC
    NFD         |NFKD     NFKD      NFD   NFC
    NFC         |NFKC     NFKC      NFC   NFC
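
For anyone who wants to experiment, the two tables can be sketched in Python
as plain lookups. This is only a rough model under stated assumptions: the
names FORMS, TABLE_A, TABLE_B, result_form and concat are illustrative and
appear nowhere in the proposal, and the sketch assumes that the
implementation simply renormalizes the raw concatenation to whichever form
the chosen table gives.

```python
import unicodedata

# Column order used by both tables (operand 2, left to right).
FORMS = ("NFKD", "NFKC", "NFD", "NFC")

# Keys are the NF of operand 1; each tuple is one table row.
TABLE_A = {
    "NFKD": ("NFKD", "NFKC", "NFD", "NFC"),
    "NFKC": ("NFKC", "NFKC", "NFD", "NFC"),
    "NFD":  ("NFD",  "NFD",  "NFD", "NFC"),
    "NFC":  ("NFC",  "NFC",  "NFC", "NFC"),
}
TABLE_B = {
    "NFKD": ("NFKD", "NFKC", "NFKD", "NFKC"),
    "NFKC": ("NFKC", "NFKC", "NFKD", "NFKC"),
    "NFD":  ("NFKD", "NFKD", "NFD",  "NFC"),
    "NFC":  ("NFKC", "NFKC", "NFC",  "NFC"),
}

def result_form(table, nf1, nf2):
    """Look up the NF of the result for operands in forms nf1 and nf2."""
    return table[nf1][FORMS.index(nf2)]

def concat(table, s1, nf1, s2, nf2):
    """Concatenate, then normalize to the form the table prescribes."""
    nf = result_form(table, nf1, nf2)
    return unicodedata.normalize(nf, s1 + s2), nf

# U+FB01 LATIN SMALL LIGATURE FI is unchanged by NFC but has a
# compatibility decomposition to "fi".
ligature = "\ufb01"
print(concat(TABLE_A, ligature, "NFC", "x", "NFKC"))
print(concat(TABLE_B, ligature, "NFC", "x", "NFKC"))
```

With an NFC operand containing the fi ligature and an NFKC operand, Table A
gives an NFC result that keeps the ligature, while Table B gives an NFKC
result in which it is replaced by the letters "fi". That is the kind of
practical difference the choice between the tables turns on.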

I have no confident opinion on this, and don't believe I could form one
without more practical experience than I'm ever likely to have. My very
tentative opinion, for what it's worth, is based on a preference for NFC
over NFKC.

Any offers?

Mike.


***********************************************************

J M Sykes              Email: [EMAIL PROTECTED]
97 Oakdale Drive
Heald Green
CHEADLE
Cheshire   SK8 3SN
UK                        Tel: (44) 161 437 5413

***********************************************************


