dc-rda  

Re: MARC and Unicode normalization forms

Simon Spero
Wed, 18 Mar 2009 11:20:54 -0700

Karen-
If the character isn't composable, NFC leaves it decomposed (in canonical order).
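
A quick way to see this, as a minimal sketch using Python's standard unicodedata module. The sample strings are illustrative picks, not taken from any MARC record: "e" plus a combining acute has a precomposed form, while the double inverted breve used for romanization ligatures and "q" with two dots do not.

```python
import unicodedata

def nfc(text):
    """Normalize text to NFC using the standard library."""
    return unicodedata.normalize("NFC", text)

# Composable: e + COMBINING ACUTE ACCENT becomes the single code point U+00E9.
composable = "e\u0301"
print([hex(ord(c)) for c in nfc(composable)])

# Not composable: t + COMBINING DOUBLE INVERTED BREVE + s has no precomposed
# form, so NFC leaves the combining sequence in place.
ligature = "t\u0361s"
print(nfc(ligature) == ligature)

# Not composable, and the marks are out of canonical order: dot above
# (combining class 230) precedes dot below (class 220). NFC keeps both marks
# but reorders them so the lower class comes first.
unordered = "q\u0307\u0323"
print([hex(ord(c)) for c in nfc(unordered)])
```

So NFC output can still contain combining marks; it only guarantees that whatever can be composed has been, and that what remains is canonically ordered.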

Simon


On Mar 18, 2009, at 1:56 PM, Karen Coyle <kco...@kcoyle.net> wrote:

Simon, yes, I kind of assumed that C is considered the default, especially given that Java language routines warn about it in their own default way. However, I heard from a Unicode org staff member (no longer there) that the technical committees were warming to D (decomposed) because of the greater flexibility. Now, the big question is: does RDA itself have any preference? I suspect not, so we're back to library practice, and the fact that library transliteration has created some characters that can only be re-created in Unicode using D. Personally, I think that it might be necessary for libraries to re-examine why they need characters that no one else uses, and whether those should inform the library data future... but I think that's a discussion that we'll have later as we get further into data development.

kc

Simon Spero wrote:
On Tue, Mar 17, 2009 at 7:38 PM, Karen Coyle <kco...@kcoyle.net> wrote:


Rebecca,

Thank you so much for looking into this. As I understand the Unicode normal forms, it's not that one of them is more "correct" than others, it's a matter of circumstance and your particular needs. It does look like it would be good for program developers to document what form their program outputs in an effort to "save the time of the user."
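
Where a program's output form is not documented, it can at least be checked. The following is a rough sketch in Python, assuming the data is already available as a decoded string; the helper name conforming_forms is made up for this illustration.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def conforming_forms(text):
    """Return the normalization forms that text already satisfies.

    A string satisfies a form when normalizing it to that form
    leaves it unchanged.
    """
    return [form for form in FORMS
            if unicodedata.normalize(form, text) == text]

print(conforming_forms("\u00e9"))    # precomposed e-acute
print(conforming_forms("e\u0301"))   # e + combining acute
print(conforming_forms("abc"))       # plain ASCII satisfies every form
```

Note that a passing check on one record says nothing certain about the producer in general, since plain ASCII text satisfies all four forms.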



The W3C recommends NFC: http://www.w3.org/TR/charmod-norm/#sec-ChoiceNFC

As does "HOWTO Avoid Being Called a Bozo When Producing XML": http://hsivonen.iki.fi/producing-xml/#nfc
(cited from: http://www.ibm.com/developerworks/xml/library/x-think35.html )
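
Following that recommendation when producing XML could look roughly like the sketch below, which uses Python's standard unicodedata and xml.etree.ElementTree modules. The element name "title" and the function title_element are placeholders for this example, not part of any MARC or RDA schema.

```python
import unicodedata
import xml.etree.ElementTree as ET

def title_element(text):
    """Serialize text as a <title> element, normalizing it to NFC first."""
    elem = ET.Element("title")
    elem.text = unicodedata.normalize("NFC", text)
    # encoding="unicode" returns a str with no XML declaration.
    return ET.tostring(elem, encoding="unicode")

# Input typed with a combining acute accent comes out precomposed.
print(title_element("Cafe\u0301"))
```

Normalizing at the point of output keeps the stored data untouched, which may matter if the source records are deliberately kept in decomposed form.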



--
-----------------------------------
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------