Hi,

The IETF precis WG (CC'ed) is discussing comparison of internationalized strings. Would you please check whether the precis framework (work in progress) is applicable?

<http://datatracker.ietf.org/doc/draft-ietf-precis-framework/>

Regards,
--
Yoshiro YONEYA <[email protected]>

On Wed, 25 Jan 2012 15:35:39 +0100 Alexander Mayrhofer <[email protected]> wrote:

> Based on the discussions during last week's Design Team call, I have
> drafted a new revision of the "Internationalization" text - see below. I
> still haven't found a proper headline for the "comparison & case
> sensitivity" section, suggestions welcome.
>
> Thanks to Ken for the valuable comments on my initial text - very
> useful.
>
> Generally, comments on the text are very much appreciated.
>
> -- snip --
>
> X. Internationalization
> ------------------------
>
> X.1. Character Encoding
> ----------------------------
>
> If elements contain non-ASCII characters, the UTF-8 [RFC3629] character
> encoding MUST be used. If the transport protocol allows for explicit
> declaration of the character encoding used, clients and servers SHOULD
> explicitly declare the content to be UTF-8, but MAY in turn assume that
> content that is received without such a declaration is using UTF-8.
>
> X.2. Language identification
> ---------------------------------
>
> Where messages in human-readable languages are used in the protocol,
> those messages SHOULD be tagged according to RFC 4646, and the transport
> protocol MUST support a respective mechanism to transmit such tags
> together with those human-readable messages. If absent, the language of
> the message defaults to "en" (English).
>
> X.3. Time zones
> -------------------
>
> Date and time attribute values MUST be represented in Coordinated
> Universal Time (UTC) using the Gregorian calendar. The actual
> date/time format used MUST be specified by the respective transport
> protocol.
>
> Y. Comparison Operations
> -----------------------------
>
> The protocol contains two types of elements. The first type, called
> "blobs" for the purpose of this document, contains data that is moved by
> means of that protocol, but is used neither to identify objects nor to
> refer between objects. The other type is called "identifier", and is
> used either to address an object or to refer between objects.
>
> The following elements are "identifiers":
>
> * rant
> * rar
> * dgName
> * rrName
> * rgName
> * egrRteName
>
> All other elements contained in the protocol fall under the "blobs"
> category. For "identifiers", the following rules apply:
>
> * Identifiers are case insensitive, and MUST use Unicode Normalization
>   Form C (NFC) in comparison and mapping operations.
>   * Characters allowed in identifiers MUST come exclusively from the
>     following character classes: TODO
> * The Registry MUST store such identifiers in their lower case
>   mapping, and MUST correctly map upper case representations of an
>   identifier to their lower case equivalent in comparison operations.
>   * Clients SHOULD lowercase such identifiers before sending them to the
>     Registry.
>
> If "blob" type elements are ever to be compared, byte-wise comparison
> MUST be used.
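
To make the two comparison rules in the quoted section Y concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not text from the draft or from the precis framework: the function names are hypothetical, "lower case mapping" is taken to mean Python's `str.lower()`, and the still-open "TODO" list of allowed character classes is not checked at all. A precis-based profile might prescribe a different mapping.

```python
import unicodedata


def canonical_identifier(value: str) -> str:
    """Hypothetical helper: NFC-normalize and lower-case an identifier.

    Assumption: the draft's "lower case mapping" is approximated by
    str.lower(); the allowed character classes (TODO in the draft)
    are not validated here.
    """
    lowered = unicodedata.normalize("NFC", value).lower()
    # Lower-casing can in rare cases produce a non-NFC string,
    # so normalize once more before storing or comparing.
    return unicodedata.normalize("NFC", lowered)


def identifiers_equal(a: str, b: str) -> bool:
    """Compare two identifiers case-insensitively in NFC."""
    return canonical_identifier(a) == canonical_identifier(b)


def blobs_equal(a: bytes, b: bytes) -> bool:
    """Compare two blob values byte-wise, with no normalization."""
    return a == b


if __name__ == "__main__":
    # Precomposed upper-case E-acute vs. "e" + combining acute accent.
    print(identifiers_equal("DGName-\u00c9", "dgname-e\u0301"))
    # The same two lower-case spellings differ as UTF-8 byte strings.
    print(blobs_equal("\u00e9".encode("utf-8"), "e\u0301".encode("utf-8")))
```

In this sketch a Registry would store `canonical_identifier(name)` and compare incoming identifiers through `identifiers_equal`, while blob values are never normalized, so two visually identical strings with different byte sequences compare as unequal.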
