At 16:32 06/08/16, Brian E Carpenter wrote:
>You have another choice with ï which is to write it in Unicode
>notation, U+00EF (LATIN SMALL LETTER I WITH DIAERESIS) if I am not
>mistaken.
>
>Neither I nor Google knows how Unicode represents Welsh double L.
>Maybe they don't?

The point is that it's just represented as two "ASCII" Ls,
LL in upper case, Ll in title case, and ll in lower case.
Still, sorting algorithms will do the right thing (if they are
told to sort for Welsh or traditional Spanish, and know how
to do that, that is).

Regards,    Martin.

>It's sorted between LY and M in the dictionary
>at http://www.aber.ac.uk/~gpcwww/gpc_pdfs.htm#DANGOSEIRIAU
>
>(BTW llywd and lloyd are different words, but as proper names
>they are confused in English spelling.)
>
>    Brian
>
>Lisa Dusseault wrote:
>> On Aug 15, 2006, at 3:29 PM, Spencer Dawkins wrote:
>>
>>>>
>>>> OK, this may be inadvertently funny - are "naive" and "Llwyd"
>>>> supposed to include a non-ascii character, or is that sentence
>>>> saying something else? (Welcome to the world of the RFC Editor)
>>>
>>>
>>> I would write naïve if I could. I assume people know that naive and
>>> naïve are both common spellings.
>>>
>>> Llwyd is thus spelt. The Welsh consider ll a separate letter and sort it
>>> between l and m.
>>>
>>> Spencer-reply: I guess my point was that this was extremely subtle
>>> for those of us who don't work with i18n comparison all day long.
>>> Perhaps 'Welsh names such as "Llwyd", when the Welsh consider "ll" a
>>> separate letter and sort it between "l" and "m"'? But you're going to have
>>> to figure out how to get "naïve" into an RFC... Perhaps your AD can step in
>>> front of this speeding bullet?
>>
>> Not sure there's a speeding bullet here. I happened to know what Arnt
>> meant by this stuff -- my own favourite example is sorting "Canada"
>> followed by "canal" then "cantor", *then* followed by "caña" and
>> finally "cañada" (last two with tildes above the n's if you can't see
>> them), if you're using Spanish.
>> Without the ability to put ñ or ï into Internet Drafts, we could
>> probably still make this more readable for people unfamiliar with i18n and
>> sorting issues.
>> Here's a stab:
>>    Use with natural language is often inappropriate: even though the
>>    collation apparently supports languages such as Italian and English,
>>    in real-world use it tends to mis-sort a number of types of string:
>>     * words such as "naive" (if spelled with an accent, the accented
>>       character could push the word to the wrong spot in a sorted list),
>>     * names such as "Llwyd" (which in Wales/Welsh or in Spanish should sort
>>       after single-L names like Lyza),
>>     * people and place names containing non-ASCII,
>>     * strings containing euro and pound sterling symbols, quotation
>>       marks, dashes/hyphens, etc.
>> Lisa
>>
>> ------------------------------------------------------------------------
>> _______________________________________________
>> Gen-art mailing list
>> [email protected]
>> https://www1.ietf.org/mailman/listinfo/gen-art


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:[EMAIL PROTECTED]
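
For readers who want to see the Welsh case from this thread in concrete form,
here is a minimal sketch of a sort key that treats "ll" as a single letter
placed between "l" and "m". It is an illustration only and not code from
anyone in the thread: the helper name welsh_key and the weighting scheme are
assumptions made up for this example, it handles only the "ll" digraph, and
it assumes every "ll" in a word really is the digraph. A real system would
more likely rely on a locale-aware collation library.

```python
# Illustrative sketch only: a hand-rolled sort key that treats the Welsh
# digraph "ll" as one letter ordered between "l" and "m".
# welsh_key is a hypothetical helper, not part of any library.

def welsh_key(word):
    """Return a list of integer weights for word, with "ll" after "l"."""
    s = word.lower()
    weights = []
    i = 0
    while i < len(s):
        if s.startswith("ll", i):
            # One weight for the whole digraph: just above "l", below "m".
            weights.append(ord("l") * 2 + 1)
            i += 2
        else:
            # Doubling each code point leaves a free slot after every
            # single character, which the digraph weight above uses.
            weights.append(ord(s[i]) * 2)
            i += 1
    return weights


names = ["Llwyd", "Morgan", "Lyza", "Lewis"]
print(sorted(names))                 # plain code point order
print(sorted(names, key=welsh_key))  # "ll" treated as its own letter
```

With plain code point order "Llwyd" lands before "Lyza"; with the tailored
key it lands after "Lyza" and before "Morgan", which is the ordering
described above.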
