Re: unicode Digest V12 #108

2011-07-06 Thread Asmus Freytag
On 7/3/2011 6:31 AM, Philippe Verdy wrote: Regarding the previous comment about the Danish aa, Sorry, most of that discussion missed the mark. Modern Danish can have AA for two reasons. Accidental occurrence, as in dataanalyse which is composed of two words which just happen to put two A

Re: unicode Digest V12 #108

2011-07-06 Thread Jukka K. Korpela
2011-07-06 9:25, Asmus Freytag wrote: Because accidental digraphs (in Danish) happen at word boundaries in a compound, the SHY is an elegant way to mark them. It may often be a practical trick, given the current repertoire of characters in Unicode and the way they are handled in different
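
As a rough illustration of the SHY idea discussed in this message, the sketch below places U+00AD SOFT HYPHEN at the compound boundary of the Danish word dataanalyse, so that the two letters a are no longer adjacent in the character sequence. The helper names `mark_boundary` and `strip_shy` are hypothetical and chosen only for this example; how any given renderer, sorter, or search tool treats the inserted character is not something this sketch can show.

```python
# Minimal sketch: mark an accidental "aa" at a compound boundary with U+00AD.
# Helper names are hypothetical; they are not part of any library.

SHY = "\u00ad"  # U+00AD SOFT HYPHEN


def mark_boundary(first: str, second: str) -> str:
    """Join two compound parts with a soft hyphen at the boundary."""
    return first + SHY + second


def strip_shy(text: str) -> str:
    """Remove all soft hyphens, recovering the plain spelling."""
    return text.replace(SHY, "")


marked = mark_boundary("data", "analyse")

# The two a's are now separated by U+00AD in the stored text.
print("aa" in marked)      # False
print(strip_shy(marked))   # dataanalyse
```

Stripping the soft hyphens again gives back the ordinary spelling, which is one reason the character is convenient as an invisible marker in plain text.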

Re: SHY, CGJ, etc.

2011-07-06 Thread Andreas Prilop
On Tue, 5 Jul 2011, Philippe Verdy wrote: Even MS Word 2010 continues to use U+001F as soft hyphen but does not recognize U+00AD as soft hyphen. I've not spoken at all about U+001F and not even tested it alt+0031 alt+0173 I have entered TRUE soft hyphens as U+00AD, in a plain-text

Re: unicode Digest V12 #108

2011-07-06 Thread Asmus Freytag
On 7/6/2011 12:16 AM, Jukka K. Korpela wrote: Allowing word division just to say that some characters do not constitute a digraph (or trigraph…) is not practical e.g. when the text has otherwise no word divisions, for one reason or another, or when the particular word division point is

Re: unicode Digest V12 #108

2011-07-06 Thread Ken Whistler
On 7/6/2011 11:18 AM, Asmus Freytag wrote: The Danes, over a decade ago, when they made the official recommendation to use SHY, appear to have come to the conclusion that AA can never occur accidentally, except at word division in compounds. Not really a safe conclusion. :)

Re: Questions about UAX #29

2011-07-06 Thread Mark Davis ☕
I wouldn't be averse to adding [:cn:][:cs:][:co:] to [:gcb:control:]. It would make it align more with the current definition of Grapheme_Base. As to how to handle private use characters, UAX #29 already allows overriding: This specification defines *default* mechanisms; more sophisticated

Re: What are the issues in having U+FB06 fold to U+FB05?

2011-07-06 Thread Mark Davis ☕
On Sat, Jun 11, 2011 at 08:04, Karl Williamson pub...@khwilliamson.com wrote: On 06/08/2011 03:33 PM, Mark Davis ☕ wrote: As to the first, it would seem reasonable. The simple folding is not covered by the following stability policies:

Re: What are the issues in having U+FB06 fold to U+FB05?

2011-07-06 Thread Ken Whistler
On 7/6/2011 1:40 PM, Mark Davis ☕ wrote: The other two are special cases; they casefold together because of the way that the full case mapping is computed. Their equivalence is normally captured by a canonical-equivalent folding. Because the simple
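
To make the two code points in this thread concrete, here is a small sketch using Python's built-in `str.casefold()`, which applies Unicode full case folding. Under full case folding both U+FB05 and U+FB06 map to the two-letter string "st", so they compare equal after folding. This only shows the full folding as exposed by Python; it says nothing about the simple folding the thread is debating.

```python
# Minimal sketch: full case folding of U+FB05 and U+FB06 in Python.
import unicodedata

LONG_S_T = "\ufb05"  # LATIN SMALL LIGATURE LONG S T
ST = "\ufb06"        # LATIN SMALL LIGATURE ST

for ch in (LONG_S_T, ST):
    # Print code point, character name, and the full case folding.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), repr(ch.casefold()))

# Both ligatures fold to the same two-character string.
same_after_folding = LONG_S_T.casefold() == ST.casefold()
print(same_after_folding)  # True
```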