Emoji map of Colorado

2020-04-01 Thread Karl Williamson via Unicode
https://www.reddit.com/r/Denver/comments/fsmn87/quarantine_boredom_my_emoji_map_of_colorado/?mc_cid=365e908e08_eid=0700c8706b

EGYPTIAN HIEROGLYPH MAN WITH A ROLL OF TOILET PAPER

2020-03-11 Thread Karl Williamson via Unicode
On 2/12/20 11:12 AM, Frédéric Grosshans via Unicode wrote: Dear Unicode list members (CC Michel Suignard), the Unicode proposal L2/20-068, “Revised draft for the encoding of an extended Egyptian Hieroglyphs repertoire,

Re: Call for feedback on UTS #18: Unicode Regular Expressions

2020-01-02 Thread Karl Williamson via Unicode
One thing I noticed in reviewing this is the removal of text about loose matching of the name property. But I didn't see an explanation for this removal. Please point me to the explanation, or tell me what it is. Specifically these lines were removed: As with other property values, names

Re: Missing UAX#31 tests?

2018-07-14 Thread Karl Williamson via Unicode
On 07/09/2018 02:11 PM, Karl Williamson via Unicode wrote: On 07/08/2018 03:21 AM, Mark Davis ☕️ wrote: I'm surprised that the tests for 11.0 passed for a 10.0 implementation, because the following should have triggered a difference for WB. Can you check on this particular case? ÷ 0020

Re: Missing UAX#31 tests?

2018-07-09 Thread Karl Williamson via Unicode
, and I should not expect a more complete series than you furnished. Mark // On Sun, Jul 8, 2018 at 6:52 AM, Karl Williamson via Unicode <unicode@unicode.org> wrote: I am working on upgrading from Unicode 10 to Unicode 11. I used all the new files. The algo

Re: Missing UAX#31 tests?

2018-07-08 Thread Karl Williamson via Unicode
On 07/08/2018 03:23 AM, Mark Davis ☕️ wrote: PS, although the title was "Missing UAX#31 tests?", I assumed you were talking about http://unicode.org/reports/tr29/ Yes, sorry.

Missing UAX#31 tests?

2018-07-07 Thread Karl Williamson via Unicode
I am working on upgrading from Unicode 10 to Unicode 11. I used all the new files. The algorithms for some of the boundaries, like GCB and WB, have changed so that some of the property values no longer have code points associated with them. I ran the tests furnished in 11.0 for these

Traditional and Simplified Han in UTS 39

2017-12-27 Thread Karl Williamson via Unicode
In UTS 39, it says that, optionally, "Mark Chinese strings as “mixed script” if they contain both simplified (S) and traditional (T) Chinese characters, using the Unihan data in the Unicode Character Database [UCD]. "The criterion can only be applied if the language of the string is known

Inconsistency between UTS 39 and 24

2017-12-21 Thread Karl Williamson via Unicode
In http://unicode.org/reports/tr39/#Mixed_Script_Detection it says, "For more information on the Script_Extensions property and Jpan, Kore, and Hanb, see UAX #24" In http://www.unicode.org/reports/tr24/, there certainly is more information on scx; however, none of the terms Jpan Kore nor Hanb

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-30 Thread Karl Williamson via Unicode
Under Best Practices, how many REPLACEMENT CHARACTERs should the sequence generate? 0, 1, 2, 3, 4 ? In practice, how many do parsers generate?

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-30 Thread Karl Williamson via Unicode
On 05/30/2017 02:30 PM, Doug Ewell via Unicode wrote: L2/17-168 says: "For UTF-8, recommend evaluating maximal subsequences based on the original structural definition of UTF-8, without ever restricting trail bytes to less than 80..BF. For example: is a single maximal subsequence because C0

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Karl Williamson via Unicode
On 05/26/2017 12:22 PM, Ken Whistler wrote: On 5/26/2017 10:28 AM, Karl Williamson via Unicode wrote: The link provided about the PRI doesn't lead to the comments. PRI #121 (August, 2008) pre-dated the practice of keeping all the feedback comments together with the PRI itself in a numbered

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-26 Thread Karl Williamson via Unicode
On 05/26/2017 04:28 AM, Martin J. Dürst wrote: It may be worth to think about whether the Unicode standard should mention implementations like yours. But there should be no doubt about the fact that the PRI and Unicode 5.2 (and the current version of Unicode) are clear about what they

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-24 Thread Karl Williamson via Unicode
On 05/24/2017 12:46 AM, Martin J. Dürst wrote: On 2017/05/24 05:57, Karl Williamson via Unicode wrote: On 05/23/2017 12:20 PM, Asmus Freytag (c) via Unicode wrote: Adding a "recommendation" this late in the game is just bad standards policy. Unless I misunderstand, you a

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-23 Thread Karl Williamson via Unicode
On 05/23/2017 12:20 PM, Asmus Freytag (c) via Unicode wrote: On 5/23/2017 10:45 AM, Markus Scherer wrote: On Tue, May 23, 2017 at 7:05 AM, Asmus Freytag via Unicode wrote: So, if the proposal for Unicode really was more of a "feels right"

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-15 Thread Karl Williamson via Unicode
On 05/15/2017 04:21 AM, Henri Sivonen via Unicode wrote: In reference to: http://www.unicode.org/L2/L2017/17168-utf-8-recommend.pdf I think Unicode should not adopt the proposed change. The proposal is to make ICU's spec violation conforming. I think there is both a technical and a political