RE: Emoji map of Colorado

2020-04-02 Thread Doug Ewell via Unicode
Karl Williamson shared: > https://www.reddit.com/r/Denver/comments/fsmn87/quarantine_boredom_my_emoji_map_of_colorado/?mc_cid=365e908e08_eid=0700c8706b It's too bad this was only made available as an image, not as text, which of course it is. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
0, there > were a lot of them not in Unicode. I'd forgotten that there were still about two dozen GB18030 characters mapped, more or less officially, into the Unicode PUA. But again, I changed the subject. Sorry about that. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
space considered appropriate for that in the meantime? -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
Adam Borowski wrote: > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF > or U+11000..U+7FFF (or possibly even up to 2³⁶ or 2⁴²), which has > its uses but is not well-formed Unicode. I'd be interested in your elaboration on what these uses are. -- Doug Ewell |

RE: Why do binary files contain text but text files don't contain binary?

2020-02-21 Thread Doug Ewell via Unicode
c way. If any of them has that structure interrupted by random bytes, the format has been broken and the file is corrupt. It is no different for text data, which is expected to contain certain bytes and is normally not expected to be interrupted by a series of ranëH‰UÀHƒÈÿH Does that help? --Doug Ewell | Thornton, CO, US | ewellic.org 

Re: Will TAGALOG LETTER RA, currently in the pipeline, be in the next version of Unicode?

2019-10-11 Thread Doug Ewell via Unicode
ry. Great, that means I'll be able to start using and exchanging them in March, when Unicode 12.1 is released, right? Uh, no: 1. What Ken said above. 2. Unicode 12.1 was always just about the Reiwa sign. 3. Even when 13 comes out, fonts won't be immediately and magically updated to include them.

Re: On the lack of a SQUARE TB glyph

2019-09-30 Thread Doug Ewell via Unicode
eard, anyway. Just out of curiosity, does anyone have actual examples of such applications? This might help demonstrate why the Reiwa sign doesn't set a precedent for TB et al. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: On the lack of a SQUARE TB glyph

2019-09-29 Thread Doug Ewell via Unicode
t could not simply use the two existing characters 令和 for Reiwa. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: On the lack of a SQUARE TB glyph

2019-09-26 Thread Doug Ewell via Unicode
te for a proposal. UTC won't add a character based on mailing-list chat, of course; they'll need a proper proposal. They'll also be the ones to decide what code point is assigned, although the proposal can politely suggest one. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: PUA (BMP) planned characters HTML tables

2019-08-21 Thread Doug Ewell via Unicode
em to be non-controversial. So to reiterate, these characters appear vanishingly unlikely to be atomically encoded, "yet" or ever, for good reason. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Numeric group separators and Bidi

2019-07-15 Thread Doug Ewell via Unicode
character sets (which I think is what Philippe is referring to as "proposed groupings"). -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Unicode "no-op" Character?

2019-07-04 Thread Doug Ewell via Unicode
that. While the Unix/C "end of string" convention was not the only case in which NUL was hijacked, it is certainly the best-known, and the greatest impediment to any current attempt to use it with its original meaning. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Unicode "no-op" Character?

2019-06-22 Thread Doug Ewell via Unicode
"display a .notdef glyph" is one of the popular choices. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Proposal to extend the U+1F4A9 Symbol

2019-06-01 Thread Doug Ewell via Unicode
Andrew West wrote: > oh, there is no Wikidata QID for phone dropped in the toilet. It's Wikidata, right? Pretty much anyone can create an item for pretty much anything, right? Problem solved. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Proposal to extend the U+1F4A9 Symbol

2019-06-01 Thread Doug Ewell via Unicode
f mind of the phone's owner, and there are none for the brand and model of phone and toilet. So the sequence above is clearly inadequate for people to express themselves. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Proposal to extend the U+1F4A9 Symbol

2019-06-01 Thread Doug Ewell via Unicode
rds the medical > profession. If physicians and other medical professionals are relying on emoji, in any way and at any time, to determine diagnosis and treatment, the state of health care is much worse than I thought. -- Doug Ewell | Thornton, CO, US | ewellic.org

Format A

2019-05-30 Thread Doug Ewell via Unicode
ething truly similar to and/or derivative of Format A.) Please reply on-list only if you think the list at large would benefit from your reply. I'm hoping some of the Unicode elders might have some insight here. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Symbols of colors used in Portugal for transport

2019-04-29 Thread Doug Ewell via Unicode
ule about "established, not ephemeral" would still apply. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Symbols of colors used in Portugal for transport

2019-04-29 Thread Doug Ewell via Unicode
ges; but neither of those is what Unicode is about. For non-emoji characters, there is usually still a requirement to show a certain level of actual usage. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Is ARMENIAN ABBREVIATION MARK (՟, U+055F) misclassified?

2019-04-26 Thread Doug Ewell via Unicode
ht, and the fonts are wrong. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Unicode CLDR 35 alpha available for testing

2019-02-28 Thread Doug Ewell via Unicode
announcements at unicode.org wrote: > The alpha version of Unicode CLDR 35 > <http://cldr.unicode.org/index/downloads/cldr-35> is available for > testing. No downloadable data files in the sense of released builds, correct? -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-02-10 Thread Doug Ewell via Unicode
that these do NOT nest (no stack...), just state changes for the > relevant PART of the "graphic" (i.e. style) state. So the approach in > that regard is quite different from the approach done in HTML/CSS. I don't regard that as either a bug or a feature. I certainly don't expect that every such mechanism has to nest, simply because SGML and its descendants are designed that way. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-02-08 Thread Doug Ewell via Unicode
convention will ever be. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Does "endian-ness" apply to UTF-8 characters that use multiple bytes?

2019-02-04 Thread Doug Ewell via Unicode
http://www.unicode.org/faq/utf_bom.html#utf8-2 -- Doug Ewell | Thornton, CO, US | ewellic.org
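For readers who do not follow the link, here is a minimal sketch of the point the FAQ makes, using only Python's built-in codecs. The helper name encodings_of is made up for this illustration. UTF-8 is defined as a sequence of single bytes, so there is only one possible byte order; UTF-16 code units are two bytes wide, so the same character serializes differently as little-endian and big-endian.

```python
def encodings_of(ch: str) -> dict:
    """Return the byte serialization of one character in three encoding schemes."""
    return {name: ch.encode(name) for name in ("utf-8", "utf-16-le", "utf-16-be")}


# U+20AC EURO SIGN takes three bytes in UTF-8 and one 16-bit unit in UTF-16.
for name, raw in encodings_of("\u20ac").items():
    print(f"{name:10} {raw.hex(' ')}")
```

The UTF-8 row is the same on every platform; only the two UTF-16 rows differ from each other, and only in the order of the two bytes.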

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Doug Ewell via Unicode
ed plane or the Plane That Shall Not Be Mentioned. "Deprecated" is a term of art in Unicode. -- Doug Ewell | Thornton, CO, US | ewellic.org

Use of tag characters in emoji sequences (was: Re: Proposal for BiDi in terminal emulators)

2019-02-02 Thread Doug Ewell via Unicode
s contain non-ASCII characters...) None of these are part of Andrew's mechanism. It's just b, i, u, and s. > is not standard Neither Andrew nor anyone else claimed it was. > (it's just an experiment in one font), It applies to any TrueType font, because the rendering engine can apply these four styles (in any combination) to any TrueType font. > and would in fact not be compatible with the existing specification > for tags. Good thing nobody claimed they were. > So only E+E0020 through U+E0040, and U+E005B through U+E007E remain > deprecated. Da capo. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Doug Ewell via Unicode
. Only U+E0001 LANGUAGE TAG and U+E007F CANCEL TAG are deprecated. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Encoding italic

2019-01-31 Thread Doug Ewell via Unicode
ould > be of interest for the generalised subject of this thread. I'm hoping we can continue to restrict this thread to plain text. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Proposal for BiDi in terminal emulators

2019-01-31 Thread Doug Ewell via Unicode
he use of Mathematical Alphanumeric Symbols: they look tempting and are (usually) easy to render, but among other things, they only cover [A-Za-zıȷΑ-Ωα-ω] and thus miss much of the text that may need to be italicized. -- Doug Ewell | Thornton, CO, US | ewellic.org
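A rough sketch of the coverage problem, assuming the simplest possible mapping: ASCII letters only, onto the Mathematical Italic range starting at U+1D434 (capitals) and U+1D44E (small letters). The function names are invented for this example, and the dotless and Greek letters mentioned above are left out to keep it short.

```python
def to_math_italic(text: str) -> str:
    """Map ASCII letters to Mathematical Italic; leave everything else unchanged."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        elif ch == "h":
            # U+1D455 is reserved; italic small h already existed as U+210E.
            out.append("\u210e")
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        else:
            out.append(ch)  # no counterpart in this mapping
    return "".join(out)


def unstyled_letters(text: str) -> list:
    """Return the letters in text that this mapping could not italicize."""
    return [ch for ch in text if ch.isalpha() and to_math_italic(ch) == ch]


print(unstyled_letters("caf\u00e9"))          # the accented letter is left over
print(unstyled_letters("\u0434\u0430"))       # Cyrillic is not covered at all
```

Any letter that comes back from unstyled_letters would appear upright in the middle of otherwise "italic" text, which is the gap described above.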

Re: Encoding italic

2019-01-30 Thread Doug Ewell via Unicode
s not, why we should not simply refer to the more familiar 6429? -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-01-30 Thread Doug Ewell via Unicode
why SCSU had to be banished to the hut, right around the same time the Plane 14 language tags were deprecated. In SCSU, astral characters can be 1 byte just like BMP characters. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-01-29 Thread Doug Ewell via Unicode
Unicode. But these are NOT the same idea, and the fact that they both use Plane 14 tag characters doesn't make them so. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-01-29 Thread Doug Ewell via Unicode
cause it's abuse, doesn't cover other writing systems, etc. I'd be happy to work with Kent to campaign for ISO 6429 as "the" well-established standard for applying simple styling to plain text, but we would have to acknowledge the significant challenges. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-01-29 Thread Doug Ewell via Unicode
at was? I don't have time to go through scores of messages, and there is no search facility. I can't speak for Andrew, but I strongly suspect he implemented this as a proof of concept, not to declare himself the Maker of Standards. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Unihan variants information

2019-01-28 Thread Doug Ewell via Unicode
/www.unicode.org/copyright.html, and 2. send a quick note to the Consortium officers asking whether they are OK with this use of the Unicode name. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic (was: A last missing link)

2019-01-21 Thread Doug Ewell via Unicode
ke this to implement features like background and foreground colors, inverse video, and more, which are not available as plain-text characters. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Encoding italic

2019-01-21 Thread Doug Ewell via Unicode
, definitely not to this list, since the digest will clobber such characters (quod vide). -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Where is my character @?

2019-01-09 Thread Doug Ewell via Unicode
pread within the Koalib community. (And no, this does not constitute "disdain for the small community.") -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: The encoding of the Welsh flag

2018-11-22 Thread Doug Ewell via Unicode
either the Ulster Banner or St. Patrick's Saltire. This situation is described, and explicitly so for the UM flags, in Annex B of UTS #51 under "Caveats." -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: The encoding of the Welsh flag

2018-11-22 Thread Doug Ewell via Unicode
lid or non-existent. I would certainly like to use the flag of Colorado, whose visual appearance is very much standardized, but the vicious circle of vendor support and UTS #51 categorization means no system will offer glyph support, and some systems may even reject it as invalid. --

Re: Encoding (was: Re: A sign/abbreviation for "magister")

2018-11-05 Thread Doug Ewell via Unicode
you want to propose something, you should consider writing a proposal. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: A sign/abbreviation for "magister"

2018-11-02 Thread Doug Ewell via Unicode
but it does not > argue for a new SEVEN WITH STROKE character or that I should use Ƶ > rather than Z when I write *Ƶanƶibar. http://www.unicode.org/L2/L2018/18323-open-four.pdf -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: A sign/abbreviation for "magister"

2018-11-02 Thread Doug Ewell via Unicode
Do we have any other evidence of this usage, besides a single handwritten postcard? -- Doug Ewell | Thornton, CO, US | ewellic.org

[getting OT] Re: A sign/abbreviation for "magister"

2018-10-30 Thread Doug Ewell via Unicode
U+FC63 (presentation forms). Arabic presentation forms are never an example of anything, and their use is full of caveats. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: A sign/abbreviation for "magister"

2018-10-30 Thread Doug Ewell via Unicode
Web page, it could easily. The article "English numerals" does include a bullet point: "The suffixes -th, -st, -nd and -rd are occasionally written superscript above the number itself." Note the word "occasionally." -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: A sign/abbreviation for "magister"

2018-10-29 Thread Doug Ewell via Unicode
ging U+02B3 or U+036C into the discussion just fuels the recurring demands for every Latin letter (and eventually those in other scripts) to be duplicated in subscript and superscript, à la L2/18-206. Back into my hole now. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Base64 encoding applied to different unicode texts always yields different base64 texts ... true or false?

2018-10-14 Thread Doug Ewell via Unicode
e64 encoding," he was asking about the basic definition of base64. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Base64 encoding applied to different unicode texts always yields different base64 texts ... true or false?

2018-10-12 Thread Doug Ewell via Unicode
ed on this a little bit in UTN #14, from the standpoint of trying to improve compression by normalizing the Unicode text first. -- Doug Ewell | Thornton, CO, US | ewellic.org
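As a small illustration of why normalizing first matters for the question in the subject line, the sketch below encodes two canonically equivalent spellings of the same text. It assumes UTF-8 as the byte serialization, and b64_of is a hypothetical helper, not anything from UTN #14.

```python
import base64
import unicodedata


def b64_of(text: str, form: str = None) -> str:
    """Base64 of the UTF-8 bytes of text, optionally normalizing first."""
    if form is not None:
        text = unicodedata.normalize(form, text)
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


composed = "\u00e9"        # e with acute as one code point
decomposed = "e\u0301"     # e followed by combining acute accent

print(b64_of(composed), b64_of(decomposed))                 # different output
print(b64_of(composed, "NFC"), b64_of(decomposed, "NFC"))   # identical output
```

Base64 itself is a one-to-one mapping on byte sequences; the differences come from the text-to-bytes step that precedes it.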

Re: EOL conventions (was: Re: UCD in XML or in CSV? (is: UCD

2018-09-08 Thread Doug Ewell via Unicode
by introducing LS and PS, but we know how that went.) 3. Unicode data files can be read and processed on any platform, but some careful choice of reading and processing tools might be advisable. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: UCD in XML or in CSV? (is: UCD in YAML)

2018-09-06 Thread Doug Ewell via Unicode
orrect pagination." which similarly assumes that "users of Microsoft Windows" have only Notepad at their disposal. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Unicode Digest, Vol 56, Issue 20

2018-08-30 Thread Doug Ewell via Unicode
these two alternatives. --Doug Ewell | Thornton, CO, US | ewellic.org Original message Message: 3 Date: Thu, 30 Aug 2018 02:27:33 +0200 (CEST) From: Marcel Schneider via Unicode Curiously, UnicodeData.txt is lacking the header line. That makes it unflexible. I never wondered why

Re: Private Use areas

2018-08-28 Thread Doug Ewell via Unicode
On August 23, 2011, Asmus Freytag wrote: > On 8/23/2011 7:22 AM, Doug Ewell wrote: >> Of all applications, a word processor or DTP application would want >> to know more about the properties of characters than just whether >> they are RTL. Line breaking, word breaking,

Re: Private Use areas

2018-08-21 Thread Doug Ewell via Unicode
thout step 1. I'd gladly participate in such a project. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Private Use areas (was: Re: Thoughts on working with the Emoji Subcommittee (was ...))

2018-08-20 Thread Doug Ewell via Unicode
. I have anecdotes, if anyone is interested off-list. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Unicode, emoji and Sundar Pichai

2018-07-13 Thread Doug Ewell via Unicode
anization. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Italic mu in squared Latin abbreviations?

2018-06-20 Thread Doug Ewell via Unicode
of character identity, but Arial Unicode MS has not been updated since 2000 and this problem is likely to remain unsolved. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Hyphenation Markup

2018-06-02 Thread Doug Ewell via Unicode
General Categories? For example, an 'Mc' followed by ZWSP followed by an 'Lo' displays like such-and-so. The code points would be best. Incidentally, does CLDR define the rendering of soft hyphen, or is one entirely at the mercy of the application? Why would this be a CLDR thing? -- Doug Ewell

Re: Why is TAMIL SIGN VIRAMA (pulli) not Alphabetic?

2018-05-29 Thread Doug Ewell via Unicode
y to Tamil, of course." In any case, Ken has answered the real underlying question: a process that checks whether each character in a sequence is "alphabetic" is inappropriate for determining whether the sequence constitutes a word. -- Doug Ewell | Thornton, CO, US | ewellic.org
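A quick way to see the problem, with one caveat stated up front: Python's str.isalpha() tests the general category (Lu, Ll, Lt, Lm, Lo), not the Unicode Alphabetic property, so it is only a stand-in for the kind of per-character check being discussed. The sample word and the helper name are chosen for this illustration.

```python
import unicodedata

# The Tamil word for "Tamil": TA, MA, VOWEL SIGN I, LLLA, VIRAMA (pulli).
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"


def per_char_report(text: str) -> list:
    """List (code point, general category, isalpha) for every character."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isalpha())
            for ch in text]


for row in per_char_report(word):
    print(row)

# A naive "is this a word?" test built on the per-character check:
print(all(ch.isalpha() for ch in word))
```

The final line prints False for an ordinary Tamil word, because the combining marks fail the check even though they are an integral part of the spelling.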

Re: Why is TAMIL SIGN VIRAMA (pulli) not Alphabetic?

2018-05-28 Thread Doug Ewell via Unicode
to Tamil, of course. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: L2/18-181

2018-05-17 Thread Doug Ewell via Unicode
uage has its own language-specific alphabet. It is the same for Bengali and Assamese, although the language-specific subsets are called abugidas instead of alphabets. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: L2/18-181

2018-05-17 Thread Doug Ewell via Unicode
I wrote: > ক্ is a conjunct consisting of three code points s/ক্/ক্ষ/ -- Doug Ewell | Thornton, CO, US | ewellic.org
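The corrected claim is easy to check with Python's unicodedata module; this is just a verification sketch, with the conjunct written as escapes so it survives any mail digest.

```python
import unicodedata

# The conjunct from the correction above: KA + VIRAMA + SSA.
conjunct = "\u0995\u09cd\u09b7"

print(len(conjunct))  # number of code points
for ch in conjunct:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```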

RE: L2/18-181

2018-05-17 Thread Doug Ewell via Unicode
byte assignment in ISCII. Disunifying Assamese from Bengali in Unicode would have a much greater impact. -- Doug Ewell | Thornton, CO, US | ewellic.org

L2/18-181

2018-05-16 Thread Doug Ewell via Unicode
, German, Spanish, and hundreds of other languages written in the Latin script, if the Assamese proposal is approved we can expect similar disunification of the Latin script into language-specific alphabets in the future. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode

2018-04-02 Thread Doug Ewell via Unicode
points?" I did appreciate the Acknowledgements section which lists the members of ABBA as a source of inspiration. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: base1024 encoding using Unicode emojis

2018-03-11 Thread Doug Ewell via Unicode
Oh, let him have a little fun. At least he's using emoji for something related to characters, instead of playing Mr. Potato Head. Incidentally, more prior art on large-base encoding: https://sites.google.com/site/markusicu/unicode/base16k -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Unicode Emoji 11.0 characters now ready for adoption!

2018-03-01 Thread Doug Ewell via Unicode
for improvement". I think that is a measurement of locale coverage -- whether the collation tables and translations of "a.m." and "p.m." and "a week ago Thursday" are correct and verified -- not character coverage. -- Doug Ewell | Thornton, CO, US | ewellic.org

Missing Kazakh Latin letters (was: Re: 0027, 02BC, 2019, or a new character?)

2018-02-27 Thread Doug Ewell via Unicode
" project. So either the UDHR translation is wildly incorrect, which seems unlikely, or the transliteration tables are incomplete. Wikipedia shows digraphs Iý ıý for Ю ю, and Ia ıa for Я я, and nothing for the others, though it is not clear where the digraphs came from, and of course the usual Wiki

Re: Unicode of Death 2.0

2018-02-17 Thread Doug Ewell via Unicode
out how a writing system used by 78 million people works. -- Doug Ewell | Thornton, CO, US | ewellic.org

+1 (was: Re: Why so much emoji nonsense?)

2018-02-15 Thread Doug Ewell via Unicode
specially popular in the IETF. It is not intended for situations that require explanation or details. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Keyboard layouts and CLDR

2018-01-30 Thread Doug Ewell via Unicode
tinKeyboard.zip -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Keyboard layouts and CLDR

2018-01-30 Thread Doug Ewell via Unicode
ative. That too. Good point. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-29 Thread Doug Ewell via Unicode
(b) it doesn't ship with Windows Of course that is not a "luxury." Knowing that third-party options are available, let alone free and easily installed ones, is the luxury. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-29 Thread Doug Ewell via Unicode
y, but you'd be surprised. > To like a particular layout does not mean to want to stick with it > when anything better comes up. Userʼs choice is always respected. See above regarding what users might like if only they had a choice. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Doug Ewell via Unicode
and syntax to better "support keyboard layouts from all major providers." Please point me to the part I missed. -- Doug Ewell | Thornton, CO, US | ewellic.org

Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-28 Thread Doug Ewell via Unicode
vendors have released. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: 0027, 02BC, 2019, or a new character?

2018-01-25 Thread Doug Ewell via Unicode
" will not save > anything ; but the regular Unicode apostrophe U+2019 would need... 3 > bytes after the 1-byte basic Latin letter from ASCII (so it is > worse !). I did not see any evidence that this was something they ever considered or cared about. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: 0027, 02BC, 2019, or a new character?

2018-01-25 Thread Doug Ewell via Unicode
Kazakhstan. Most of the participants in this "apostrophe" thread appeared to be from North America and Western Europe; I think you're the only one who expanded that. I wasn't referring to the geographical or cultural makeup of the list as a whole. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: 0027, 02BC, 2019, or a new character?

2018-01-24 Thread Doug Ewell via Unicode
t be talking about AutoCorrect on Microsoft Word. Just visit AutoCorrect Options and turn off that particular "replace as you type" option, and be done with it. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: 0027, 02BC, 2019, or a new character?

2018-01-23 Thread Doug Ewell via Unicode
vens and the earth in only 6 days was that there was no installed base to worry about. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: 0027, 02BC, 2019, or a new character?

2018-01-23 Thread Doug Ewell via Unicode
ndard keyboard," meaning an English-language one. Nazarbayev may ultimately be persuaded to embrace ASCII digraphs, which also meet this goal, but this talk about U+2019 and U+02BC will make exactly zero difference in Kazakh policy. -- Doug Ewell | Thornton, CO, US | ewellic.org

SignWriting in U+40000 block

2018-01-22 Thread Doug Ewell via Unicode
aegis) will ever use it). Nevertheless, I wonder if it would be appropriate for Unicode or WG2, in some capacity, to protest in some formal way against this recommendation to arrogate an unassigned plane instead of using the PUA, which is the correct place for unassigned characters. -- Doug Ewell

Re: Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-15 Thread Doug Ewell via Unicode
On January 5, Mark Davis wrote: Doug, I modified my working draft, at https://docs.google.com/document/d/1EuNjbs0XrBwqlvCJxra44o3EVrwdBJUWsPf8Ec1fWKY If that looks ok, I'll submit. Sorry for the delay. The text substitutions look fine. -- Doug Ewell | Thornton, CO, US | ewellic.org

Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-02 Thread Doug Ewell via Unicode
or all CLDR subdivisions, not just three, with the understanding that the vast majority would not be supported by vendor glyphs. It is unfortunate that, while the conciliatory name "recommended" was adopted for the three, the intent of "exclusively permitted" was retained. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Linearized tilde?

2017-12-30 Thread Doug Ewell via Unicode
ntion with no basis in history or usage. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Linearized tilde?

2017-12-30 Thread Doug Ewell via Unicode
s" that show up in proposals from time to time, but have never been used except by their inventors and to talk about them. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Split a UTF-8 multi-octet sequence such that it cannot be unambiguously restored?

2017-07-24 Thread Doug Ewell via Unicode
U+10, four bytes) for almost fourteen years now. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Split a UTF-8 multi-octet sequence such that it cannot be unambiguously restored?

2017-07-24 Thread Doug Ewell via Unicode
he character encoding of the data, and would not split multi-byte sequences in that encoding to begin with. -- Doug Ewell | Thornton, CO, US | ewellic.org
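For what it is worth, a process that does know the data is UTF-8 can avoid splitting a sequence with very little effort. The sketch below is one possible approach, not a description of any particular tool; safe_split_point is a made-up name, and the input is assumed to be well-formed UTF-8 with a non-negative limit.

```python
def safe_split_point(data: bytes, limit: int) -> int:
    """Return the largest index <= limit that is not inside a UTF-8 sequence."""
    if limit >= len(data):
        return len(data)
    i = limit
    # Continuation bytes look like 10xxxxxx. Back up until the byte at i
    # is a lead byte or ASCII, so the split lands on a character boundary.
    while i > 0 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return i


data = "a\u20acb".encode("utf-8")      # 61 E2 82 AC 62
cut = safe_split_point(data, 2)        # index 2 is in the middle of U+20AC
print(data[:cut], data[cut:])
```

Because lead bytes and trail bytes occupy disjoint ranges, the boundary can always be found by looking back at most three bytes.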

Re: First bonafide use (≠ mention) of emoji by an academic publisher?

2017-07-23 Thread Doug Ewell via Unicode
other character sets, or indexed in search engines. Font Awesome also includes some symbols that, we think, won't ever be Unicode emoji, such as the Android, Apple, Bluetooth, and Windows logos. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Unicode education in UK Schools

2017-07-07 Thread Doug Ewell via Unicode
esses, much more so than with other technical topics. This scares newbies and they walk away thinking every aspect of Unicode is complex and weird. -- Doug Ewell | Thornton, CO, US | ewellic.org

Unicode education in the professional world

2017-07-07 Thread Doug Ewell via Unicode
(*255)+131. and: > While UTF8 uses only 2 bytes to store data AL32UTF8 uses 2 or 4 bytes. Unicode and UTF-8 have been around a long time by now. The fact that there is still fake news like this out there, steering our less Unicode-aware colleagues waaay down the wrong path, is disconcerting.

Re: LATIN CAPITAL LETTER SHARP S officially recognized

2017-07-03 Thread Doug Ewell via Unicode
ithin the same font. I thought that was one of the main reasons we had Unicode: so we would no longer have to rely on particular fonts, or magic font behavior, to get character identities we expected and could interchange reliably. -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: 10.0 Code Charts

2017-06-22 Thread Doug Ewell via Unicode
Michael Bear wrote: > When are the code charts (http://www.unicode.org/charts/) going to be > updated for Unicode 10.0? They look fine to me. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Looking for 8-bit computer designers

2017-06-14 Thread Doug Ewell via Unicode
cipher characters, and we were looking for insight from the original folks who worked on them. We have no shortage of present-day expertise. -- Doug Ewell | Thornton, CO, US | ewellic.org

RE: Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-05 Thread Doug Ewell via Unicode
Martin J. Dürst wrote: > Assuming (conservatively) that it will take about a century to fill up > all 17 (well, actually 15, because two are private) planes, this would > give us another century. Current estimates seem to indicate that 800 years is closer to the mark. -- Doug Ewell |

Re: Encoding of character for new Japanese era name after Heisei

2017-06-02 Thread Doug Ewell via Unicode
graphs. A new square compatibility character, if necessary, can be encoded after the era name is chosen. It might be fast-tracked at that time, as the Euro sign was, but there is no emergency about this and no reason to invent any new encoding procedures or waive any existing ones. -- Doug Ewell | Thornton, CO, US | ewellic.org

Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-01 Thread Doug Ewell via Unicode
Richard Wordingham wrote: > even supporting 6-byte patterns just in case 20.1 bits eventually turn > out not to be enough, Oh, gosh, here we go with this. What will we do if 31 bits turn out not to be enough? -- Doug Ewell | Thornton, CO, US | ewellic.org

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-31 Thread Doug Ewell via Unicode
ho will determine that the judge doesn't have a conflict? An alternative would be to require that proposals, once received with whatever amount of research, are augmented with any necessary additional research *before* being approved. The identity or reputation of the requester should be irrelev

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-30 Thread Doug Ewell via Unicode
That's not at all the same as saying it was a valid sequence. That's saying decoders were allowed to be lenient with invalid sequences. We're supposed to be comfortable with standards language here. Do we really not understand this distinction? --Doug Ewell | Thornton, CO, US | ewellic.org

RE: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-30 Thread Doug Ewell via Unicode
" When was it ever true that C0 was a valid lead byte? And what does that have to do with (not) restricting trail bytes? -- Doug Ewell | Thornton, CO, US | ewellic.org
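For reference, a conformant decoder shows the current state of affairs. A two-byte sequence starting with C0 or C1 could only be an overlong encoding of an ASCII character, so well-formed two-byte sequences begin at C2. This sketch relies on Python's strict built-in UTF-8 decoder; is_valid_utf8 is a helper written for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as well-formed UTF-8 under strict rules."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


for seq in (b"\xc0\x80", b"\xc1\xbf", b"\xc2\x80"):
    print(seq.hex(" "), is_valid_utf8(seq))
```

This says nothing about how many U+FFFD characters a lenient decoder should emit for the rejected sequences, which is the separate question the proposal is about.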

Looking for 8-bit computer designers

2017-05-30 Thread Doug Ewell via Unicode
of anonymity and confidentiality will be honored. -- Doug Ewell | Thornton, CO, US | ewellic.org
