Re: comma ellipses

2019-10-07 Thread David Starner via Unicode
I still see the encoding of the original ellipsis as a mistake, probably for compatibility with some older standard that included it because the system wasn't smart enough to intelligently handle "..." as ellipsis. -- Kie ekzistas vivo, ekzistas espero.

Re: On the lack of a SQUARE TB glyph

2019-09-27 Thread David Starner via Unicode
On Thu, Sep 26, 2019 at 8:57 PM Fred Brennan via Unicode wrote: > The purpose of Unicode is plaintext encoding, is it not? The square TB form is > fundamentally no different than the square form of Reiwa, U+32FF ㋿, which was > added in a hurry. The difference is that SQUARE TB's necessity and use

Re: Unicode "no-op" Character?

2019-06-24 Thread David Starner via Unicode
On Sun, Jun 23, 2019 at 10:41 PM Shawn Steele via Unicode wrote: > Which leads us to the key. The desire is for a character that has no public > meaning, but has some sort of private meaning. In other words it has a > private use. Oddly enough, there is a group of characters intended for >

Re: Encoding italic

2019-02-09 Thread David Starner via Unicode
On Sat, Feb 9, 2019 at 3:59 AM Kent Karlsson via Unicode < unicode@unicode.org> wrote: > > Den 2019-02-08 21:53, skrev "Doug Ewell via Unicode" >: > > • Reverse on: ESC [7m > > • Reverse off: ESC [27m > > "Reverse" = "switch background and foreground colours". > > This is an (odd) colour thing.

Re: Encoding italic

2019-01-31 Thread David Starner via Unicode
On Thu, Jan 31, 2019 at 12:56 AM Tex wrote: > > David, > > "italics has never been considered part of plain text and has always been > considered outside of plain text. " > > Time to change the definition if that is what is holding you back. That's not a definition; that's a fact. Again, it's

Re: Encoding italic

2019-01-31 Thread David Starner via Unicode
On Wed, Jan 30, 2019 at 11:37 PM James Kass via Unicode wrote: > As Tex Texin observed, differences of opinion as to where we draw the > line between text and mark-up are somewhat ideological. If a compelling > case for handling italics at the plain-text level can be made, then the > fact that

Re: Encoding italic

2019-01-30 Thread David Starner via Unicode
On Sun, Jan 27, 2019 at 12:04 PM James Kass via Unicode wrote: > A new beta of BabelPad has been released which enables input, storing, > and display of italics, bold, strikethrough, and underline in plain-text Okay? Ed can do that too, along with nano and notepad. It's called HTML (TeX, Troff).

Re: Encoding italic

2019-01-25 Thread David Starner via Unicode
On Thu, Jan 24, 2019 at 11:16 PM Tex via Unicode wrote: > Twitter was offered as an example, not the only example just one of the most > ubiquitous. Many messaging apps and other apps would benefit from italics. > The argument is not based on adding italics to twitter. And again, color me

Re: Encoding italic

2019-01-23 Thread David Starner via Unicode
On Tue, Jan 22, 2019 at 4:18 PM Richard Wordingham via Unicode wrote: > On Mon, 21 Jan 2019 00:29:42 -0800 > David Starner via Unicode wrote: > > > The superscripts show a problem with multiple encoding; even if you > > think they should be Unicode superscripts, and t

Re: Encoding italic

2019-01-21 Thread David Starner via Unicode
On Sun, Jan 20, 2019 at 11:53 PM James Kass via Unicode wrote: > Even though /we/ know how to do > it and have software installed to help us do it. You're emailing from Gmail, which has support for italics in email. The world has, in general, solved this problem. > > How do you envision this

Re: Encoding italic (was: A last missing link)

2019-01-20 Thread David Starner via Unicode
On Sun, Jan 20, 2019 at 2:57 PM James Kass via Unicode wrote: > At which time it would only become a moot point for Twitter users. > There's also Facebook and other on-line groups. Plus scholars and > linguists. And interoperability. > How do you envision this working? In practice, English is

Re: Encoding italic (was: A last missing link)

2019-01-16 Thread David Starner via Unicode
On Wed, Jan 16, 2019 at 7:41 PM James Kass via Unicode wrote: > Computer text tradition aside, nobody seems to offer any legitimate > reason why such information isn't worthy of being preservable in > plain-text. Perhaps there isn't one. > Worthy of being preservable? Again, if you want rich

Re: Encoding italic

2019-01-16 Thread David Starner via Unicode
On Tue, Jan 15, 2019 at 10:19 PM James Kass via Unicode wrote: > Would there be any advantages to rich-text apps if italics were added to > Unicode? Is there any cost/benefit data? You've made an assertion > about complication to rich-text apps which I can neither confirm nor refute. It's

Re: Encoding italic

2019-01-15 Thread David Starner via Unicode
On Tue, Jan 15, 2019 at 5:17 PM James Kass via Unicode wrote: > Enabling plain-text doesn't make rich-text poor. > Adding italics to Unicode will complicate the implementation of all rich text applications that currently support italics. > People who regard plain-text with derision, disdain,

Re: Encoding italic (was: A last missing link)

2019-01-15 Thread David Starner via Unicode
On Tue, Jan 15, 2019 at 1:47 PM James Kass via Unicode wrote: > > Although there probably isn't really any concerted effort to "keep > plain-text mediocre", it can sometimes seem that way. > Dennis Ritchie allegedly replied to requests for new features in C with “If you want PL/I, you know

Re: A last missing link for interoperable representation

2019-01-14 Thread David Starner via Unicode
On Mon, Jan 14, 2019 at 5:58 PM Mark E. Shoulson via Unicode wrote: > *If* the VS is ignored by searches, as apparently it should be and some > have reported that it is, then VS-type solutions would NOT be a problem > when it comes to searches Who is using VS-type solutions? I could not enter

Re: A last missing link for interoperable representation

2019-01-14 Thread David Starner via Unicode
On Mon, Jan 14, 2019 at 2:09 AM Tex via Unicode wrote: > The arguments against italics seem to be: > > ·Unicode is plain text. Italics is rich text. > > ·We haven't had it until now, so we don't need it. > > ·There are many rich text solutions, such as html. > > ·

Re: A last missing link for interoperable representation

2019-01-13 Thread David Starner via Unicode
On Sun, Jan 13, 2019 at 7:03 PM Martin J. Dürst via Unicode wrote: > No, the casing idea isn't actually a dumb one. As Asmus has shown, one > of the best ways to understand what Unicode does with respect to text > variants is that style works on spans of characters (words,...), and is > rich

Re: A last missing link for interoperable representation

2019-01-13 Thread David Starner via Unicode
On Sat, Jan 12, 2019 at 8:26 PM James Kass via Unicode wrote: > It's subjective, really. It depends on how one views plain-text and > one's expectations for its future. Should plain-text be progressive, > regressive, or stagnant? Because those are really the only choices. > And opinions

Re: A last missing link for interoperable representation

2019-01-11 Thread David Starner via Unicode
Emoji were being encoded as characters, as codepoints in private use areas. That inherently called for a Unicode response. Bidirectional support is a headache; the amount of confusion and outright exploits from them is way higher than we like. The HTML support probably doesn't help that. However,

Re: A last missing link for interoperable representation

2019-01-09 Thread David Starner via Unicode
On Tue, Jan 8, 2019 at 11:58 PM James Kass via Unicode wrote: > > David Starner wrote, > > > Can some books be mostly handled with Unicode plain text > > and italics? Sure. HTML can handle them quite nicely. ... > > Yes, many books can be handled very well with

Re: A last missing link for interoperable representation

2019-01-08 Thread David Starner via Unicode
On Tue, Jan 8, 2019 at 2:03 AM James Kass via Unicode wrote: > The boundaries of plain text have advanced since the concept originated > and will probably continue to do so. Stress can currently be > represented in plain text with conventions used in lieu of existing > typographic practice.

Re: Suggestions?

2018-02-21 Thread David Starner via Unicode
On Wed, Feb 21, 2018 at 7:55 AM Jeb Eldridge via Unicode < unicode@unicode.org> wrote: > Where can I post suggestions and feedback for Unicode? > Here is as good as any place. There are specific places for a few specific things, but likely if you do have something that's likely to get changed,

Re: 0027, 02BC, 2019, or a new character?

2018-02-21 Thread David Starner via Unicode
On Wed, Feb 21, 2018 at 9:40 AM John W Kennedy via Unicode < unicode@unicode.org> wrote: > “Curmudgeonly” is a perfectly good English word attested back to 1590. > Curmudgeony may be identified as misspelled by Google, but it's got a bit of usage dating back a hundred years. Wiktionary's entry

Re: metric for block coverage

2018-02-18 Thread David Starner via Unicode
On Sun, Feb 18, 2018 at 3:42 AM Adam Borowski wrote: > I probably used a bad example: scripts like Cyrillic (not even Supplement) > include both essential letters and those which are historic only or used by > old folks in a language spoken by 1000, who use Russian (or

Re: metric for block coverage

2018-02-17 Thread David Starner via Unicode
On Sat, Feb 17, 2018 at 3:30 PM Adam Borowski via Unicode < unicode@unicode.org> wrote: > þ or ą count the same as LATIN TURNED CAPITAL LETTER SAMPI WITH HORNS AND TAIL WITH SMALL LETTER X WITH CARON. þ is in Latin-1, and ą is in Latin-A; the first is essential, even in its marginal characters,

Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 2:35 PM James Kass via Unicode <unicode@unicode.org> wrote: > David Starner wrote, > > > They were characters being interchanged as text > > in current use. > > They were in-line graphics being interchanged as though they were > text.

Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 11:16 AM James Kass via Unicode wrote: > That's one way of looking at it. Another way would be that the emoji > were definitely outside the scope of the Unicode project as encoding > them violated Unicode's initial encoding principles. > They were

Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 12:55 AM Erik Pedersen via Unicode < unicode@unicode.org> wrote: > Dear Unicode Digest list members, > > Emoji, in my opinion, are almost entirely outside the scope of the Unicode > project. Unlike text composed of the world’s traditional alphabetic, > syllabic, abugida or

Re: Keyboard layouts and CLDR (was: Re: 0027, 02BC, 2019, or a new character?)

2018-01-30 Thread David Starner via Unicode
On Tue, Jan 30, 2018 at 2:23 AM Alastair Houghton via Unicode < unicode@unicode.org> wrote: > This pattern exists across the board at the two companies; the Windows API > hasn’t changed all that much since Windows NT 4/95, whereas Apple has > basically thrown away all the work it did up to Mac OS

Re: 0027, 02BC, 2019, or a new character?

2018-01-24 Thread David Starner via Unicode
On Wed, Jan 24, 2018 at 6:31 PM Shriramana Sharma via Unicode < unicode@unicode.org> wrote: > > On 23-Jan-2018 10:03, "James Kass via Unicode" > wrote: > > (bottle, east, skier, crucial, cherry) > s'i's'a, s'yg'ys, s'an'g'ys'y, s'es'u's'i, s'i'i'e > sxixsxa, sxygxys,

Re: 0027, 02BC, 2019, or a new character?

2018-01-23 Thread David Starner via Unicode
On Tue, Jan 23, 2018 at 10:55 AM Doug Ewell via Unicode wrote: > I think it's so cute that some of us think we can advise Nazarbayev on > whether to use straight or curly apostrophes or accents or x's or > whatever. Like he would listen to a bunch of Western technocrats. >

Linearized tilde?

2017-12-29 Thread David Starner via Unicode
https://en.wikipedia.org/wiki/African_reference_alphabet says "The 1982 revision of the alphabet was made by Michael Mann and David Dalby, who had attended the Niamey conference. It has 60 letters; some are quite different from the 1978 version." and offers the linearized tilde, a tilde squeezed

Fwd: Unicode education in Schools

2017-08-24 Thread David Starner via Unicode
-- Forwarded message - From: David Starner <prosfil...@gmail.com> Date: Thu, Aug 24, 2017, 6:16 PM Subject: Re: Unicode education in Schools To: Richard Wordingham <richard.wording...@ntlworld.com> On Thu, Aug 24, 2017, 5:26 PM Richard Wordingham via Unico

Re: Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-04 Thread David Starner via Unicode
On Sun, Jun 4, 2017 at 9:13 PM Martin J. Dürst via Unicode < unicode@unicode.org> wrote: > Sorry to be late with this, but if 20.1 bits turn out to not be enough, > what about 21 bits? > > That would still limit UTF-8 to four bytes, but would almost double the > code space. Assuming
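The arithmetic behind the quoted "20.1 bits" and "almost double" can be checked with a small sketch. It assumes only the standard four-byte UTF-8 layout (3 payload bits in the lead byte, 6 in each of three continuation bytes); the constant and variable names are illustrative, not from the thread.

```python
import math

# Size of today's code space: U+0000..U+10FFFF.
CURRENT_CODE_SPACE = 0x10FFFF + 1

# A four-byte UTF-8 sequence has the layout
# 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx, i.e. 3 + 3*6 payload bits.
FOUR_BYTE_PAYLOAD_BITS = 3 + 3 * 6

# How many bits the current code space actually needs.
bits_needed_today = math.log2(CURRENT_CODE_SPACE)

# Size of a code space that used every four-byte payload bit.
full_21_bit_space = 1 << FOUR_BYTE_PAYLOAD_BITS
growth = full_21_bit_space / CURRENT_CODE_SPACE

print(f"{bits_needed_today:.2f} bits today, {FOUR_BYTE_PAYLOAD_BITS} bits available")
print(f"a full 21-bit space would be {growth:.2f}x the current one")
```

The sketch covers UTF-8 only. The current upper limit of U+10FFFF comes from what UTF-16 surrogate pairs can address, which this calculation does not consider.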

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-16 Thread David Starner via Unicode
On Tue, May 16, 2017 at 1:45 AM Alastair Houghton < alast...@alastairs-place.net> wrote: > That’s true anyway; imagine the database holds raw bytes, that just happen > to decode to U+FFFD. There might seem to be *two* names that both contain > U+FFFD in the same place. How do you distinguish

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-16 Thread David Starner via Unicode
On Tue, May 16, 2017 at 12:42 AM Alastair Houghton < alast...@alastairs-place.net> wrote: > If you’re about to mutter something about security, consider this: > security code *should* refuse to compare strings that contain U+FFFD (or at > least should never treat them as equal, even to

Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-15 Thread David Starner via Unicode
On Mon, May 15, 2017 at 8:41 AM Alastair Houghton via Unicode < unicode@unicode.org> wrote: > Yes, UTF-8 is more efficient for primarily ASCII text, but that is not the > case for other situations UTF-8 is clearly more efficient space-wise that includes more ASCII characters than characters
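The space comparison under discussion can be made concrete with a rough sketch. It relies only on the encoded sizes (ASCII takes 1 byte in UTF-8 and 2 in UTF-16; characters from U+0800 to U+FFFF take 3 and 2 respectively). The helper name `encoded_sizes` and the sample strings are illustrative assumptions, not taken from the thread.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Mostly ASCII with a little CJK: UTF-8 comes out smaller.
mostly_ascii = encoded_sizes("hello 日本")

# Equal numbers of ASCII and CJK characters: the two encodings tie.
balanced = encoded_sizes("abc日本語")

# Only CJK: UTF-16 comes out smaller.
only_cjk = encoded_sizes("日本語")

print(mostly_ascii, balanced, only_cjk)
```

Under these sizes, each ASCII character saves one byte in UTF-8 and each U+0800..U+FFFF character costs one extra, so the break-even point is an equal count of the two.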

Re: abstract characters, semantics, meaningful transformations ... Was: Tibetan Paluta

2017-05-01 Thread David Starner via Unicode
On Mon, May 1, 2017 at 7:26 AM Naena Guru via Unicode wrote: > This whole attempt to make digitizing Indic script some esoteric, > 'abstract', 'semantic representation' and so on seems to me is an attempt > to make Unicode the realm of the some super humans. > Unicode is

Re: Standaridized variation sequences for the Desert alphabet?

2017-04-06 Thread David Starner
On Thu, Apr 6, 2017 at 12:07 AM Martin J. Dürst wrote: > And while we currently have no evidence that Deseret had developed a > typographic tradition where some type styles would use one set of > ligatures, and other styles would use another set, it wouldn't be > possible

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread David Starner
On Mon, Mar 27, 2017 at 1:34 AM Martin J. Dürst wrote: > The qualification 'minor' is less important for an alphabet. In general, > the more established and well-known an alphabet is, the wider the > variations of glyph shapes that may be tolerated. > My problem with

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-26 Thread David Starner
On Sun, Mar 26, 2017 at 6:12 AM Michael Everson <ever...@evertype.com> wrote: > On 25 Mar 2017, at 22:15, David Starner <prosfil...@gmail.com> wrote: > > > > And I'd argue that a good theoretical model of the Latin script makes ä, > ꞛ and aͤ the same character,

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-25 Thread David Starner
On Fri, Mar 24, 2017 at 9:17 AM Michael Everson wrote: > And we *can* distinguish i and j in that Latin text, because we have > separate characters encoded for it. And we *have* encoded many other Latin > ligature-based letters and sigla of various kinds for the

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-23 Thread David Starner
On Thu, Mar 23, 2017 at 6:54 AM Michael Everson wrote: > Again: The source of 1855 EW and OI uses *different* letters than the 1859 > EW and OI do. This wasn’t accidental. It’s not hard to puzzle out or to > see. This isn’t random or even systematic natural development of >

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-23 Thread David Starner
On Wed, Mar 22, 2017 at 5:09 PM Michael Everson <ever...@evertype.com> wrote: > On 22 Mar 2017, at 21:39, David Starner <prosfil...@gmail.com> wrote: > > > > Does "Яussia" require a new Latin letter because the way R was written > has a different origin tha

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-22 Thread David Starner
On Wed, Mar 22, 2017 at 8:54 AM Michael Everson wrote: > If there is evidence outside of the Wikipedia for the 1859 letters, they > should be encoded as new letters, because their design shows them to be > ligatures of different base characters. That means they’re not glyph

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-21 Thread David Starner
On Tue, Mar 21, 2017 at 4:50 PM James Kass wrote: > If the user community needs to preserve the distinction in plain-text, > then variation selection is the right approach. > True. However, the user community is tiny, and I suspect that those variation selectors would

Re: The (Klingon) Empire Strikes Back

2016-11-06 Thread David Starner
On Fri, Nov 4, 2016 at 10:42 AM David Faulks wrote: > There is another issue of course, which I think could be a huge obstacle: > the Trademark/Copyright issue. Paramount claims copyright over the entire > Klingon language (presumably including the script). The issue has

Re: Emoji end goal

2016-10-12 Thread David Starner
On Wed, Oct 12, 2016 at 11:48 AM Rebecca T <637...@gmail.com> wrote: > Agreed. I think a good response to “that’d _double_ the codepoints, so we > should just add a ligature” is “if it would be such a burden to implement > that you don’t want to use space in the charts for what are,

Re: Wogb3 j3k3: Pre-Unicode substitutions for extended characters live on

2016-10-11 Thread David Starner
On Tue, Oct 11, 2016 at 8:55 AM wrote: > What is current thinking / practice wrt expanding virtual keyboards? > I'm just a user here, and that of the English and Esperanto keyboards on Android, but given swipe input and autocorrect both depending on knowing what language is

Re: Bit arithmetic on Unicode characters?

2016-10-09 Thread David Starner
On Sun, Oct 9, 2016 at 4:03 AM Mark Davis ☕️ wrote: > Essentially all of the game pieces that are in Unicode were added for > compatibility with existing character sets. I'm guessing that there are > hundreds to thousands of possible other symbols associated with games in >

Re: Noto unified font

2016-10-09 Thread David Starner
On Sat, Oct 8, 2016 at 11:07 PM James Kass wrote: > The word "free" when applied to any product means "free of charge". > Using the word "product" sort of biases your argument, does it not? "Freeware" appears to be a contraction of "free software". If so, the > two

Re: Default character encoding for each operating system?

2016-09-15 Thread David Starner
Linux is far less specific than Windows 10. In all recent versions of Debian GNU/Linux, UTF-8 is the most common character encoding, but using ISO-8859-x, or I believe even something like EUC-JP, is still supported. Other distributions may enforce UTF-8 or in rare cases ISO 8859-1 or even

Mammal emoji

2016-03-06 Thread David Starner
Seeing the presence of foxes on the upcoming emoji list, I remembered the Audubon Mammals (North America) app has silhouettes of mammals on the browse by shape tab. So let's see if they're covered: Armored Mammals (-): Okay, we're off to a bad start. The image here is sort of porcupine-ish, and

Re: Encoding/Use of pontial unpaired UTF-16 surrogate pair specifiers

2016-01-30 Thread David Starner
Obfuscate is right. It might conceivably be better than nothing, but at its best it will stop someone for an hour or so. Why not run it through a standard encryption protocol and if necessary use one of the options mentioned before to turn it into valid text? On Sat, Jan 30, 2016, 6:31 PM J

Re: Counting Codepoints

2015-10-13 Thread David Starner
On Mon, Oct 12, 2015 at 11:42 PM Richard Wordingham < richard.wording...@ntlworld.com> wrote: > On Mon, 12 Oct 2015 23:35:32 +0000 > David Starner <prosfil...@gmail.com> wrote: > > > Thus a Unicode string simply can't be in UTF-16 format > > internally with unpa

Re: Counting Codepoints

2015-10-12 Thread David Starner
Any system that exposes Unicode strings (not UTF-16 strings) cannot have two surrogates merge when two strings are appended. There's nothing in the Unicode standard that says that should happen for a string in an arbitrary format, and it's unreasonable behavior for a string. Thus a Unicode string
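The distinction between code-point strings and UTF-16 code-unit strings can be illustrated with a short sketch. It uses Python's `str`, which happens to be a sequence of code points that tolerates lone surrogates; that choice is an assumption made for illustration and is not a system named in the message.

```python
# Two strings, each holding a single unpaired surrogate code point.
high = "\ud83d"  # lone high (lead) surrogate
low = "\ude00"   # lone low (trail) surrogate

# Appending at the code-point level keeps two separate code points.
joined = high + low

# Round-tripping through UTF-16 code units is what merges them:
# the "surrogatepass" handler writes each surrogate as one 16-bit unit,
# and the decoder then reads the adjacent units as a surrogate pair.
merged = joined.encode("utf-16-le", "surrogatepass").decode("utf-16-le")

print(len(joined), len(merged))
print(merged == "\U0001F600")
```

In this sketch the merge happens only when the data is reinterpreted as UTF-16 code units, not when the two code-point strings are concatenated.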

Re: APL Under-bar Characters

2015-08-17 Thread David Starner
On Mon, Aug 17, 2015 at 8:03 PM alexwei...@alexweiner.com wrote: Pierpaolo, You make a very good observation. You are essentially asking the question that began the whole discussion. This is covered in depth in the gnuapl mailing list. You can go to their archive, and just search my name :)

Re: APL Under-bar Characters

2015-08-16 Thread David Starner
Original Message Subject: Re: APL Under-bar Characters From: David Starner prosfil...@gmail.com Date: Sun, August 16, 2015 5:59 pm To: alexwei...@alexweiner.com, Ken Whistler kenwhist...@att.net Cc: unicode@unicode.org The standard is set here. The Unicode Consortium has

Re: APL Under-bar Characters

2015-08-16 Thread David Starner
The standard is set here. The Unicode Consortium has declared that it won't encode precomposed characters that can be created from characters in the standard, because that would be destabilizing and potentially introduce security holes in programs depending on Unicode. If you want, we can have a

Re: Security concerns: OGHAM SPACE MARK

2015-07-21 Thread David Starner
On Tue, Jul 21, 2015 at 2:55 PM Dreiheller, Albrecht albrecht.dreihel...@siemens.com wrote: My concern is not about the Ogham space, but about the free usage of non-ASCII in programming languages in general. Just imagine, when you decide to open a door for public traffic in a busy city with a

Re: Security concerns: OGHAM SPACE MARK

2015-07-21 Thread David Starner
On Tue, Jul 21, 2015 at 2:14 AM Dreiheller, Albrecht albrecht.dreihel...@siemens.com wrote: If the author really intends to deceive potential readers he will succeed. Possibly. Code is hard. But the Ogham space is not a real threat; it's easy to search for and obviously a deliberate attempt

Re: Security concerns: OGHAM SPACE MARK

2015-07-20 Thread David Starner
It's a confusable. There's a lot of them in Unicode. Auditing source code is hard, and if it's a concern, I suggest filtering out all non-ASCII characters. If you really think it's a concern, let's be specific; what do you mean this kind of behavior in bank transactions? If you're worried about

Re: Another take on the English apostrophe in Unicode

2015-06-05 Thread David Starner
On June 4, 2015, at 11:01 PM, Leo Broukhis l...@mailcom.com wrote: On Thu, Jun 4, 2015 at 9:25 PM, David Starner prosfil...@gmail.com wrote: Hyphens generally make multiple words into one anyway. There's not really multiple hyphens the way there's separate quotes and apostrophes. Generally

Re: Another take on the English apostrophe in Unicode

2015-06-05 Thread David Starner
On Fri, Jun 5, 2015 at 12:16 AM Leo Broukhis l...@mailcom.com wrote: I agree that conflating apostrophes and quotes is a source of problems, however, existence of the MODIFIER LETTER [same glyph as used for English contractions] in Unicode is a coincidence which should not have an effect on

Re: Another take on the English apostrophe in Unicode

2015-06-05 Thread David Starner
On Fri, Jun 5, 2015 at 2:43 AM QSJN 4 UKR qsjn4...@gmail.com wrote: The conflict is between linguists and programmers. No, it's not. Yes it is ambiguous! It is. It just is! Linguists say It is. We see that. We know that. Now you programmers find some way to deal with that so you can

Re: Another take on the English apostrophe in Unicode

2015-06-04 Thread David Starner
On Thu, Jun 4, 2015 at 2:38 PM Markus Scherer markus@gmail.com wrote: don’t is a contraction of two words, it is not one word. But as he points out, it's not a contraction of don and t; it is, at best, a contraction of do and n't. It's eliding, not punctuating. In the comments, he also

Re: Another take on the English apostrophe in Unicode

2015-06-04 Thread David Starner
, the word ack-ack isn't decomposable into words, or even morphemes, ack and ack. Leo On Thu, Jun 4, 2015 at 6:31 PM, David Starner prosfil...@gmail.com wrote: On Thu, Jun 4, 2015 at 2:38 PM Markus Scherer markus@gmail.com wrote: don’t is a contraction of two words, it is not one word

Re: Custom characters (was: Re: Private Use Area in Use)

2015-06-04 Thread David Starner
On Thu, Jun 4, 2015 at 6:09 AM John idou...@gmail.com wrote: Mostly just a matter of upgrading the character size. Which totally blows any concern with text size out of the water. Using 30 bytes to define certain very rare characters and 1 byte to define ASCII is way better than using 8 bytes

Re: Tag characters and in-line graphics (from Tag characters)

2015-06-03 Thread David Starner
On Wed, Jun 3, 2015 at 5:46 PM Chris idou...@gmail.com wrote: I personally think emoji should have one, single definitive representation for this exact reason. Then you want an image. I don't see what's hard about that. The community interested in tony the tiger can make decisions like

Re: Tag characters and in-line graphics (from Tag characters)

2015-06-03 Thread David Starner
Chris wrote: There is no way to compare 2 HTML elements and know they are talking about the same character That's because character identity is a hard problem. Is the emoji TIGER the same as TONY THE TIGER or as TONY THE TIGER GIVING THE VICTORY SIGN?

Re: Sencoten and Unicode policy (was: the usage of LATIN SMALL LETTER A WITH STROKE)

2015-06-01 Thread David Starner
On Mon, Jun 1, 2015 at 4:49 AM Janusz S. Bień jsb...@mimuw.edu.pl wrote: The document's author states: Although they could be made up of Letter + overlay diacritic, it is my understanding that the Unicode Consortium would prefer to create unique code points for these types of

Re: the usage of LATIN SMALL LETTER A WITH STROKE

2015-05-31 Thread David Starner
On Sun, May 31, 2015 at 11:09 AM Janusz S. Bien jsb...@mimuw.edu.pl wrote: The proposal makes me curious about past and present Unicode policy, e.g. would it be accepted if submitted now. Why wouldn't it? Unicode has, if anything, seemed to become more flexible about adding characters that

Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread David Starner
I would say that a system would conform with Unicode in having yellow heart red (in a non-monochrome font) as well as if it made it a cross. Either way it's violating character identity. I'd say that being monochromatic is now like being monospaced; it's suboptimal for a Unicode implementation,

Re: Origin of the digital encoding of accented characters for Esperanto

2015-03-23 Thread David Starner
On Mon, Mar 23, 2015 at 8:35 AM, William_J_G Overington wjgo_10...@btinternet.com wrote: It does not seem axiomatic that accented characters for Esperanto would necessarily be included in a digital encoding of the accented characters needed for the languages of Europe. Where does languages of

Re: CSUR Tonal

2015-03-15 Thread David Starner
On Sat, Mar 14, 2015 at 2:47 PM, Luke Dashjr l...@dashjr.org wrote: On Saturday, March 14, 2015 9:27:56 PM David Starner wrote: On Sat, Mar 14, 2015 at 9:17 AM, Luke Dashjr l...@dashjr.org wrote: Does Unicode give any relevance to non-visual rendering, or do TTS just need to settle

Re: CSUR Tonal

2015-03-14 Thread David Starner
On Sat, Mar 14, 2015 at 9:17 AM, Luke Dashjr l...@dashjr.org wrote: Does Unicode give any relevance to non-visual rendering, or do TTS just need to settle for environmental hints (eg, the user explicitly telling it tonal numbers are in use)? How do you tell a chemist from the general populace?

Re: The rapid … erosion of definition ability

2014-11-17 Thread David Starner
On Mon, Nov 17, 2014 at 3:10 AM, Andreas Stötzner a...@signographie.de wrote: Am 17.11.2014 um 11:46 schrieb Leonardo Boiko: Sign is too general in its generality it is just perfect. The sets of signs in question are most general, covering much more matters, objects and topics than the

Re: Terms for rotations

2014-11-10 Thread David Starner
On Mon, Nov 10, 2014 at 4:12 PM, Whistler, Ken ken.whist...@sap.com wrote: Seriously, I think that Ilya's point is well-taken. Although in English there is a strong association of the phrase turn to the right with clockwise motion for control devices which rotate, if you take the phrase out of

RE: Terms for rotations

2014-11-07 Thread David Starner
I don't think sign writing is the best analogy. Fairy chess starts with the basic set of six chess symbols, like a lot of linguists start with the 26 basic Latin characters. Likewise, because fairy chess has a smaller printing budget than even linguistics, instead of creating new characters, old

Re: Current support for N'Ko

2014-09-29 Thread David Starner
On Fri, Sep 26, 2014 at 4:10 PM, Andrew Cunningham lang.supp...@gmail.com wrote: * NEVER try to copy and paste text from PDF. It is a preprint format and should be treated as such. I'd try and cut and paste from print if I could. People are going to cut and paste from anything if it saves them

Re: Unencoded cased scripts and unencoded titlecase letters

2014-07-03 Thread David Starner
On Wed, Jul 2, 2014 at 7:39 AM, Karl Williamson pub...@khwilliamson.com wrote: It's my sense that there are very few cased scripts in existence that are ever likely to be encoded by Unicode that haven't already been so-encoded. Michael Everson is working on making Cherokee to be a cased script,

Re: Corrigendum #9

2014-07-02 Thread David Starner
On Wed, Jul 2, 2014 at 8:02 AM, Karl Williamson pub...@khwilliamson.com wrote: In UTF-8, an example would be that Sun, I'm told, and for reasons I've forgotten or never knew, did not want raw NUL bytes to appear in text streams, so used the overlong sequence \xC0\x80 to represent them;
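The overlong NUL form mentioned in the quoted text can be sketched as follows. The helper name `decode_overlong_nul` is hypothetical. The approach rests on the fact that the byte 0xC0 never occurs in well-formed UTF-8, so the pair C0 80 can only be the overlong encoding of U+0000.

```python
def decode_overlong_nul(data: bytes) -> str:
    """Decode UTF-8 in which U+0000 is written as the overlong pair C0 80.

    Hypothetical helper for illustration: the overlong pair is rewritten
    to a real NUL byte first, then the result is decoded strictly.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


sample = b"a\xc0\x80b"

# A strict UTF-8 decoder rejects the overlong sequence outright.
try:
    sample.decode("utf-8")
    strict_decoder_accepts = True
except UnicodeDecodeError:
    strict_decoder_accepts = False

decoded = decode_overlong_nul(sample)
print(strict_decoder_accepts, repr(decoded))
```

Any other ill-formed bytes still raise `UnicodeDecodeError`, so the sketch loosens strict decoding for this one sequence only.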

Re: Characters that should be displayed?

2014-06-30 Thread David Starner
On Sun, Jun 29, 2014 at 10:00 PM, Jukka K. Korpela jkorp...@cs.tut.fi wrote: Applications that operate on plain text and use one fixed but configurable font are a much better example. If you need to use, say, a currency symbol that has not yet been added to Unicode but can be included in the

Re: Characters that should be displayed?

2014-06-30 Thread David Starner
On Mon, Jun 30, 2014 at 11:35 AM, Koji Ishii kojii...@gluesoft.co.jp wrote: I understand some here want to display them to help users identify broken characters; some consider that it doesn’t help users at all. I tend to agree with the latter, but either way, it’s about helping users to fix their

Re: Characters that should be displayed?

2014-06-30 Thread David Starner
On Mon, Jun 30, 2014 at 9:12 PM, Koji Ishii kojii...@gluesoft.co.jp wrote: Thanks for the reply. It’s very likely that the page contains images, borders, background, etc., so I can recognize all the text are missing. But neither of text missing nor text garbled suggests me how to fix it. I’d

Re: Characters that should be displayed?

2014-06-29 Thread David Starner
On Sun, Jun 29, 2014 at 2:02 PM, Jukka K. Korpela jkorp...@cs.tut.fi wrote: They might be seen as “not displayable by normal rendering”, so yes. On the practical side, although Private Use characters should not be used in public information interchange, they are increasingly popular in “icon

Re: Corrigendum #9

2014-06-12 Thread David Starner
On Thu, Jun 12, 2014 at 1:37 AM, Markus Scherer markus@gmail.com wrote: If your library makes an explicit promise to remove noncharacters, then it should continue to do so. There is rarely so much frustration as when a library or utility changes behavior and the justification is that
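
For readers unfamiliar with what "removing noncharacters" involves, here is a minimal Python sketch. It relies on the Unicode definition of the 66 noncharacters (U+FDD0..U+FDEF plus the last two code points of each of the 17 planes). The function names are hypothetical and do not correspond to any particular library's API.

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 code points Unicode designates as noncharacters."""
    if 0xFDD0 <= cp <= 0xFDEF:
        # The contiguous block of 32 noncharacters in the BMP.
        return True
    # U+nFFFE and U+nFFFF in every plane, n = 0x0 .. 0x10.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


def strip_noncharacters(text: str) -> str:
    """Return the text with every noncharacter removed (hypothetical helper)."""
    return "".join(ch for ch in text if not is_noncharacter(ord(ch)))
```

Whether a library should strip, replace, or pass these code points through is exactly the policy question debated in this thread; the sketch shows only the stripping behavior.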

Re: Swift

2014-06-05 Thread David Starner
On Thu, Jun 5, 2014 at 3:04 AM, J. Leslie Turriff jlturr...@centurylink.net wrote: What I find interesting is that (with the possible exception of Ada) I don't think that any of the commonly used languages allow for the use of Unicode characters for non-user-defined tokens (i.e.

Re: Swift

2014-06-05 Thread David Starner
On Thu, Jun 5, 2014 at 11:14 AM, J. Leslie Turriff jlturr...@centurylink.net wrote: All true; but do any languages allow for keywords (if, then, else, do, while, until, end, iterate, leave, call, return, exit,...) to be expressed in the programmer's locale? Both ALGOL 60 and ALGOL 68

Re: Swift

2014-06-05 Thread David Starner
On Thu, Jun 5, 2014 at 5:00 PM, Whistler, Ken ken.whist...@sap.com wrote: Any programming language project that derives from someone who describes himself as a “polyhistor”, which claims to be polymorphic and pasigraphic and multi-lingual and orthogonal and polysynthetic, which draws its

Re: Math input methods

2014-06-04 Thread David Starner
On Wed, Jun 4, 2014 at 6:00 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote: The change is logical in the sense that bold face is a more original notation and double-struck letters as characters imitate the imitation of boldface letters when writing by hand (with a pen or piece of chalk). On

Re: Corrigendum #9

2014-06-03 Thread David Starner
On Mon, Jun 2, 2014 at 4:33 PM, Richard Wordingham richard.wording...@ntlworld.com wrote: Much as I don't like their uninvited use, it is possible to pass them and other undesirables through most applications by a slight bit of recoding at the application's boundaries. Using 99 = (3 + 32 + 64)

Re: Corrigendum #9

2014-06-03 Thread David Starner
On Mon, Jun 2, 2014 at 11:55 PM, Mark Davis ☕️ m...@macchiato.com wrote: Thinking that a utility would never encounter them in input text was a pipe-dream. Thinking that a utility would never mangle them if encountered in input text was a pipe-dream. If a utility or library is so fragile that

Re: Corrigendum #9

2014-06-03 Thread David Starner
On Tue, Jun 3, 2014 at 12:31 AM, Richard Wordingham richard.wording...@ntlworld.com wrote: On Mon, 2 Jun 2014 23:21:38 -0700 David Starner prosfil...@gmail.com wrote: On Mon, Jun 2, 2014 at 4:33 PM, Richard Wordingham richard.wording...@ntlworld.com wrote: Using 99 = (3 + 32 + 64) PUA

Re: Corrigendum #9

2014-06-02 Thread David Starner
On Mon, Jun 2, 2014 at 8:48 AM, Markus Scherer markus@gmail.com wrote: Right, in principle. However, it should be ok to include noncharacters in CLDR data files for processing by CLDR implementations, and it should be possible to edit and diff and version-control and web-view those files

Re: Corrigendum #9

2014-06-02 Thread David Starner
On Mon, Jun 2, 2014 at 2:53 PM, Markus Scherer markus@gmail.com wrote: On Mon, Jun 2, 2014 at 1:32 PM, David Starner prosfil...@gmail.com wrote: I would especially discourage any web browser from handling these; they're noncharacters used for unknown purposes that are undisplayable

Re: Romanized Singhala got great reception in Sri Lanka

2014-03-16 Thread David Starner
On Sat, Mar 15, 2014 at 9:12 PM, Naena Guru naenag...@gmail.com wrote: I made a presentation demonstrating Dual-script Singhala at National Science Foundation of Sri Lanka. Most of the attendees were government employees and media representatives; a few private citizens came too. I don't know

Re: Romanized Singhala got great reception in Sri Lanka

2014-03-16 Thread David Starner
On Sun, Mar 16, 2014 at 5:12 AM, Jean-François Colson j...@colson.eu wrote: Le 16/03/14 08:15, David Starner a écrit : I don't know what the point was of sending this message. You claim that Unicode Sinhala was outside the subject matter of the presentation, so why would you post

Re: Unicode organization is still anti-Serbian and anti-Macedonian

2014-02-16 Thread David Starner
Every time you attack the only character set that supports various third-world African languages and various tiny North American languages and various small Indian languages and various Philippine scripts, as it's easy for you latin-oriented nations (USA, Germany...) to ignore the rest of the
