Re: Is ARMENIAN ABBREVIATION MARK (՟, U+055F) misclassified?

2019-04-27 Thread Richard Wordingham via Unicode
On Sat, 27 Apr 2019 00:08:52 +0000 James Kass via Unicode wrote: > On 2019-04-26 11:08 PM, Doug Ewell via Unicode wrote: > > This is a small percentage of the number of fonts that have all > > four of these Armenian glyphs, but show the abbreviation mark as a > > spacing glyph. It looks like

Fw: Latin Script Danda

2019-04-19 Thread Richard Wordingham via Unicode
Begin forwarded message: Date: Fri, 19 Apr 2019 11:30:32 +0100 From: Richard Wordingham To: Shriramana Sharma Subject: Re: Latin Script Danda On Fri, 19 Apr 2019 11:33:35 +0530 Shriramana Sharma via Unicode wrote: > We are using the pipe character as it is readily available in our >

Re: Script_extension Property of U+0310 Combining Candrabindu

2019-04-19 Thread Richard Wordingham via Unicode
On Fri, 19 Apr 2019 19:54:47 +0530 Shriramana Sharma wrote: > Or maybe the Grantha candrabindu can be used, since there is already > evidence for mixed usage of the scripts and nukta characters have been > encoded for Tamil usage in the Grantha block for this same reason > despite Grantha users

Re: Script_extension Property of U+0310 Combining Candrabindu

2019-04-19 Thread Richard Wordingham via Unicode
On Fri, 19 Apr 2019 11:36:16 +0530 Shriramana Sharma via Unicode wrote: > On 4/19/19, Richard Wordingham via Unicode > wrote: > > That's a fair point. My problem is that someone is claiming of > > U+0310 that "Somewhere in the Unicode specifications is a footnote >

Re: Script_extension Property of U+0310 Combining Candrabindu

2019-04-18 Thread Richard Wordingham via Unicode
On Fri, 19 Apr 2019 01:52:15 +0200 Marius Spix via Unicode wrote: > The Wikipedia page states, U+0310 is a general-purpose combining > diacritical mark. I would treat it similarly to U+0308 (COMBINING > DIAERESIS) or U+030C (COMBINING CARON), which are both characters with > multiple names and

Latin Script Danda

2019-04-18 Thread Richard Wordingham via Unicode
Which character should one use for a danda in the Latin script? I believed normal usage is to use U+0964 DEVANAGARI DANDA, but for some reason its script extension property does not include the Latin script. Richard.

Script_extension Property of U+0310 Combining Candrabindu

2019-04-18 Thread Richard Wordingham via Unicode
Is there any reason why U+0310 COMBINING CANDRABINDU has scx=Inherited rather than scx=Latn? The only language I've seen the character used in is Sanskrit, and the only script I've seen it in is the Latin script. Richard.

Re: USE Indic Syllabic Category

2019-02-24 Thread Richard Wordingham via Unicode
On Sat, 23 Feb 2019 14:46:27 +0800 梁海 Liang Hai via Unicode wrote: > USE wasn’t designed to allow such a syllable structure. Tai Tham’s > being supported by USE is kind of an oversight. And although it’s > appropriate to allow conjoined consonants to follow post-base-spacing > vowel signs,

Re: USE Indic Syllabic Category

2019-02-23 Thread Richard Wordingham via Unicode
On Sat, 23 Feb 2019 14:46:27 +0800 梁海 Liang Hai via Unicode wrote: > >>> once the USE acknowledges that subjoined consonants may follow > >>> vowels > >> > >> I expect to update the USE spec to address this soon. > > > > That seems welcome news. I still don't know what the problem with

Re: USE Indic Syllabic Category

2019-02-23 Thread Richard Wordingham via Unicode
On Sat, 23 Feb 2019 14:46:27 +0800 梁海 Liang Hai via Unicode wrote: > >>> once the USE acknowledges that subjoined consonants may follow > >>> vowels > >> > >> I expect to update the USE spec to address this soon. > > > > That seems welcome news. I still don't know what the problem with

Re: USE Indic Syllabic Category

2019-02-22 Thread Richard Wordingham via Unicode
On Fri, 22 Feb 2019 22:19:25 +0000 Andrew Glass wrote: > Thank you Richard for pointing out the issue with 0x1A7A > I've looked into this and found an error in our tooling that has > mapped this to Halant. Based on the spec this should be VAbv. I've > filed a bug. Thanks. Will the

Re: USE Indic Syllabic Category

2019-02-22 Thread Richard Wordingham via Unicode
On Fri, 22 Feb 2019 09:07:06 +0000 Richard Wordingham via Unicode wrote: > My best hypothesis (not thoroughly tested) is that Windows currently > has InSc=Consonant_Killer, but can I look this up as opposed to > effectively devising a test suite for USE on Office? That question's rathe

USE Indic Syllabic Category

2019-02-22 Thread Richard Wordingham via Unicode
Where can I find the InSc properties of characters as overridden for the USE of Windows? I am trying to work out why on MS Edge I am now getting dotted circles before U+1A7A TAI THAM SIGN RA HAAM in all of: ᩆᩢᨠ᩠ᨯᩥ᩺ rank /sak/ , ᨾᩉᩣᩉᩥᨦ᩠ᨣᩩ᩺ giant fennel /ma haː hiŋ/ and ᩆᩣᩈ᩠ᨲᩕ᩺ science /saːt/

Re: Bidi paragraph direction in terminal emulators

2019-02-12 Thread Richard Wordingham via Unicode
On Tue, 12 Feb 2019 13:50:00 +0100 Egmont Koblinger via Unicode wrote: > For > starter, I'd love to see a shell with interactive line editing (like > bash, zsh),... Bash already seems to handle proportional fonts quite well when run under Emacs 'M-x shell', which is more than can be said for

Re: Bidi paragraph direction in terminal emulators

2019-02-10 Thread Richard Wordingham via Unicode
On Sun, 10 Feb 2019 14:54:39 +0100 Philippe Verdy via Unicode wrote: > Le sam. 9 févr. 2019 à 20:55, Egmont Koblinger via Unicode < > unicode@unicode.org> a écrit : > > > Hi Asmus, > > > > > On quick reading this appears to be a strong argument why such > > > emulators > > will > > >

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sun, 10 Feb 2019 00:59:46 +0100 Egmont Koblinger via Unicode wrote: > Is there such a monospace font obeying wcwidth (that is: double wide > character for when a spacing mark is combined) for Devanagari? For CV, that would correspond to a Hindi typewriter, so the odds look good. The

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 9 Feb 2019 18:42:52 +0100 Egmont Koblinger via Unicode wrote: > The > problem that I don't know how to address is: What if harfbuzz tells us > that the overall width for rendering a particular grapheme cluster is > significantly different from its designated area (the number of >

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 9 Feb 2019 22:29:31 +0100 Adam Borowski via Unicode wrote: > On Sat, Feb 09, 2019 at 10:01:21PM +0200, Eli Zaretskii via Unicode > wrote: > > I don't know. Maybe it keeps a database of character combinations > > that need shaping, each one with the maximum width on display the > >

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 9 Feb 2019 22:31:37 +0100 Egmont Koblinger via Unicode wrote: > Let's take the Devanagari improvement of the other day. Until now, > there were plenty of dotted circles shown, and combining spacing marks > that should've been placed before the letter were placed after the > letter,

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 9 Feb 2019 13:02:55 -0800 "Asmus Freytag \(c\) via Unicode" wrote: > To force Hindi crosswords mode you need to segment the string into > syllables, > each having a variable number of characters, and then assign a single > display > position to them. Now some syllables are wider than

Re: Encoding italic

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 9 Feb 2019 04:52:30 -0800 David Starner via Unicode wrote: > Note that this is actually the only thing that stands out to me in > Unicode not supporting older character sets; in PETSCII (Commodore > 64), the high-bit character characters were the reverse (in this > sense) of the low-bit

Re: Bidi paragraph direction in terminal emulators

2019-02-09 Thread Richard Wordingham via Unicode
On Sat, 09 Feb 2019 09:42:09 +0200 Eli Zaretskii via Unicode wrote: > > Date: Sat, 9 Feb 2019 00:18:14 +0000 > > From: Richard Wordingham via Unicode > > > > > For character composition, you must have a shaping engine to talk > > > to, and the shap

Re: Encoding italic

2019-02-09 Thread Richard Wordingham via Unicode
On Fri, 8 Feb 2019 18:08:34 -0800 Asmus Freytag via Unicode wrote: > On 2/8/2019 5:42 PM, James Kass via Unicode wrote: > You are still making the assumption that selecting a different glyph > for the base character would automatically lead to the selection of a > different glyph for the

Re: Encoding italic

2019-02-08 Thread Richard Wordingham via Unicode
On Fri, 8 Feb 2019 14:26:28 -0800 Asmus Freytag via Unicode wrote: > On 2/8/2019 2:08 PM, Richard Wordingham via Unicode wrote: > On Fri, 8 Feb 2019 17:16:09 +0000 (GMT) > "wjgo_10...@btinternet.com via Unicode" wrote: > > Andrew West wrote: > > Just reminding

Re: Bidi paragraph direction in terminal emulators

2019-02-08 Thread Richard Wordingham via Unicode
On Sat, 09 Feb 2019 00:16:30 +0200 Eli Zaretskii via Unicode wrote: > > Date: Fri, 8 Feb 2019 21:55:58 +0000 > > From: Richard Wordingham via Unicode > > I will give a concrete application. If I want to make a font that > > is interpretable for Tai Tham and maximal

Re: Encoding italic

2019-02-08 Thread Richard Wordingham via Unicode
On Fri, 8 Feb 2019 22:29:57 +0100 Egmont Koblinger via Unicode wrote: > Some terminal emulators have made up some new SGR modes, e.g. ESC[4:3m > for curly underline. What to do with them? Where to draw the line what > to add to Unicode and what not to? Will Unicode possibly be a > bottleneck of

Re: Encoding italic

2019-02-08 Thread Richard Wordingham via Unicode
On Fri, 8 Feb 2019 17:16:09 +0000 (GMT) "wjgo_10...@btinternet.com via Unicode" wrote: > Andrew West wrote: >> Just reminding you that "The initial character in a variation >> sequence >> is never a nonspacing combining mark (gc=Mn) or a canonical >> decomposable character" (The Unicode

Re: Bidi paragraph direction in terminal emulators

2019-02-08 Thread Richard Wordingham via Unicode
On Fri, 08 Feb 2019 11:34:29 +0200 Eli Zaretskii via Unicode wrote: > > Date: Fri, 8 Feb 2019 06:40:44 +0000 > > From: Richard Wordingham via Unicode > > > > > I, for one, am not to the slightest bit interested in abandoning > > > the character grid

Columns in Terminal Emulators (was: Bidi paragraph direction in terminal emulators)

2019-02-08 Thread Richard Wordingham via Unicode
On Fri, 08 Feb 2019 15:45:15 +0200 Eli Zaretskii via Unicode wrote: > > From: Egmont Koblinger > > Date: Fri, 8 Feb 2019 13:30:42 +0100 > > Cc: Richard Wordingham , > > unicode Unicode Discussion > > > > Hi Eli, > > > > > Not sure why. There are terminal emulators out there which > >

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-07 Thread Richard Wordingham via Unicode
On Fri, 8 Feb 2019 00:38:24 +0100 Egmont Koblinger via Unicode wrote: > I, for one, am not to the slightest bit interested in abandoning the > character grid and allowing for proportional fonts. This would just > break a gazillion of things. The message I take from that and this thread in

Re: Bidi paragraph direction in terminal emulators BiDi in terminal emulators

2019-02-07 Thread Richard Wordingham via Unicode
On Thu, 07 Feb 2019 22:00:20 +0200 Eli Zaretskii via Unicode wrote: > > From: Egmont Koblinger > > Date: Thu, 7 Feb 2019 19:01:33 +0100 > > On Thu, Feb 7, 2019 at 6:53 PM Eli Zaretskii wrote: > > > No, it needs no interaction. Unless the regexp doesn't work for > > > you, which you should

Re: Bidi paragraph direction in terminal emulators BiDi in terminal emulators

2019-02-07 Thread Richard Wordingham via Unicode
On Thu, 7 Feb 2019 00:45:55 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > > Not necessarily. One could allow the first strong character in the > > prompt to determine the paragraph directions > > How does Emacs know what's a prompt? How can it tell it from the > previous and
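The first-strong heuristic suggested in this message can be sketched roughly as below. This is only an illustration in Python, not what Emacs or any terminal emulator actually does: the function name `first_strong_direction` and its `default` parameter are assumptions, and the sketch ignores the isolate-skipping refinement of UAX #9 rules P2 and P3.

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess a paragraph direction from the first strong character.

    Hypothetical helper: returns "ltr" or "rtl", or `default` when the
    text contains no strong (L, R or AL) character at all.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default


if __name__ == "__main__":
    print(first_strong_direction("$ ls -l"))
    print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd abc"))
```

A prompt made up only of digits and punctuation has no strong character, so the caller still has to supply a fallback, which is one reason the choice of default matters in the discussion.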

Re: Bidi paragraph direction in terminal emulators BiDi in terminal emulators

2019-02-06 Thread Richard Wordingham via Unicode
On Wed, 6 Feb 2019 22:01:59 +0100 Egmont Koblinger via Unicode wrote: > Hi Eli, > > (I'm getting lost where to reply, and how the subject gets mangled and > the thread split into different ones.) > > > I've thought about it a lot, experimented with Emacs's behavior, and > I've arrived at the

Re: Encoding italic

2019-02-05 Thread Richard Wordingham via Unicode
On Tue, 5 Feb 2019 16:01:41 +0000 Andrew West via Unicode wrote: > You would > have to first convert any text to be italicized to NFD, then apply > VS14 to each non-combining character. This alone would make a VS > solution unacceptable in my opinion. What is so unacceptable about having to do
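The two steps quoted above (normalise to NFD, then follow each non-combining character with VS14) could look roughly like this in Python. It is a sketch of the procedure under discussion, not an endorsed or standardised mechanism: no italic variation sequences are defined in Unicode, the function name `add_vs14` is hypothetical, and "non-combining" is assumed here to mean "general category does not start with M".

```python
import unicodedata

VS14 = "\uFE0D"  # VARIATION SELECTOR-14


def add_vs14(text):
    """Decompose to NFD, then append VS14 after every non-mark character.

    Hypothetical helper illustrating the quoted procedure only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    pieces = []
    for ch in decomposed:
        pieces.append(ch)
        # Marks (Mn, Mc, Me) are skipped, since a variation sequence
        # may not start with a combining mark.
        if not unicodedata.category(ch).startswith("M"):
            pieces.append(VS14)
    return "".join(pieces)


if __name__ == "__main__":
    result = add_vs14("pass\u00e9")
    print([f"U+{ord(c):04X}" for c in result])
```

Note that the selector lands between the base letter and its accent, which is why the precomposed é has to be decomposed first.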

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-04 Thread Richard Wordingham via Unicode
On Mon, 4 Feb 2019 22:27:39 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > > The concept appears to exist in the form of the fields of the > > fifth edition of ECMA-48. Have you digested this ambitious > > standard? > > To be honest: No, I haven't. And I have no idea what those

Re: Bidi paragraph direction in terminal emulators BiDi in terminal emulators)

2019-02-04 Thread Richard Wordingham via Unicode
On Tue, 5 Feb 2019 00:08:10 +0100 Egmont Koblinger via Unicode wrote: > Hi Eli, > > > Actually, UAX#9 defines "paragraph" as the chunk of text delimited > > by paragraph separator characters. This means characters whose bidi > > category is B, which includes Newline, the CR-LF pair on Windows,
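The UAX #9 notion of a paragraph quoted here (text delimited by characters of bidi class B) can be illustrated with a small Python sketch. The function name `split_bidi_paragraphs` is an assumption, and treating CR LF as a single separator is a simplification made for this example rather than something taken from the thread.

```python
import unicodedata


def split_bidi_paragraphs(text):
    """Split text at characters whose Bidi_Class is B (paragraph separator).

    Hypothetical helper: separators are dropped from the result, and a
    CR immediately followed by LF counts as one separator.
    """
    paragraphs = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))
            current = []
            # Swallow the LF of a CR LF pair so it does not yield an
            # empty paragraph.
            if ch == "\r" and i + 1 < len(text) and text[i + 1] == "\n":
                i += 1
        else:
            current.append(ch)
        i += 1
    if current:
        paragraphs.append("".join(current))
    return paragraphs


if __name__ == "__main__":
    print(split_bidi_paragraphs("abc\r\ndef\u2029ghi"))
```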

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-04 Thread Richard Wordingham via Unicode
On Mon, 04 Feb 2019 22:39:07 +0200 Eli Zaretskii via Unicode wrote: > > Date: Mon, 4 Feb 2019 19:45:13 +0000 > > From: Richard Wordingham via Unicode > > > > Yes. If one has a text composed of LTR and RTL paragraphs, one has > > to choose how far apart their

Re: Proposal for BiDi in terminal emulators

2019-02-04 Thread Richard Wordingham via Unicode
On Sun, 3 Feb 2019 20:50:03 +0000 Richard Wordingham via Unicode wrote: > On Sun, 03 Feb 2019 20:07:51 +0200 > Eli Zaretskii via Unicode wrote: > Which is why I try to remember to issue the emacs command 'M-x shell' > command and issue grep commands from the buffer cre

Re: Proposal for BiDi in terminal emulators

2019-02-04 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 18:03:37 +0200 Eli Zaretskii via Unicode wrote: > > Date: Sun, 3 Feb 2019 03:02:13 +0100 > > Cc: unicode@unicode.org > > From: Egmont Koblinger via Unicode > > > > > All I am saying is that your proposal should define what it means > > > by visual order. > > > > Are

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-04 Thread Richard Wordingham via Unicode
On Mon, 04 Feb 2019 18:53:22 +0200 Eli Zaretskii via Unicode wrote: > Date: Mon, 4 Feb 2019 01:19:21 +0000 > From: Richard Wordingham via Unicode >> If you look at it in Notepad, all >> lines will be LTR or all lines will be RTL. > That's because Notepad implements _o

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-03 Thread Richard Wordingham via Unicode
On Mon, 4 Feb 2019 00:36:23 +0100 Egmont Koblinger via Unicode wrote: > Now, back to terminals. > > The smallest possible viable definition of a "paragraph" in terminal > emulators is stuff between one newline and the next one. > > It would require a hell lot of work, redesigning

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-03 Thread Richard Wordingham via Unicode
On Mon, 4 Feb 2019 00:36:23 +0100 Egmont Koblinger via Unicode wrote: > I wish to store and deliver the following text, as it's laid out here > in logical order. That is, the order as the bytes appear in the text > file, as I typed them from the keyboard, is laid out here strictly > from left

Re: Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

2019-02-03 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 19:50:50 +0200 Eli Zaretskii via Unicode wrote: > Do you see how this is carefully formatted to avoid overflowing an > 80-column line of a typical terminal? Now suppose this is translated > into a RTL language, which causes the Copyright line to start with a > strong R

Re: Proposal for BiDi in terminal emulators

2019-02-03 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 18:13:06 +0200 Eli Zaretskii via Unicode wrote: > Actually, you pass the characters to be shaped in logical order, and > then display the produced grapheme clusters in visual order. Some early systems supporting computerised Hebrew script did pass characters in left-to-right

Re: Proposal for BiDi in terminal emulators

2019-02-03 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 20:07:51 +0200 Eli Zaretskii via Unicode wrote: > > Date: Sun, 3 Feb 2019 17:45:06 +0000 > > From: Richard Wordingham via Unicode > > > > > > So, what do you recommend I run grep from for Hebrew or Tai > > > > Lue? > >

Re: Proposal for BiDi in terminal emulators

2019-02-03 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 18:05:49 +0200 Eli Zaretskii via Unicode wrote: > > Date: Sat, 2 Feb 2019 21:49:40 +0000 > > From: Richard Wordingham via Unicode > > > > Eli will probably tell me I'm behind the times, but there are a few > > places where a Gnome-terminal is b

Re: Proposal for BiDi in terminal emulators

2019-02-03 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 18:14:53 +0200 Eli Zaretskii via Unicode wrote: > > Date: Sun, 3 Feb 2019 02:43:06 +0000 > > Cc: Kent Karlsson > > From: Richard Wordingham via Unicode > > > > So, what do you recommend I run grep from for Hebrew or Tai Lue? > > In

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Sun, 03 Feb 2019 02:01:18 +0100 Kent Karlsson via Unicode wrote: > Den 2019-02-02 16:12, skrev "Richard Wordingham via Unicode" > : > > Doesn't Jerusalem in biblical Hebrew sometimes have 3 marks below the > > lamedh? The depth then is the maximum depth, n

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Sat, 2 Feb 2019 23:02:10 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > On Sat, Feb 2, 2019 at 9:57 PM Richard Wordingham > wrote: > > > Seriously, you need to give a definition of 'visual order' for this > > context. Not everyone shares your chiralist view. > > When I

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Sat, 2 Feb 2019 12:54:16 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > > > Are they okay to be present in visual order (the terminal's > > > explicit mode, what we're discussing now) too? > > > > Where do you define the order for explicit mode? > > In explicit mode, the

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Sat, 2 Feb 2019 13:18:03 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > On Sat, Feb 2, 2019 at 12:43 PM Richard Wordingham via Unicode > wrote: > > > I'm not conversant with the details of terminal controls and I > > haven't used fields. However, w

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Sat, 02 Feb 2019 14:01:46 +0100 Kent Karlsson via Unicode wrote: > Den 2019-02-02 12:17, skrev "Egmont Koblinger" : > > Most terminal emulators handle non-spacing combining marks, it's a > > piece of cake. (Spacing marks are more problematic.) > Well, I guess you may need to put some
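The contrast drawn here between non-spacing and spacing marks shows up in a simple cell-width estimate. The Python sketch below is a rough stand-in for wcwidth-style logic and not the actual algorithm of any terminal emulator or C library; the names `cell_width` and `string_cells` and the width rules (0 for Mn/Me, 2 for East Asian Wide or Fullwidth, otherwise 1) are assumptions made for illustration.

```python
import unicodedata


def cell_width(ch):
    """Estimate how many terminal cells one character occupies.

    Hypothetical simplification: non-spacing and enclosing marks take
    no cell, East Asian wide/fullwidth characters take two, and
    everything else (including spacing marks, category Mc) takes one.
    """
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def string_cells(text):
    """Sum the estimated cell widths of all characters in text."""
    return sum(cell_width(ch) for ch in text)


if __name__ == "__main__":
    print(string_cells("e\u0301"))        # e + COMBINING ACUTE ACCENT
    print(string_cells("\u6f22\u5b57"))   # two CJK ideographs
    print(string_cells("\u0915\u093f"))   # DEVANAGARI KA + VOWEL SIGN I (Mc)
```

Under these rules the Devanagari spacing vowel sign gets a cell of its own even though it is drawn to the left of its consonant, which is the kind of mismatch between logical cells and rendered shape that the thread is wrestling with.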

Re: Proposal for BiDi in terminal emulators

2019-02-02 Thread Richard Wordingham via Unicode
On Fri, 1 Feb 2019 15:15:53 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > On Fri, Feb 1, 2019 at 12:19 AM Richard Wordingham via Unicode > wrote: > > > Cropped why? If the problem is the truncation of lines, one can > > simply store the next charac

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Richard Wordingham via Unicode
On Fri, 1 Feb 2019 15:15:53 +0100 Egmont Koblinger via Unicode wrote: > Hi Richard, > > On Fri, Feb 1, 2019 at 12:19 AM Richard Wordingham via Unicode > wrote: > > > Cropped why? If the problem is the truncation of lines, one can > > simply store the next charac

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Richard Wordingham via Unicode
On Sat, 02 Feb 2019 00:38:04 +0100 Kent Karlsson via Unicode wrote: > Den 2019-02-01 19:57, skrev "Richard Wordingham via Unicode" > : > "Monospaced font" is really a concept with modification. Even for > "plain old ASCII" there are two advance widths

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Richard Wordingham via Unicode
On Fri, 01 Feb 2019 15:18:13 -0700 Doug Ewell via Unicode wrote: > Richard Wordingham wrote: > > > Language tagging is already available in Unicode, via the tag > > characters in the deprecated plane. > > Plane 14 isn't deprecated -- that isn't a property of planes -- and > the tag

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Richard Wordingham via Unicode
On Fri, 1 Feb 2019 14:47:22 +0100 Egmont Koblinger via Unicode wrote: > Hi Ken, > > > [language tag] > > That is a complete non-starter for the Unicode Standard. > > Thanks for your input! > > (I hope it was clear that I just started throwing in random ideas, as > in a brainstorming

Re: Proposal for BiDi in terminal emulators

2019-02-01 Thread Richard Wordingham via Unicode
On Fri, 1 Feb 2019 13:02:45 +0200 Khaled Hosny via Unicode wrote: > On Thu, Jan 31, 2019 at 11:17:19PM +0000, Richard Wordingham via > Unicode wrote: > > On Thu, 31 Jan 2019 12:46:48 +0100 > > Egmont Koblinger wrote: > > > > No. How many cells do CJK ideograp

Re: Proposal for BiDi in terminal emulators

2019-01-31 Thread Richard Wordingham via Unicode
On Thu, 31 Jan 2019 12:46:48 +0100 Egmont Koblinger wrote: > Hi Richard, > > > Basic Arabic shaping, at the level of a typewriter, is > > straightforward enough to leave to a terminal emulator, as Eli has > > suggested. > > What is "basic" Arabic shaping exactly? Just using initial, medial

Re: Proposal for BiDi in terminal emulators

2019-01-31 Thread Richard Wordingham via Unicode
On Thu, 31 Jan 2019 08:28:41 +0000 Martin J. Dürst via Unicode wrote: > > Basic Arabic shaping, at the level of a typewriter, is > > straightforward enough to leave to a terminal emulator, as Eli has > > suggested. Lam-alif would be trickier - one cell or two? > > Same for other characters.

Re: Proposal for BiDi in terminal emulators

2019-01-31 Thread Richard Wordingham via Unicode
On Wed, 30 Jan 2019 20:35:36 -0500 "Mark E. Shoulson via Unicode" wrote: > On 1/30/19 8:58 AM, Egmont Koblinger via Unicode wrote: > > There's another side to the entire BiDi story, though. Simple > > utilities like "echo", "cat", "ls", "grep" and so on, line editing > > experience of your

Re: Proposal for BiDi in terminal emulators

2019-01-30 Thread Richard Wordingham via Unicode
On Wed, 30 Jan 2019 15:33:38 +0100 Frédéric Grosshans via Unicode wrote: > Le 30/01/2019 à 14:36, Egmont Koblinger via Unicode a écrit : > > - It doesn't do Arabic shaping. In my recommendation I'm arguing > > that in this mode, where shuffling the characters is the task of > > the text editor

Re: Ancient Greek apostrophe marking elision

2019-01-29 Thread Richard Wordingham via Unicode
On Mon, 28 Jan 2019 20:55:39 -0500 "Mark E. Shoulson via Unicode" wrote: > On 1/28/19 2:31 AM, Mark Davis ☕️ via Unicode wrote: > > > > But the question is how important those are in daily life. I'm not > > sure why the double-click selection behavior is so much more of a > > problem for

Re: Ancient Greek apostrophe marking elision

2019-01-29 Thread Richard Wordingham via Unicode
On Mon, 28 Jan 2019 21:10:19 -0500 "Mark E. Shoulson via Unicode" wrote: > On 1/28/19 3:58 PM, Richard Wordingham via Unicode wrote: > > Interestingly, bringing this word breaker into line with TUS in the > > UK may well be in breach of the Equality Act 2010. > >

Re: Ancient Greek apostrophe marking elision

2019-01-28 Thread Richard Wordingham via Unicode
On Mon, 28 Jan 2019 08:31:40 +0100 Mark Davis ☕️ via Unicode wrote: > But the question is how important those are in daily life. I'm not > sure why the double-click selection behavior is so much more of a > problem for Ancient Greek users than it is for the somewhat larger > community of English

Re: Ancient Greek apostrophe marking elision

2019-01-28 Thread Richard Wordingham via Unicode
On Mon, 28 Jan 2019 03:48:52 +0000 James Kass via Unicode wrote: > It’s been said that the text segmentation rules seem over-complicated > and are probably non-trivial to implement properly. I tried your > suggestion of WORD JOINER U+2060 after tau ( γένοιτ⁠’ ἄν ), but it > only added yet

Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 19:57:37 +0000 James Kass via Unicode wrote: > On 2019-01-27 7:09 PM, James Tauber via Unicode wrote: > > In my original post, I asked if a language-specific tailoring of > > the text segmentation algorithm was the solution but no one here > > has agreed so far. > If there

Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 14:09:31 -0500 James Tauber via Unicode wrote: > On Sun, Jan 27, 2019 at 1:22 PM Richard Wordingham via Unicode < > unicode@unicode.org> wrote: > > However LibreOffice treats "don't" as a single word for U+0027, > > U+02BC and U+2019,

Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 16:11:12 +0000 Michael Everson via Unicode wrote: > Yes, yes. It doesn’t matter. The discussion applies to both the two > quotation marks and the two modifier letters. Actually, there is a difference. As the ʻokina doesnʹt occur at the end of a word in Hawaiian, one only

Re: Ancient Greek apostrophe marking elision

2019-01-27 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 12:38:39 -0500 "Mark E. Shoulson via Unicode" wrote: > On 1/27/19 11:08 AM, Michael Everson via Unicode wrote: > > It is a letter. In “can’t” the apostrophe isn’t a letter. It’s a > > mark of elision. I can double-click on the three words in this > > paragraph which have the

Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Richard Wordingham via Unicode
On Sat, 26 Jan 2019 21:11:36 -0800 Asmus Freytag via Unicode wrote: > On 1/26/2019 5:43 PM, Richard Wordingham via Unicode wrote: >> That appears to contradict Michael Everson's remark about a >> Polynesian >> need to distinguish the two visually. > Why do you need to d

Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 01:55:29 +0000 James Kass via Unicode wrote: > Richard Wordingham replied to Asmus Freytag, > > >> To make matters worse, users for languages that "should" use > >> U+02BC aren't actually consistent; much data uses U+2019 or > >> U+0027. Ordinary users can't tell the

Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Richard Wordingham via Unicode
On Sat, 26 Jan 2019 17:11:49 -0800 Asmus Freytag via Unicode wrote: > To make matters worse, users for languages that "should" use U+02BC > aren't actually consistent; much data uses U+2019 or U+0027. Ordinary > users can't tell the difference (and spell checkers seem not > successful in

Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Richard Wordingham via Unicode
On Sun, 27 Jan 2019 00:32:43 +0000 Michael Everson via Unicode wrote: > I’ll be publishing a translation of Alice into Ancient Greek in due > course. I will absolutely only use U+2019 for the apostrophe. It > would be wrong for lots of reasons to use U+02BC for this. Please list them. Will

Re: Ancient Greek apostrophe marking elision

2019-01-26 Thread Richard Wordingham via Unicode
On Sat, 26 Jan 2019 15:45:54 +0000 James Kass via Unicode wrote: > Perhaps I'm not understanding, but if the desired behavior is to > prohibit both line and word breaks in the example string, then... > > In Notepad, replacing U+0020 with U+00A0 removes the line-break. I believe the problem is

Re: Ancient Greek apostrophe marking elision

2019-01-25 Thread Richard Wordingham via Unicode
On Fri, 25 Jan 2019 17:02:25 -0500 James Tauber via Unicode wrote: > I guess U+02BC is category Lm not Mn, but doesn't that still mean it > modifies the previous character (i.e. is really part of the same > grapheme cluster) and so isn't appropriate as either a vowel or an > indication of an

Re: Ancient Greek apostrophe marking elision

2019-01-25 Thread Richard Wordingham via Unicode
On Fri, 25 Jan 2019 12:39:47 -0500 James Tauber via Unicode wrote: > Thank you, although the word break does still affect things like > double-clicking to select. > > And people do seem to want to use U+02BC for this reason (and I'm > trying to articulate why that isn't what U+02BC is meant

Re: Encoding italic

2019-01-24 Thread Richard Wordingham via Unicode
On Thu, 24 Jan 2019 18:24:07 +0200 Khaled Hosny via Unicode wrote: > On Thu, Jan 24, 2019 at 03:54:29PM +0000, Andrew West via Unicode > wrote: >> On Thu, 24 Jan 2019 at 15:42, James Kass >> wrote: >>> Going off topic a little, I saw this tweet from Marijn van Putten >>> today which shows

Re: Encoding italic

2019-01-22 Thread Richard Wordingham via Unicode
On Mon, 21 Jan 2019 00:29:42 -0800 David Starner via Unicode wrote: > The superscripts show a problem with multiple encoding; even if you > think they should be Unicode superscripts, and they look like Unicode > superscripts, they might be HTML superscripts. Same thing would happen > with

Re: Encoding italic

2019-01-19 Thread Richard Wordingham via Unicode
On Sun, 20 Jan 2019 03:14:21 +0000 James Kass via Unicode wrote: > (In the event that a persuasive proposal presentation prompts the > possibility of italics encoding...) The use of italic script isn't just restricted to the Latin script, which includes base characters not supported by the

Re: Encoding italic

2019-01-19 Thread Richard Wordingham via Unicode
On Fri, 18 Jan 2019 10:51:18 -0500 "Mark E. Shoulson via Unicode" wrote: > On 1/16/19 6:23 AM, Victor Gaultney via Unicode wrote: > > > > Encoding 'begin italic' and 'end italic' would introduce > > difficulties when partial strings are moved, etc. But that's no > > different than with current

Re: NNBSP

2019-01-18 Thread Richard Wordingham via Unicode
On Fri, 18 Jan 2019 10:20:22 -0800 Asmus Freytag via Unicode wrote: > However, if there's a consensus interpretation of a given character > then you can't just go in and change it, even if it would make that > character work "better" for a given circumstance: you simply don't > know (unless you

Re: Loose character-name matching

2019-01-18 Thread Richard Wordingham via Unicode
On Thu, 17 Jan 2019 18:44:50 -0500 "J. S. Choi" via Unicode wrote: > I’m implementing a Unicode names library. I’m confused about loose > character-name matching, even after rereading The Unicode Standard § > 4.8, UAX #34 § 4, #44 § 5.9.2 – as well as >

Re: NNBSP

2019-01-17 Thread Richard Wordingham via Unicode
On Thu, 17 Jan 2019 18:35:49 +0100 Marcel Schneider via Unicode wrote: > Among the grievances, Unicode is blamed for confusing Greek psili and > dasia with comma shapes, and for misinterpreting Latin letter forms > such as the u with descender taken for a turned h, and double u > mistaken for a

Re: NNBSP

2019-01-17 Thread Richard Wordingham via Unicode
On Thu, 17 Jan 2019 04:51:57 +0100 Marcel Schneider via Unicode wrote: > Also, at least one French typographer was extremely upset > about Unicode not gathering feedback from typographers. > That blame is partly wrong since at least one typographer > was and still is present in WG2, and even if

NNBSP (was: A last missing link for interoperable representation)

2019-01-16 Thread Richard Wordingham via Unicode
On Tue, 15 Jan 2019 13:25:06 +0100 Philippe Verdy via Unicode wrote: > If your fonts behave incorrectly on your system because it does not > map any glyph for NNBSP, don't blame the font or Unicode about this > problem, blame the renderer (or the application or OS using it, may > be they are

Re: A last missing link for interoperable representation

2019-01-14 Thread Richard Wordingham via Unicode
On Mon, 14 Jan 2019 16:02:05 -0800 Asmus Freytag via Unicode wrote: > On 1/14/2019 3:37 PM, Richard Wordingham via Unicode wrote: > On Tue, 15 Jan 2019 00:02:49 +0100 > Hans Åberg via Unicode wrote: > > On 14 Jan 2019, at 23:43, James Kass via Unicode > wrote: >

Re: A last missing link for interoperable representation

2019-01-14 Thread Richard Wordingham via Unicode
On Mon, 14 Jan 2019 06:24:46 +0000 James Kass via Unicode wrote: > Unicode doesn't enforce any spelling or punctuation rules.  Unicode > doesn't tell human beings how to pronounce strings of text or how to > interpret them. These are not statements that are both honest and true. Unicode lays

Re: A last missing link for interoperable representation

2019-01-14 Thread Richard Wordingham via Unicode
On Tue, 15 Jan 2019 00:02:49 +0100 Hans Åberg via Unicode wrote: > > On 14 Jan 2019, at 23:43, James Kass via Unicode > > wrote: > > > > Hans Åberg wrote, > > > > > How about using U+0301 COMBINING ACUTE ACCENT: 푝푎푠푠푒́ > > > > Thought about using a combining accent. Figured it would

Re: A last missing link for interoperable representation

2019-01-14 Thread Richard Wordingham via Unicode
On Mon, 14 Jan 2019 07:47:45 +0000 (GMT) Julian Bradfield via Unicode wrote: > On 2019-01-13, James Kass via Unicode wrote: > > यदि आप किसी रोटरी फोन से कॉल कर रहे हैं, तो कृपया स्टार (*) दबाएं। > > > What happens with Devanagari text?  Should the user community > > refrain from

Re: A last missing link for interoperable representation

2019-01-12 Thread Richard Wordingham via Unicode
On Sat, 12 Jan 2019 14:21:19 +0000 James Kass via Unicode wrote: > FWIW, the math formula: > a + b # 푏 + 푎 > ... becomes invalid if normalized NFKD/NFKC.  (Or if copy/pasted from > an HTML page using marked-up ASCII into a plain-text editor.) (a) Italic versus plain is not significant in the

Re: A last missing link for interoperable representation

2019-01-10 Thread Richard Wordingham via Unicode
On Thu, 10 Jan 2019 23:43:46 +0000 James Kass via Unicode wrote: > The second step would be to persuade Unicode to encode a new > character rather than simply using an existing variation selector > character to do the job. Actually, this might be a superior option. Richard.

Arranging Hieroglyphics (was: A sign/abbreviation for "magister")

2018-11-04 Thread Richard Wordingham via Unicode
On Sat, 3 Nov 2018 22:55:17 +0100 Philippe Verdy via Unicode wrote: > I can also cite the case of Egyptian hieroglyphs: there's still no > way to render them correctly because we lack the development of a > stable orthography that would drive the encoding of the missing > **semantic** characters

Re: UCA unnecessary collation weight 0000

2018-11-02 Thread Richard Wordingham via Unicode
On Fri, 2 Nov 2018 14:27:37 -0700 Ken Whistler via Unicode wrote: > On 11/2/2018 10:02 AM, Philippe Verdy via Unicode wrote: > > UTR#10 still does not explicitly state that its use of "0000" does > > not mean it is a valid "weight", it's a notation only > > No, it is explicitly a valid

Re: use vs mention (was: second attempt)

2018-11-02 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 07:46:40 +0000 Richard Wordingham via Unicode wrote: > On Wed, 31 Oct 2018 23:35:06 +0100 > Piotr Karocki via Unicode wrote: > > > These are only examples of changes in meaning with or , > > not all of these examples can really exist - but, then, anot

Re: A sign/abbreviation for "magister"

2018-11-02 Thread Richard Wordingham via Unicode
On Fri, 02 Nov 2018 08:38:45 -0700 Doug Ewell via Unicode wrote: > Do we have any other evidence of this usage, besides a single > handwritten postcard? What, beyond some of us actually employing it ourselves? I'm sure I've seen 'William' abbreviated in print to 'Wᵐ' with some mark below, but

Re: UCA unnecessary collation weight 0000

2018-11-02 Thread Richard Wordingham via Unicode
On Fri, 2 Nov 2018 14:54:19 +0100 Philippe Verdy via Unicode wrote: > It's not just a question of "I like it or not". But the fact that the > standard makes the presence of 0000 required in some steps, and the > requirement is in fact wrong: this is in fact NEVER required to > create an

Re: A sign/abbreviation for "magister"

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 01 Nov 2018 18:23:05 +0100 "Janusz S. Bień via Unicode" wrote: > On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote: > > I don't think it's a joke to recognize that there is a continuum > > here and that there is no line that can be drawn which is based on > >

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 18:39:16 +0100 Philippe Verdy via Unicode wrote: > What this means is that we can safely implement UCA using basic > substitutions (e.g. with a function like "string:gsub(map)" in Lua > which uses a "map" to map source (binary) strings or regexps, into > target (binary) strings:

Re: UCA unnecessary collation weight 0000

2018-11-01 Thread Richard Wordingham via Unicode
On Thu, 1 Nov 2018 21:13:46 +0100 Philippe Verdy via Unicode wrote: > I'm not speaking just about how collation keys will finally be stored > (as uint16 or bytes, or sequences of bits with variable length); I'm > just referring to the sequence of weights you generate. > You absolutely NEVER
