On Thu, 1 Nov 2018 18:39:16 +0100
Philippe Verdy via Unicode wrote:
> What this means is that we can safely implement UCA using basic
> substitutions (e.g. with a function like "string:gsub(map)" in Lua
> which uses a "map" to map source (binary) strings or regexps, into
> target (binary) strings:
>
On Thu, 01 Nov 2018 18:23:05 +0100
"Janusz S. Bień via Unicode" wrote:
> On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote:
> > I don't think it's a joke to recognize that there is a continuum
> > here and that there is no line that can be drawn which is based on
> > straightfo
On Fri, 2 Nov 2018 14:54:19 +0100
Philippe Verdy via Unicode wrote:
> It's not just a question of "I like it or not". But the fact that the
> standard makes the presence of required in some steps, and the
> requirement is in fact wrong: this is in fact NEVER required to
> create an equivalen
On Fri, 02 Nov 2018 08:38:45 -0700
Doug Ewell via Unicode wrote:
> Do we have any other evidence of this usage, besides a single
> handwritten postcard?
What, beyond some of us actually employing it ourselves? I'm sure I've
seen 'William' abbreviated in print to 'Wᵐ' with some mark below, but
On Thu, 1 Nov 2018 07:46:40 +
Richard Wordingham via Unicode wrote:
> On Wed, 31 Oct 2018 23:35:06 +0100
> Piotr Karocki via Unicode wrote:
>
> > These are only examples of changes in meaning with or ,
> > not all of these examples can really exist - but, then, anot
On Fri, 2 Nov 2018 14:27:37 -0700
Ken Whistler via Unicode wrote:
> On 11/2/2018 10:02 AM, Philippe Verdy via Unicode wrote:
> > UTR#10 still does not explicitly state that its use of "" does
> > not mean it is a valid "weight", it's a notation only
>
> No, it is explicitly a valid weight
On Sat, 3 Nov 2018 22:55:17 +0100
Philippe Verdy via Unicode wrote:
> I can also cite the case of Egyptian hieroglyphs: there's still no
> way to render them correctly because we lack the development of a
> stable orthography that would drive the encoding of the missing
> **semantic** characters
On Thu, 10 Jan 2019 23:43:46 +
James Kass via Unicode wrote:
> The second step would be to persuade Unicode to encode a new
> character rather than simply using an existing variation selector
> character to do the job.
Actually, this might be a superior option.
Richard.
On Sat, 12 Jan 2019 10:57:26 + (GMT)
Julian Bradfield via Unicode wrote:
> It's also fundamentally misguided. When I _italicize_ a word, I am
> writing a word composed of (plain old) letters, and then styling the
> word; I am not composing a new and different word ("_italicize_") that
> is di
On Sat, 12 Jan 2019 14:21:19 +
James Kass via Unicode wrote:
> FWIW, the math formula:
> a + b # 𝑏 + 𝑎
> ... becomes invalid if normalized NFKD/NFKC. (Or if copy/pasted from
> an HTML page using marked-up ASCII into a plain-text editor.)
(a) Italic versus plain is not significant in the mat
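A minimal sketch of the normalization point raised in the quoted message, using only Python's standard `unicodedata` module. The sample strings are illustrative and are not taken from the thread; the helper name `fold` is hypothetical. It shows that compatibility normalization replaces mathematical italic letters with plain ASCII letters, while canonical normalization leaves them alone.

```python
import unicodedata

# Illustrative sample, not from the thread:
# U+1D44F MATHEMATICAL ITALIC SMALL B, U+1D44E MATHEMATICAL ITALIC SMALL A
styled = "\U0001D44F + \U0001D44E"

def fold(text, form="NFKC"):
    """Apply the given normalization form (hypothetical helper name)."""
    return unicodedata.normalize(form, text)

# Compatibility forms drop the <font> distinction.
compat_c = fold(styled, "NFKC")
compat_d = fold(styled, "NFKD")

# Canonical forms keep the mathematical italic letters as they are.
canon_c = fold(styled, "NFC")
```

After either compatibility form, the styled and unstyled letters can no longer be told apart, which is the loss of information the quoted message is concerned with.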
On Mon, 14 Jan 2019 07:47:45 + (GMT)
Julian Bradfield via Unicode wrote:
> On 2019-01-13, James Kass via Unicode wrote:
> > यदि आप किसी रोटरी फोन से कॉल कर रहे हैं, तो कृपया स्टार (*) दबाएं।
>
> > What happens with Devanagari text? Should the user community
> > refrain from interchanging
On Tue, 15 Jan 2019 00:02:49 +0100
Hans Åberg via Unicode wrote:
> > On 14 Jan 2019, at 23:43, James Kass via Unicode
> > wrote:
> >
> > Hans Åberg wrote,
> >
> > > How about using U+0301 COMBINING ACUTE ACCENT: 𝑝𝑎𝑠𝑠𝑒́
> >
> > Thought about using a combining accent. Figured it would just
On Mon, 14 Jan 2019 06:24:46 +
James Kass via Unicode wrote:
> Unicode doesn't enforce any spelling or punctuation rules. Unicode
> doesn't tell human beings how to pronounce strings of text or how to
> interpret them.
These are not statements that are both honest and true. Unicode lays
On Mon, 14 Jan 2019 16:02:05 -0800
Asmus Freytag via Unicode wrote:
> On 1/14/2019 3:37 PM, Richard Wordingham via Unicode wrote:
> On Tue, 15 Jan 2019 00:02:49 +0100
> Hans Åberg via Unicode wrote:
>
> On 14 Jan 2019, at 23:43, James Kass via Unicode
> wrote:
>
On Tue, 15 Jan 2019 13:25:06 +0100
Philippe Verdy via Unicode wrote:
> If your fonts behave incorrectly on your system because it does not
> map any glyph for NNBSP, don't blame the font or Unicode for this
> problem, blame the renderer (or the application or OS using it, maybe
> they are very
On Thu, 17 Jan 2019 04:51:57 +0100
Marcel Schneider via Unicode wrote:
> Also, at least one French typographer was extremely upset
> about Unicode not gathering feedback from typographers.
> That blame is partly wrong since at least one typographer
> was and still is present in WG2, and even if n
On Thu, 17 Jan 2019 18:35:49 +0100
Marcel Schneider via Unicode wrote:
> Among the grievances, Unicode is blamed for confusing Greek psili and
> dasia with comma shapes, and for misinterpreting Latin letter forms
> such as the u with descender taken for a turned h, and double u
> mistaken for a
On Thu, 17 Jan 2019 18:44:50 -0500
"J. S. Choi" via Unicode wrote:
> I’m implementing a Unicode names library. I’m confused about loose
> character-name matching, even after rereading The Unicode Standard §
> 4.8, UAX #34 § 4, #44 § 5.9.2 – as well as
> [L2/13-142](http://www.unicode.org/L2/L2013
On Fri, 18 Jan 2019 10:20:22 -0800
Asmus Freytag via Unicode wrote:
> However, if there's a consensus interpretation of a given character
> then you can't just go in and change it, even if it would make that
> character work "better" for a given circumstance: you simply don't
> know (unless you re
On Fri, 18 Jan 2019 10:51:18 -0500
"Mark E. Shoulson via Unicode" wrote:
> On 1/16/19 6:23 AM, Victor Gaultney via Unicode wrote:
> >
> > Encoding 'begin italic' and 'end italic' would introduce
> > difficulties when partial strings are moved, etc. But that's no
> > different than with current pu
On Sun, 20 Jan 2019 03:14:21 +
James Kass via Unicode wrote:
> (In the event that a persuasive proposal presentation prompts the
> possibility of italics encoding...)
The use of italic script isn't just restricted to the Latin script,
which includes base characters not supported by the math
On Mon, 21 Jan 2019 00:29:42 -0800
David Starner via Unicode wrote:
> The superscripts show a problem with multiple encoding; even if you
> think they should be Unicode superscripts, and they look like Unicode
> superscripts, they might be HTML superscripts. Same thing would happen
> with italics
On Thu, 24 Jan 2019 18:24:07 +0200
Khaled Hosny via Unicode wrote:
> On Thu, Jan 24, 2019 at 03:54:29PM +, Andrew West via Unicode
> wrote:
>> On Thu, 24 Jan 2019 at 15:42, James Kass
>> wrote:
>>> Going off topic a little, I saw this tweet from Marijn van Putten
>>> today which shows exa
On Fri, 25 Jan 2019 12:39:47 -0500
James Tauber via Unicode wrote:
> Thank you, although the word break does still affect things like
> double-clicking to select.
>
> And people do seem to want to use U+02BC for this reason (and I'm
> trying to articulate why that isn't what U+02BC is meant for)
On Fri, 25 Jan 2019 17:02:25 -0500
James Tauber via Unicode wrote:
> I guess U+02BC is category Lm not Mn, but doesn't that still mean it
> modifies the previous character (i.e. is really part of the same
> grapheme cluster) and so isn't appropriate as either a vowel or an
> indication of an omit
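The general categories mentioned in the quoted message can be checked directly. This is a small sketch using Python's standard `unicodedata` module; the dictionary and function names are hypothetical, and the result only reports what the Unicode Character Database records, not how any particular word breaker behaves.

```python
import unicodedata

# The three apostrophe candidates discussed in this thread.
CANDIDATES = {
    "U+0027": "\u0027",  # APOSTROPHE
    "U+02BC": "\u02BC",  # MODIFIER LETTER APOSTROPHE
    "U+2019": "\u2019",  # RIGHT SINGLE QUOTATION MARK
}

def general_categories():
    """Map each candidate to its General_Category value."""
    return {label: unicodedata.category(ch) for label, ch in CANDIDATES.items()}

def is_combining_mark(ch):
    """True only for characters in the M* categories (Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")

categories = general_categories()
```

U+02BC comes out as a modifier letter (Lm) with canonical combining class 0, whereas the other two are punctuation (Po and Pf); none of the three is a combining mark.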
On Sat, 26 Jan 2019 15:45:54 +
James Kass via Unicode wrote:
> Perhaps I'm not understanding, but if the desired behavior is to
> prohibit both line and word breaks in the example string, then...
>
> In Notepad, replacing U+0020 with U+00A0 removes the line-break.
I believe the problem is
On Sun, 27 Jan 2019 00:32:43 +
Michael Everson via Unicode wrote:
> I’ll be publishing a translation of Alice into Ancient Greek in due
> course. I will absolutely only use U+2019 for the apostrophe. It
> would be wrong for lots of reasons to use U+02BC for this.
Please list them.
Will your
On Sat, 26 Jan 2019 17:11:49 -0800
Asmus Freytag via Unicode wrote:
> To make matters worse, users for languages that "should" use U+02BC
> aren't actually consistent; much data uses U+2019 or U+0027. Ordinary
> users can't tell the difference (and spell checkers seem not
> successful in enforcin
On Sun, 27 Jan 2019 01:55:29 +
James Kass via Unicode wrote:
> Richard Wordingham replied to Asmus Freytag,
>
> >> To make matters worse, users for languages that "should" use
> >> U+02BC aren't actually consistent; much data uses U+2019 or
>
On Sat, 26 Jan 2019 21:11:36 -0800
Asmus Freytag via Unicode wrote:
> On 1/26/2019 5:43 PM, Richard Wordingham via Unicode wrote:
>> That appears to contradict Michael Everson's remark about a
>> Polynesian
>> need to distinguish the two visually.
> Why do you need
On Sun, 27 Jan 2019 12:38:39 -0500
"Mark E. Shoulson via Unicode" wrote:
> On 1/27/19 11:08 AM, Michael Everson via Unicode wrote:
> > It is a letter. In “can’t” the apostrophe isn’t a letter. It’s a
> > mark of elision. I can double-click on the three words in this
> > paragraph which have the
On Sun, 27 Jan 2019 16:11:12 +
Michael Everson via Unicode wrote:
> Yes, yes. It doesn’t matter. The discussion applies to both the two
> quotation marks and the two modifier letters.
Actually, there is a difference. As the ʻokina doesnʹt occur at the
end of a word in Hawaiian, one only str
On Sun, 27 Jan 2019 14:09:31 -0500
James Tauber via Unicode wrote:
> On Sun, Jan 27, 2019 at 1:22 PM Richard Wordingham via Unicode <
> unicode@unicode.org> wrote:
> > However LibreOffice treats "don't" as a single word for U+0027,
> > U+02BC and U+2019
On Sun, 27 Jan 2019 19:57:37 +
James Kass via Unicode wrote:
> On 2019-01-27 7:09 PM, James Tauber via Unicode wrote:
> > In my original post, I asked if a language-specific tailoring of
> > the text segmentation algorithm was the solution but no one here
> > has agreed so far.
> If there a
On Mon, 28 Jan 2019 03:48:52 +
James Kass via Unicode wrote:
> It’s been said that the text segmentation rules seem over-complicated
> and are probably non-trivial to implement properly. I tried your
> suggestion of WORD JOINER U+2060 after tau ( γένοιτ’ ἄν ), but it
> only added yet anot
On Mon, 28 Jan 2019 08:31:40 +0100
Mark Davis ☕️ via Unicode wrote:
> But the question is how important those are in daily life. I'm not
> sure why the double-click selection behavior is so much more of a
> problem for Ancient Greek users than it is for the somewhat larger
> community of English u
On Mon, 28 Jan 2019 21:10:19 -0500
"Mark E. Shoulson via Unicode" wrote:
> On 1/28/19 3:58 PM, Richard Wordingham via Unicode wrote:
> > Interestingly, bringing this word breaker into line with TUS in the
> > UK may well be in breach of the Equality Act 2010.
> >
On Mon, 28 Jan 2019 20:55:39 -0500
"Mark E. Shoulson via Unicode" wrote:
> On 1/28/19 2:31 AM, Mark Davis ☕️ via Unicode wrote:
> >
> > But the question is how important those are in daily life. I'm not
> > sure why the double-click selection behavior is so much more of a
> > problem for Ancien
On Wed, 30 Jan 2019 15:33:38 +0100
Frédéric Grosshans via Unicode wrote:
> On 30/01/2019 at 14:36, Egmont Koblinger via Unicode wrote:
> > - It doesn't do Arabic shaping. In my recommendation I'm arguing
> > that in this mode, where shuffling the characters is the task of
> > the text editor an
On Wed, 30 Jan 2019 20:35:36 -0500
"Mark E. Shoulson via Unicode" wrote:
> On 1/30/19 8:58 AM, Egmont Koblinger via Unicode wrote:
> > There's another side to the entire BiDi story, though. Simple
> > utilities like "echo", "cat", "ls", "grep" and so on, line editing
> > experience of your shell,
On Thu, 31 Jan 2019 08:28:41 +
Martin J. Dürst via Unicode wrote:
> > Basic Arabic shaping, at the level of a typewriter, is
> > straightforward enough to leave to a terminal emulator, as Eli has
> > suggested. Lam-alif would be trickier - one cell or two?
>
> Same for other characters. A
On Thu, 31 Jan 2019 12:46:48 +0100
Egmont Koblinger wrote:
> Hi Richard,
>
> > Basic Arabic shaping, at the level of a typewriter, is
> > straightforward enough to leave to a terminal emulator, as Eli has
> > suggested.
>
> What is "basic" Arabic shaping exactly?
Just using initial, medial a
On Fri, 1 Feb 2019 13:02:45 +0200
Khaled Hosny via Unicode wrote:
> On Thu, Jan 31, 2019 at 11:17:19PM +0000, Richard Wordingham via
> Unicode wrote:
> > On Thu, 31 Jan 2019 12:46:48 +0100
> > Egmont Koblinger wrote:
> >
> > No. How many cells do CJK ideograp
On Fri, 1 Feb 2019 14:47:22 +0100
Egmont Koblinger via Unicode wrote:
> Hi Ken,
>
> > [language tag]
> > That is a complete non-starter for the Unicode Standard.
>
> Thanks for your input!
>
> (I hope it was clear that I just started throwing in random ideas, as
> in a brainstorming session.
On Sat, 02 Feb 2019 00:38:04 +0100
Kent Karlsson via Unicode wrote:
> On 2019-02-01 19:57, "Richard Wordingham via Unicode" wrote
> :
> "Monospaced font" is really a concept with modification. Even for
> "plain old ASCII" there are two advance widths
On Fri, 01 Feb 2019 15:18:13 -0700
Doug Ewell via Unicode wrote:
> Richard Wordingham wrote:
>
> > Language tagging is already available in Unicode, via the tag
> > characters in the deprecated plane.
>
> Plane 14 isn't deprecated -- that isn't a p
On Fri, 1 Feb 2019 15:15:53 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> On Fri, Feb 1, 2019 at 12:19 AM Richard Wordingham via Unicode
> wrote:
>
> > Cropped why? If the problem is the truncation of lines, one can
> > simply store the next character
On Sat, 02 Feb 2019 14:01:46 +0100
Kent Karlsson via Unicode wrote:
> On 2019-02-02 12:17, "Egmont Koblinger" wrote:
> > Most terminal emulators handle non-spacing combining marks, it's a
> > piece of cake. (Spacing marks are more problematic.)
> Well, I guess you may need to put some (prac
On Sat, 2 Feb 2019 13:18:03 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> On Sat, Feb 2, 2019 at 12:43 PM Richard Wordingham via Unicode
> wrote:
>
> > I'm not conversant with the details of terminal controls and I
> > haven't used fields.
On Sat, 2 Feb 2019 12:54:16 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> > > Are they okay to be present in visual order (the terminal's
> > > explicit mode, what we're discussing now) too?
> >
> > Where do you define the order for explicit mode?
>
> In explicit mode, the app
On Sat, 02 Feb 2019 20:58:06 +0100
Benjamin Riefenstahl via Unicode wrote:
> Hi Egmont, hi all,
>
>
> This is an interesting discussion here. If only because I would have
> thought that there is only minimal interest by the actual target
> audience in supporting these scripts in a terminal, giv
On Sat, 2 Feb 2019 23:02:10 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> On Sat, Feb 2, 2019 at 9:57 PM Richard Wordingham
> wrote:
>
> > Seriously, you need to give a definition of 'visual order' for this
> > context. Not everyone shar
On Sun, 03 Feb 2019 02:01:18 +0100
Kent Karlsson via Unicode wrote:
> On 2019-02-02 16:12, "Richard Wordingham via Unicode" wrote
> :
> > Doesn't Jerusalem in biblical Hebrew sometime have 3 marks below the
> > lamedh? The depth then is the maximum dep
On Sun, 03 Feb 2019 18:14:53 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Sun, 3 Feb 2019 02:43:06 +
> > Cc: Kent Karlsson
> > From: Richard Wordingham via Unicode
> >
> > So, what do you recommend I run grep from for Hebrew or Tai Lue?
>
> In
On Sun, 03 Feb 2019 18:05:49 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Sat, 2 Feb 2019 21:49:40 +
> > From: Richard Wordingham via Unicode
> >
> > Eli will probably tell me I'm behind the times, but there are a few
> > places where a Gnome-terminal
On Sun, 03 Feb 2019 20:07:51 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Sun, 3 Feb 2019 17:45:06 +
> > From: Richard Wordingham via Unicode
> >
> > > > So, what do you recommend I run grep from for Hebrew or Tai
> > > > Lue?
> >
On Sun, 03 Feb 2019 18:13:06 +0200
Eli Zaretskii via Unicode wrote:
> Actually, you pass the characters to be shaped in logical order, and
> then display the produced grapheme clusters in visual order.
Some early systems supporting computerised Hebrew script did pass
characters in left-to-right
On Sun, 03 Feb 2019 19:50:50 +0200
Eli Zaretskii via Unicode wrote:
> Do you see how this is carefully formatted to avoid overflowing an
> 80-column line of a typical terminal? Now suppose this is translated
> into a RTL language, which causes the Copyright line to start with a
> strong R letter
On Mon, 4 Feb 2019 00:36:23 +0100
Egmont Koblinger via Unicode wrote:
> I wish to store and deliver the following text, as it's laid out here
> in logical order. That is, the order as the bytes appear in the text
> file, as I typed them from the keyboard, is laid out here strictly
> from left to
On Mon, 4 Feb 2019 00:36:23 +0100
Egmont Koblinger via Unicode wrote:
> Now, back to terminals.
>
> The smallest possible viable definition of a "paragraph" in terminal
> emulators is stuff between one newline and the next one.
>
> It would require a hell of a lot of work, redesigning (overcomplicat
On Mon, 04 Feb 2019 18:53:22 +0200
Eli Zaretskii via Unicode wrote:
> Date: Mon, 4 Feb 2019 01:19:21 +
> From: Richard Wordingham via Unicode
>> If you look at it in Notepad, all
>> lines will be LTR or all lines will be RTL.
> That's because Notepad implemen
On Sun, 03 Feb 2019 18:03:37 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Sun, 3 Feb 2019 03:02:13 +0100
> > Cc: unicode@unicode.org
> > From: Egmont Koblinger via Unicode
> >
> > > All I am saying is that your proposal should define what it means
> > > by visual order.
> >
> > Are you
On Sun, 3 Feb 2019 20:50:03 +
Richard Wordingham via Unicode wrote:
> On Sun, 03 Feb 2019 20:07:51 +0200
> Eli Zaretskii via Unicode wrote:
> Which is why I try to remember to issue the emacs command 'M-x shell'
> and issue grep commands from the buffer
On Mon, 04 Feb 2019 22:39:07 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Mon, 4 Feb 2019 19:45:13 +
> > From: Richard Wordingham via Unicode
> >
> > Yes. If one has a text composed of LTR and RTL paragraphs, one has
> > to choose how far apart their
On Tue, 5 Feb 2019 00:08:10 +0100
Egmont Koblinger via Unicode wrote:
> Hi Eli,
>
> > Actually, UAX#9 defines "paragraph" as the chunk of text delimited
> > by paragraph separator characters. This means characters whose bidi
> > category is B, which includes Newline, the CR-LF pair on Windows,
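The definition quoted above (a paragraph is the text between characters of bidi class B) can be sketched with Python's standard `unicodedata` module. The function name `split_bidi_paragraphs` is hypothetical, and treating CR LF as a single separator follows the remark in the quoted text; this is an illustration, not an implementation of UAX #9.

```python
import unicodedata

def split_bidi_paragraphs(text):
    """Split text at characters whose Bidi_Class is B (paragraph separator).

    A CR immediately followed by LF is consumed as one separator.
    """
    paragraphs = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if unicodedata.bidirectional(ch) == "B":
            if ch == "\r" and i + 1 < len(text) and text[i + 1] == "\n":
                i += 1  # skip the LF of a CR LF pair
            paragraphs.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if current:
        paragraphs.append("".join(current))
    return paragraphs

sample = "abc\r\ndef\nghi\u2029jkl"
parts = split_bidi_paragraphs(sample)
```

Note that U+2028 LINE SEPARATOR has bidi class WS rather than B, so under this definition it does not end a paragraph.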
On Mon, 4 Feb 2019 22:27:39 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> > The concept appears to exist in the form of the fields of the
> > fifth edition of ECMA-48. Have you digested this ambitious
> > standard?
>
> To be honest: No, I haven't. And I have no idea what those
On Tue, 5 Feb 2019 16:01:41 +
Andrew West via Unicode wrote:
> You would
> have to first convert any text to be italicized to NFD, then apply
> VS14 to each non-combining character. This alone would make a VS
> solution unacceptable in my opinion.
What is so unacceptable about having to do t
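The two-step procedure described in the quoted text (decompose to NFD, then attach VS14 to each non-combining character) can be sketched as below. This is purely hypothetical: no italic variation sequences with VS14 (U+FE0D) are defined by Unicode, the function name `tag_with_vs14` is invented for illustration, and restricting the selector to letters is a simplifying assumption of this sketch.

```python
import unicodedata

VS14 = "\uFE0D"  # VARIATION SELECTOR-14; the italic meaning is hypothetical

def tag_with_vs14(text):
    """Decompose to NFD, then place VS14 after each letter.

    Combining marks are left untagged, so the selector always sits
    between a base letter and any marks that follow it.
    Assumption: only letters (category L*) are tagged.
    """
    out = []
    for ch in unicodedata.normalize("NFD", text):
        out.append(ch)
        if unicodedata.category(ch).startswith("L"):
            out.append(VS14)
    return "".join(out)

tagged = tag_with_vs14("\u00E9")  # precomposed e with acute
```

For the precomposed input, the result is the base letter, then the selector, then U+0301, which illustrates why the text has to be decomposed before the selector can be applied.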
On Wed, 6 Feb 2019 22:01:59 +0100
Egmont Koblinger via Unicode wrote:
> Hi Eli,
>
> (I'm getting lost where to reply, and how the subject gets mangled and
> the thread split into different ones.)
>
>
> I've thought about it a lot, experimented with Emacs's behavior, and
> I've arrived at the c
On Thu, 7 Feb 2019 00:45:55 +0100
Egmont Koblinger via Unicode wrote:
> Hi Richard,
>
> > Not necessarily. One could allow the first strong character in the
> > prompt to determine the paragraph directions
>
> How does Emacs know what's a prompt? How can it tell it from the
> previous and ne
On Thu, 07 Feb 2019 22:00:20 +0200
Eli Zaretskii via Unicode wrote:
> > From: Egmont Koblinger
> > Date: Thu, 7 Feb 2019 19:01:33 +0100
> > On Thu, Feb 7, 2019 at 6:53 PM Eli Zaretskii wrote:
> > > No, it needs no interaction. Unless the regexp doesn't work for
> > > you, which you should th
On Fri, 8 Feb 2019 00:38:24 +0100
Egmont Koblinger via Unicode wrote:
> I, for one, am not in the slightest bit interested in abandoning the
> character grid and allowing for proportional fonts. This would just
> break a gazillion of things.
The message I take from that and this thread in genera
On Fri, 08 Feb 2019 15:45:15 +0200
Eli Zaretskii via Unicode wrote:
> > From: Egmont Koblinger
> > Date: Fri, 8 Feb 2019 13:30:42 +0100
> > Cc: Richard Wordingham ,
> > unicode Unicode Discussion
> >
> > Hi Eli,
> >
> > > Not sure
On Fri, 08 Feb 2019 11:34:29 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Fri, 8 Feb 2019 06:40:44 +
> > From: Richard Wordingham via Unicode
> >
> > > I, for one, am not in the slightest bit interested in abandoning
> > > the character grid and a
On Fri, 8 Feb 2019 17:16:09 + (GMT)
"wjgo_10...@btinternet.com via Unicode" wrote:
> Andrew West wrote:
>> Just reminding you that "The initial character in a variation
>> sequence
>> is never a nonspacing combining mark (gc=Mn) or a canonical
>> decomposable character" (The Unicode Standa
On Fri, 8 Feb 2019 22:29:57 +0100
Egmont Koblinger via Unicode wrote:
> Some terminal emulators have made up some new SGR modes, e.g. ESC[4:3m
> for curly underline. What to do with them? Where to draw the line what
> to add to Unicode and what not to? Will Unicode possibly be a
> bottleneck of f
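For readers unfamiliar with the notation, the sequence mentioned above can be built as follows. This is a small sketch in Python; the helper names `sgr` and `curly_underline` are hypothetical, the `4:3` sub-parameter is taken from the quoted message, and whether a given terminal honours it is not guaranteed. Parameter 24 is the standard SGR code for "not underlined".

```python
ESC = "\x1b"

def sgr(params):
    """Build a Select Graphic Rendition control sequence: ESC [ params m."""
    return f"{ESC}[{params}m"

def curly_underline(text):
    """Wrap text in the 4:3 (curly underline) mode, then switch underline off."""
    return sgr("4:3") + text + sgr("24")

wrapped = curly_underline("word")
# print(wrapped) would show the effect only in emulators that support 4:3.
```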
On Sat, 09 Feb 2019 00:16:30 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Fri, 8 Feb 2019 21:55:58 +
> > From: Richard Wordingham via Unicode
> > I will give a concrete application. If I want to make a font that
> > is interpretable for Tai Tham and maximal
On Fri, 8 Feb 2019 14:26:28 -0800
Asmus Freytag via Unicode wrote:
> On 2/8/2019 2:08 PM, Richard Wordingham via Unicode wrote:
> On Fri, 8 Feb 2019 17:16:09 + (GMT)
> "wjgo_10...@btinternet.com via Unicode" wrote:
>
> Andrew West wrote:
>
> Just reminding
On Fri, 8 Feb 2019 18:08:34 -0800
Asmus Freytag via Unicode wrote:
> On 2/8/2019 5:42 PM, James Kass via Unicode wrote:
> You are still making the assumption that selecting a different glyph
> for the base character would automatically lead to the selection of a
> different glyph for the combin
On Sat, 09 Feb 2019 09:42:09 +0200
Eli Zaretskii via Unicode wrote:
> > Date: Sat, 9 Feb 2019 00:18:14 +
> > From: Richard Wordingham via Unicode
> >
> > > For character composition, you must have a shaping engine to talk
> > > to, and the shap
On Sat, 9 Feb 2019 04:52:30 -0800
David Starner via Unicode wrote:
> Note that this is actually the only thing that stands out to me in
> Unicode not supporting older character sets; in PETSCII (Commodore
> 64), the high-bit characters were the reverse (in this
> sense) of the low-bit c
On Sat, 9 Feb 2019 13:02:55 -0800
"Asmus Freytag \(c\) via Unicode" wrote:
> To force Hindi crosswords mode you need to segment the string into
> syllables,
> each having a variable number of characters, and then assign a single
> display
> position to them. Now some syllables are wider than ot
On Sat, 9 Feb 2019 22:31:37 +0100
Egmont Koblinger via Unicode wrote:
> Let's take the Devanagari improvement of the other day. Until now,
> there were plenty of dotted circles shown, and combining spacing marks
> that should've been placed before the letter were placed after the
> letter, before
On Sat, 9 Feb 2019 22:29:31 +0100
Adam Borowski via Unicode wrote:
> On Sat, Feb 09, 2019 at 10:01:21PM +0200, Eli Zaretskii via Unicode
> wrote:
> > I don't know. Maybe it keeps a database of character combinations
> > that need shaping, each one with the maximum width on display the
> > resul
On Sat, 9 Feb 2019 18:42:52 +0100
Egmont Koblinger via Unicode wrote:
> The
> problem that I don't know how to address is: What if harfbuzz tells us
> that the overall width for rendering a particular grapheme cluster is
> significantly different from its designated area (the number of
> charact
On Sun, 10 Feb 2019 00:59:46 +0100
Egmont Koblinger via Unicode wrote:
> Is there such a monospace font obeying wcwidth (that is: double wide
> character for when a spacing mark is combined) for Devanagari?
For CV, that would correspond to a Hindi typewriter, so the odds look
good. The Remington
On Sun, 10 Feb 2019 14:54:39 +0100
Philippe Verdy via Unicode wrote:
> On Sat, 9 Feb 2019 at 20:55, Egmont Koblinger via Unicode <
> unicode@unicode.org> wrote:
>
> > Hi Asmus,
> >
> > > On quick reading this appears to be a strong argument why such
> > > emulators
> > will
> > > nev
On Tue, 12 Feb 2019 13:50:00 +0100
Egmont Koblinger via Unicode wrote:
> For
> starters, I'd love to see a shell with interactive line editing (like
> bash, zsh),...
Bash already seems to handle proportional fonts quite well when run
under Emacs 'M-x shell', which is more than can be said for bas
Where can I find the InSc properties of characters as overridden for
the USE of Windows?
I am trying to work out why on MS Edge I am now getting dotted circles
before U+1A7A TAI THAM SIGN RA HAAM in all of:
ᩆᩢᨠ᩠ᨯᩥ᩺ rank /sak/ ,
ᨾᩉᩣᩉᩥᨦ᩠ᨣᩩ᩺ giant fennel /ma haː hiŋ/
and
ᩆᩣᩈ᩠ᨲᩕ᩺ science /saːt/ ?
On Fri, 22 Feb 2019 09:07:06 +
Richard Wordingham via Unicode wrote:
> My best hypothesis (not thoroughly tested) is that Windows currently
> has InSc=Consonant_Killer, but can I look this up as opposed to
> effectively devising a test suite for USE on Office?
That question's
On Fri, 22 Feb 2019 22:19:25 +
Andrew Glass wrote:
> Thank you Richard for pointing out the issue with 0x1A7A
> I've looked into this and found an error in our tooling that has
> mapped this to Halant. Based on the spec this should be VAbv. I've
> filed a bug.
Thanks. Will the correcti
On Sat, 23 Feb 2019 14:46:27 +0800
梁海 Liang Hai via Unicode wrote:
> >>> once the USE acknowledges that subjoined consonants may follow
> >>> vowels
> >>
> >> I expect to update the USE spec to address this soon.
> >
> > That seems welcome news. I still don't know what the problem with
>
On Sat, 23 Feb 2019 14:46:27 +0800
梁海 Liang Hai via Unicode wrote:
> USE wasn’t designed to allow such a syllable structure. Tai Tham’s
> being supported by USE is kind of an oversight. And although it’s
> appropriate to allow conjoined consonants to follow post-base-spacing
> vowel signs,
There
Is there any reason why U+0310 COMBINING CANDRABINDU has scx=Inherited
rather than scx=Latn? The only language I've seen the character used
in is Sanskrit, and the only script I've seen it in is the Latin
script.
Richard.
Which character should one use for a danda in the Latin script? I
believed normal usage is to use U+0964 DEVANAGARI DANDA, but for some
reason its script extension property does not include the Latin script.
Richard.
On Fri, 19 Apr 2019 01:52:15 +0200
Marius Spix via Unicode wrote:
> The Wikipedia page states, U+0310 is a general-purpose combining
> diacritical mark. I would treat it similarly to U+0308 (COMBINING
> DIAERESIS) or U+030C (COMBINING CARON), which are both characters with
> multiple names and di
On Fri, 19 Apr 2019 11:36:16 +0530
Shriramana Sharma via Unicode wrote:
> On 4/19/19, Richard Wordingham via Unicode
> wrote:
> > That's a fair point. My problem is that someone is claiming of
> > U+0310 that "Somewhere in the Unicode specifications is a footnote
On Fri, 19 Apr 2019 19:54:47 +0530
Shriramana Sharma wrote:
> Or maybe the Grantha candrabindu can be used, since there is already
> evidence for mixed usage of the scripts and nukta characters have been
> encoded for Tamil usage in the Grantha block for this same reason
> despite Grantha users o
Begin forwarded message:
Date: Fri, 19 Apr 2019 11:30:32 +0100
From: Richard Wordingham
To: Shriramana Sharma
Subject: Re: Latin Script Danda
On Fri, 19 Apr 2019 11:33:35 +0530
Shriramana Sharma via Unicode wrote:
> We are using the pipe character as it is readily available in