> Date: Wed, 26 Apr 2017 07:45:07 +0100
> From: Richard Wordingham via Unicode
>
> On Wed, 26 Apr 2017 08:48:13 +0300
> Eli Zaretskii via Unicode wrote:
>
> > > Date: Sun, 23 Apr 2017 22:59:49 +0100
> > > From: Richard Wordingham
On Wed, 26 Apr 2017 08:48:13 +0300
Eli Zaretskii via Unicode wrote:
> > Date: Sun, 23 Apr 2017 22:59:49 +0100
> > From: Richard Wordingham
> > Cc: Eli Zaretskii
> >
> > If I search for CGJ, highlighting it is frequently
> Date: Sun, 23 Apr 2017 22:59:49 +0100
> From: Richard Wordingham
> Cc: Eli Zaretskii
>
> If I search for CGJ, highlighting it is frequently supremely useless.
> I want to know where it is; highlighting is merely a tool to find it on
> the screen.
Quote from below:
The word indeed means 'danger' (Pali/Sanskrit _antarāya_). The
pronunciation is /ʔontʰalaːi/; the Tai languages that use(d) the Tai
Tham script no longer have /r/. The older sequence /tr/ normally
became /tʰ/ (except in Lao), but the spelling has not been updated - at
least,
On Mon, 24 Apr 2017 20:53:12 +0530
Naena Guru via Unicode wrote:
> Quote by Richard:
> Unless this implies a spelling reform for many languages, I'd like to
> see how this works for the Tai Tham script. I'm not happy with the
> Romanisation I use to work round hostile
Quote by Richard:
Unless this implies a spelling reform for many languages, I'd like to
see how this works for the Tai Tham script. I'm not happy with the
Romanisation I use to work round hostile rendering engines. (My
scheme is only documented in variable hack_ss02 in the last script
blocks of
On Mon, 24 Apr 2017 00:36:26 +0530
Naena Guru via Unicode wrote:
> The Unicode approach to Sanskrit and all Indic is flawed. Indic
> should not be letter-assembly systems.
>
> Sanskrit vyaakaraNa (grammar) explains the phonemes as the atoms of
> the speech. Each writing
On Sun, 23 Apr 2017 05:40:29 +0300
Eli Zaretskii via Unicode wrote:
> > The cursor moves to the cluster boundary, so there is much less of a
> > problem with Emacs.
>
> But you wanted to highlight only part of the cluster, AFAIU.
If I search for CGJ, highlighting it is
The Unicode approach to Sanskrit and all Indic is flawed. Indic should
not be letter-assembly systems.
Sanskrit vyaakaraNa (grammar) explains the phonemes as the atoms of the
speech. Each writing system then assigns a shape to the phonetically
precise phoneme.
The most technically and
On 4/22/2017 9:25 PM, Manish Goregaokar
via Unicode wrote:
Backspace in browsers (chrome and firefox) deletes within EGCs too.
They delete matras in devanagari, and jamos in hangul. They don't
*exactly* work off of code points (e.g. flag emoji gets deleted as a
> You cannot even
> meaningfully move by single characters in most clusters, because
> composing characters generally completely changes how the original
> characters looked, so there's nowhere you can display the cursor.
Yes, and this is one of the reasons it feels broken in devanagari, you
get
> Date: Sun, 23 Apr 2017 00:51:59 +0100
> Cc: Julian Bradfield
> From: Richard Wordingham via Unicode
>
> On Sat, 22 Apr 2017 21:39:42 +0100 (BST)
> Julian Bradfield via Unicode wrote:
>
> > On 2017-04-22, Eli Zaretskii via
On Sat, 22 Apr 2017 21:39:42 +0100 (BST)
Julian Bradfield via Unicode wrote:
> On 2017-04-22, Eli Zaretskii via Unicode wrote:
> > I could imagine Emacs decomposing characters temporarily when only
> > part of a cluster matches the search string.
On 2017-04-22, Eli Zaretskii via Unicode wrote:
>> From: Richard Wordingham via Unicode
[...]
>> I've encountered the problem that, while at least I can search for
>> text smaller than a cluster, there's no indication in the window of
>> where in the
> Date: Sat, 22 Apr 2017 17:13:36 +0100
> From: Richard Wordingham via Unicode
>
> > Movement by grapheme
> > cluster is AFAIK the most natural way of moving in complex scripts.
>
> Evidence?
Personal experience?
> It's easiest for displaying the cursor.
It's the _only_
On Sat, 22 Apr 2017 13:34:32 +0300
Eli Zaretskii via Unicode wrote:
> AFAIR, Emacs allows one to _delete_ individual characters,
> i.e. Backspace and C-d delete character-by-character, so the problem
> shouldn't be so grave for imperfect typists.
Deleting forwards by one
> Date: Sat, 22 Apr 2017 11:13:16 +0100
> From: Richard Wordingham via Unicode
>
> At present these are split into two and three grapheme clusters
> respectively, and LibreOffice cursor movement responds accordingly.
> (SIGN AA starts a grapheme cluster in several scripts of
On Fri, 21 Apr 2017 16:27:43 -0700
Manish Goregaokar via Unicode wrote:
> > Do Hindi speakers really think of orthographic syllables as
> > characters?
>
> When rendered as a cluster, yes? I've asked around, and folks seem to
> insist on coupling it to the rendering.
> Do Hindi speakers really think of orthographic syllables as characters?
When rendered as a cluster, yes? I've asked around, and folks seem to
insist on coupling it to the rendering. Given most fonts render
*normal* (common, etc) clusters, I think making them EGCs and looking
at nonrendered
On Thu, 20 Apr 2017 11:17:05 -0700
Manish Goregaokar via Unicode wrote:
> On Wed, Apr 19, 2017 at 4:35 PM, Richard Wordingham via Unicode
> wrote:
> > Is there consensus on how to count aksharas in the Devanagari
> > script? The doubts I have relate to
That seems like a relatively niche use case (especially with Vedic
Sanskrit) compared to having weird selection for everything else. I'm
not convinced. When I use a romanized Devanagari input method (I
typically do on my laptop), deleting the whole cluster is necessary
anyway for things to work
On Fri, 21 Apr 2017 00:08:24 -0500
Anshuman Pandey via Unicode wrote:
> > On Apr 20, 2017, at 8:19 PM, Richard Wordingham via Unicode
> > wrote:
> > Now imagine you're
> > typing Vedic Sanskrit, with its clusters and pitch indicators.
> I tried
> On Apr 20, 2017, at 8:19 PM, Richard Wordingham via Unicode
> wrote:
>
> On Thu, 20 Apr 2017 14:14:00 -0700
> Manish Goregaokar via Unicode wrote:
>
>> On Thu, Apr 20, 2017 at 12:14 PM, Richard Wordingham via Unicode
>> wrote:
On Thu, 20 Apr 2017 14:14:00 -0700
Manish Goregaokar via Unicode wrote:
> On Thu, Apr 20, 2017 at 12:14 PM, Richard Wordingham via Unicode
> wrote:
> > On Thu, 20 Apr 2017 11:17:05 -0700
> > Manish Goregaokar via Unicode wrote:
>
I mean, we do the same for Hangul.
The main time you need intra-conjunct segmentation in Devanagari is
when deleting something you just typed. And backspace usually operates
on code points anyway (except for some weird cases like flag emoji,
though this isn't uniform across platforms). I don't
On Thu, 20 Apr 2017 11:17:05 -0700
Manish Goregaokar via Unicode wrote:
> When given a rendered representation people seem to uniformly count
> conjuncts as multiple aksharas if rendered with visible halant, and as
> a single akshara if they are rendered conjoined.
Now,
On Thu, 20 Apr 2017 15:33:37 +0530
Shriramana Sharma via Unicode wrote:
> All I can say is that Tamil script has eschewed most consonant cluster
> ligatures/conjoining forms. As for Devanagari, writing श्रीमान्को (I
> used ZWNJ) i.o. श्रीमान्को is quite possible with
I don't think there's consensus.
When given a rendered representation people seem to uniformly count
conjuncts as multiple aksharas if rendered with visible halant, and as
a single akshara if they are rendered conjoined.
Most fonts for devanagari these days are pretty good at conjoining
Hello Richard. Yes my earlier reply wasn't intended to be offlist. I
have near-zero knowledge about non-Indic languages.
All I can say is that Tamil script has eschewed most consonant cluster
ligatures/conjoining forms. As for Devanagari, writing श्रीमान्को (I
used ZWNJ) i.o. श्रीमान्को is quite
I was offered the following reply:
> To my knowledge except in Tamil script vowel less consonants in
> written form aren't considered as separate "akshara"s in native
> terminology.
Word-finally they seem to be being treated as such. To be more
precise, a final cluster of one or more consonants