Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Philippe Verdy via Unicode
So can I conclude that what the Ethnologue displays (using a private-use
ISO 15924 code, "Qabl") is wrong?
And that translations classified under "mgp-Brah" are fine (while
"mgp-Qabl" would be unusable for interchange)?
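
As an illustration of the interchange concern, here is a minimal Python
sketch that tests whether the script subtag of a language tag falls in the
ISO 15924 private-use range (Qaaa through Qabx). The function names
is_private_use_script and script_subtag are hypothetical helpers, and the
tag parsing is a deliberate simplification, not a full BCP 47 parser.

```python
import re
from typing import Optional

# Four ASCII letters: the shape of an ISO 15924 script code.
_SCRIPT_RE = re.compile(r"^[A-Za-z]{4}$")


def is_private_use_script(code: str) -> bool:
    """Return True if `code` lies in the private-use range Qaaa..Qabx."""
    if not _SCRIPT_RE.match(code):
        raise ValueError(f"not a four-letter script code: {code!r}")
    # Normalize to title case so that the comparison is lexicographic.
    normalized = code.capitalize()
    return "Qaaa" <= normalized <= "Qabx"


def script_subtag(tag: str) -> Optional[str]:
    """Return the first four-letter alphabetic subtag after the language.

    Simplifying assumption: the tag has no private-use ("x-...") section
    or extensions that could contain a four-letter subtag.
    """
    for part in tag.split("-")[1:]:
        if _SCRIPT_RE.match(part):
            return part.capitalize()
    return None


if __name__ == "__main__":
    for tag in ("mgp-Brah", "mgp-Qabl", "mgp"):
        script = script_subtag(tag)
        private = script is not None and is_private_use_script(script)
        print(tag, script, private)
```

A tag whose script subtag is in this range only has meaning by private
agreement between the parties exchanging the data, which is why "mgp-Qabl"
is questionable for open interchange.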


On Tue, Jul 23, 2019 at 02:42, Anshuman Pandey wrote:

> As I pointed out in L2/11-144, the “Magar Akkha” script is an
> appropriation of Brahmi, renamed to link it to the primordialist daydreams
> of an ethno-linguistic community in Nepal. I have never seen actual usage
> of the script by Magars. If things have changed since 2011, I would very
> much welcome such information. Otherwise, the so-called “Magar Akkha” is
> not suitable for encoding. The Brahmi encoding that we have should suffice.
>
> All my best,
> Anshu
>
> On Jul 22, 2019, at 10:06 AM, Lorna Evans via Unicode 
> wrote:
>
> Also: https://scriptsource.org/scr/Qabl
>
>
> On Mon, Jul 22, 2019, 12:47 PM Ken Whistler via Unicode <
> unicode@unicode.org> wrote:
>
>> See the entry for "Magar Akkha" on:
>>
>> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>>
>> Anshuman Pandey did preliminary research on this in 2011.
>>
>> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>>
>> It would be premature to assign an ISO 15924 script code, pending the
>> research to determine whether this script should be separately encoded.
>>
>> --Ken
>> On 7/22/2019 9:16 AM, Philippe Verdy via Unicode wrote:
>>
>> According to Ethnologue, the Eastern Magar language (mgp) is written in two
>> scripts: Devanagari and "Akkha".
>>
>> But the "Akkha" script does not seem to have any ISO 15924 code.
>>
>> The Ethnologue currently assigns a private use code (Qabl) for this
>> script.
>>
>> Was the addition delayed due to lack of evidence (even if this language
>> is official in Nepal and India) ?
>>
>> Did the editors of Ethnologue submit an addition request for that script
>> (e.g. for the code "Akkh" or "Akha" ?)
>>
>> Or is it considered unified with another script that could explain why it
>> is not coded ? If this is a variant it could have its own code (like
>> Nastaliq in Arabic). Or maybe this is just a subset of another
>> (Sino-Tibetan) script ?
>>
>>
>>
>>


Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Anshuman Pandey via Unicode
As I pointed out in L2/11-144, the “Magar Akkha” script is an appropriation of 
Brahmi, renamed to link it to the primordialist daydreams of an 
ethno-linguistic community in Nepal. I have never seen actual usage of the 
script by Magars. If things have changed since 2011, I would very much welcome 
such information. Otherwise, the so-called “Magar Akkha” is not suitable for 
encoding. The Brahmi encoding that we have should suffice.

All my best,
Anshu

> On Jul 22, 2019, at 10:06 AM, Lorna Evans via Unicode  
> wrote:
> 
> Also: https://scriptsource.org/scr/Qabl
> 
> 
>> On Mon, Jul 22, 2019, 12:47 PM Ken Whistler via Unicode 
>>  wrote:
>> See the entry for "Magar Akkha" on:
>> 
>> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>> 
>> Anshuman Pandey did preliminary research on this in 2011.
>> 
>> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>> 
>> It would be premature to assign an ISO 15924 script code, pending the 
>> research to determine whether this script should be separately encoded.
>> 
>> --Ken
>> 
>>> On 7/22/2019 9:16 AM, Philippe Verdy via Unicode wrote:
>>> According to Ethnologue, the Eastern Magar language (mgp) is written in two 
>>> scripts: Devanagari and "Akkha".
>>> 
>>> But the "Akkha" script does not seem to have any ISO 15924 code.
>>> 
>>> The Ethnologue currently assigns a private use code (Qabl) for this script.
>>> 
>>> Was the addition delayed due to lack of evidence (even if this language is 
>>> official in Nepal and India) ?
>>> 
>>> Did the editors of Ethnologue submit an addition request for that script 
>>> (e.g. for the code "Akkh" or "Akha" ?)
>>> 
>>> Or is it considered unified with another script that could explain why it 
>>> is not coded ? If this is a variant it could have its own code (like 
>>> Nastaliq in Arabic). Or maybe this is just a subset of another 
>>> (Sino-Tibetan) script ?
>>> 
>>> 
>>> 


Re: New website

2019-07-22 Thread Asmus Freytag via Unicode

  
  
On 7/22/2019 10:00 AM, Ken Whistler via Unicode wrote:

> Your helpful suggestions will be passed along to the people working on
> the new site.
>
> In the meantime, please note that the link to the "Unicode Technical
> Site" has been added to the left column of quick links in the page
> bottom banner, so it is easily available now from any page on the new
> site.

(If you ever get to anything other than the "vanity" characters -
not a given for some devices).
(Also, the "Projects" need to be their own item on the "left",
not hidden in "basics").

A./

> --Ken
>
> On 7/22/2019 9:54 AM, Zachary Carpenter wrote:
>
>> It seems that many of the concerns expressed here could be resolved
>> with a menu link to the “Unicode Technical Site” on the left-hand
>> menu bar

  
  



  



Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Philippe Verdy via Unicode
Also we can note that "mgp" (Eastern Magari) is severely endangered
according to multiple sources, including Ethnologue and the Linguist List.
This is still not the case for Western Magari (spoken mostly in Nepal, not
in Sikkim, India), where evidence is probably easier to find (and where the
encoding of a new script and its disunification from Brahmi may then be
more easily justified by modern use, and probably unified with the
remaining use for Eastern Magari).


On Mon, Jul 22, 2019 at 19:33, Philippe Verdy wrote:

>
>
> On Mon, Jul 22, 2019 at 18:43, Ken Whistler wrote:
>
>> See the entry for "Magar Akkha" on:
>>
>> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>>
>> Anshuman Pandey did preliminary research on this in 2011.
>>
>
> That's what I said: 8 years ago already.
>
>
>> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>>
>> It would be premature to assign an ISO 15924 script code, pending the
>> research to determine whether this script should be separately encoded.
>>
> And before that, does it mean that texts have to use the "Brah" code for
> early classification if they are tentatively encoded with Brahmi (and
> tagged as "mgp-Brah", which should limit the impact, because there's no
> other evidence that "mgp", the modern language, is related directly to the
> old Brahmi script, which dates from before "mgp" even existed)?
>


Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Philippe Verdy via Unicode
On Mon, Jul 22, 2019 at 18:43, Ken Whistler wrote:

> See the entry for "Magar Akkha" on:
>
> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>
> Anshuman Pandey did preliminary research on this in 2011.
>

That's what I said: 8 years ago already.


> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>
> It would be premature to assign an ISO 15924 script code, pending the
> research to determine whether this script should be separately encoded.
>
And before that, does it mean that texts have to use the "Brah" code for
early classification if they are tentatively encoded with Brahmi (and
tagged as "mgp-Brah", which should limit the impact, because there's no
other evidence that "mgp", the modern language, is related directly to the
old Brahmi script, which dates from before "mgp" even existed)?


Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Lorna Evans via Unicode
Also: https://scriptsource.org/scr/Qabl


On Mon, Jul 22, 2019, 12:47 PM Ken Whistler via Unicode 
wrote:

> See the entry for "Magar Akkha" on:
>
> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>
> Anshuman Pandey did preliminary research on this in 2011.
>
> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>
> It would be premature to assign an ISO 15924 script code, pending the
> research to determine whether this script should be separately encoded.
>
> --Ken
> On 7/22/2019 9:16 AM, Philippe Verdy via Unicode wrote:
>
> According to Ethnologue, the Eastern Magar language (mgp) is written in two
> scripts: Devanagari and "Akkha".
>
> But the "Akkha" script does not seem to have any ISO 15924 code.
>
> The Ethnologue currently assigns a private use code (Qabl) for this script.
>
> Was the addition delayed due to lack of evidence (even if this language is
> official in Nepal and India) ?
>
> Did the editors of Ethnologue submit an addition request for that script
> (e.g. for the code "Akkh" or "Akha" ?)
>
> Or is it considered unified with another script that could explain why it
> is not coded ? If this is a variant it could have its own code (like
> Nastaliq in Arabic). Or maybe this is just a subset of another
> (Sino-Tibetan) script ?
>
>
>
>


Re: New website

2019-07-22 Thread Ken Whistler via Unicode
Your helpful suggestions will be passed along to the people working on 
the new site.


In the meantime, please note that the link to the "Unicode Technical 
Site" has been added to the left column of quick links in the page 
bottom banner, so it is easily available now from any page on the new site.


--Ken

On 7/22/2019 9:54 AM, Zachary Carpenter wrote:
It seems that many of the concerns expressed here could be resolved 
with a menu link to the “Unicode Technical Site” on the left-hand menu bar


Re: Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Ken Whistler via Unicode

See the entry for "Magar Akkha" on:

http://linguistics.berkeley.edu/sei/scripts-not-encoded.html

Anshuman Pandey did preliminary research on this in 2011.

http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf

It would be premature to assign an ISO 15924 script code, pending the 
research to determine whether this script should be separately encoded.


--Ken

On 7/22/2019 9:16 AM, Philippe Verdy via Unicode wrote:
According to Ethnologue, the Eastern Magar language (mgp) is written in 
two scripts: Devanagari and "Akkha".


But the "Akkha" script does not seem to have any ISO 15924 code.

The Ethnologue currently assigns a private use code (Qabl) for this 
script.


Was the addition delayed due to lack of evidence (even if this 
language is official in Nepal and India) ?


Did the editors of Ethnologue submit an addition request for that 
script (e.g. for the code "Akkh" or "Akha" ?)


Or is it considered unified with another script that could explain why 
it is not coded ? If this is a variant it could have its own code 
(like Nastaliq in Arabic). Or maybe this is just a subset of another 
(Sino-Tibetan) script ?






Re: Displaying Lines of Text as Line-Broken by a Human

2019-07-22 Thread Richard Wordingham via Unicode
On Sun, 21 Jul 2019 20:53:19 -0700
Asmus Freytag via Unicode  wrote:

> There's really no inherent need for many spacing combining marks to
> have a base character. At least the ones that do not reorder and that
> don't overhang the base character's glyph.

We are in agreement here.

> As far as I can tell, it's largely a convention that originally
> helped identify clusters and other lack of break opportunities. But
> now that we have separate properties for segmentation, it's not
> strictly necessary to overload the combining property for that
> purpose.

Which relates to the separate question I asked about breaking at
grapheme boundaries.  Interestingly, I'm not seeing breaks next to an
invisible stacker, but that may be because Pali subscript consonants
only slightly increase the width of the cluster.

The need for a base makes sense for reordering spacing marks, but should
be to detect editing errors, not deliberate effects.  An unreordered
reordering mark plus consonant is visually ambiguous with consonant plus
reordering mark.

> In your example, why do you need the ZWJ and dotted circle?

The user- and application-supplied text would be
.

> Originally, just applying a combining mark to a NBSP should normally
> show the mark by itself. If a font insists on inserting a dotted
> circle glyph, that's not required from a conformance perspective -
> just something that's seen as helpful (to most users).

It's not the font that inserts the dotted circle, it's the rendering
engine.  That's why the USE set Tai Tham rendering back several
years.  Now, there is at least one renderer (HarfBuzz) for which a
cunning font can work out whether the renderer has introduced the
dotted circle glyph rather than it being in the text to be rendered.  I
am looking for a general font-level solution to the problem that would
even work on Windows 10.

The ZWJ seems a reasonable hint that the space should be rendered with
zero width.  Do you think it is reasonable for  to
have zero width contribution from the NBSP when the spacing mark has a
non-overhanging glyph? It seems to be an unstandardised area, but zero
width might be considered to violate the character identity of NBSP.

I also have the problem of visually line-final U+1A6E TAI THAM VOWEL
SIGN E, which needs to be separated from a preceding consonant in the
backing store.  It seems to be particularly common before the holes
(two per page) for the string that holds the pages together.   Perhaps
the scribe tried to avoid line-final U+1A6E.

There are examples of these issues in Figure 9b of
http://www.unicode.org/L2/L2007/07007r-n3207r-lanna.pdf .  The last
syllable of _cattāro_ 'four' straddles lines 2 and 3, with its first
glyph (corresponding to SIGN E) ending line 2, and 
starting line 3.

The antepenultimate syllable of _sammodamānehi_ (misspelt
_samoddamānehi_) 'pleasing' is split between lines 7 and 8, with line 7
ending in MA and line 8 starting in SIGN AA.

I am looking for advice on what is the least bad readily achievable
solution. I can then adapt that to cope with the messier issue of the
non-spacing character U+1A58 TAI THAM SIGN MAI KANG LAI, which acts
like Burmese kinzi in the Pali text I am working on.  (If one does not
know the font well, one should not put a line break next to it unless
all other options are exhausted.)  Figure 9b also has an example of this
issue.  The initial consonant of saṅkhepaṃ (misspelt saṅkheppaṃ)
'collection, summary' is on line 9, while the rest of the word,
starting , is on line 10. 
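
For readers who want to check the properties of the characters discussed
above, here is a minimal Python sketch using the standard unicodedata
module. It reports the name, general category and canonical combining
class of the two Tai Tham marks, NBSP, ZWJ and the dotted circle. The
helper name describe is hypothetical, and the values reported depend on
the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Characters mentioned in this message, by code point.
CHARS = [
    "\u1A6E",  # TAI THAM VOWEL SIGN E (reordering spacing mark)
    "\u1A58",  # TAI THAM SIGN MAI KANG LAI (non-spacing mark)
    "\u00A0",  # NO-BREAK SPACE, a possible base for an isolated mark
    "\u200D",  # ZERO WIDTH JOINER
    "\u25CC",  # DOTTED CIRCLE, the glyph some renderers insert
]


def describe(ch: str) -> tuple:
    """Return (code point, name, general category, combining class)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


if __name__ == "__main__":
    for ch in CHARS:
        print(describe(ch))
```

This only inspects character properties; it says nothing about how a
particular rendering engine or font will treat a mark that has been
separated from its base.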

There is a weird hack that currently helps with LibreOffice - inserting
CGJ turns off some parts of Indic shaping in the rest of the run.  Or
have I missed some new specification of Indic encoding?  This helps
with visually line-final SIGN E.

Richard.



Akkha script (used by Eastern Magar language) in ISO 15924?

2019-07-22 Thread Philippe Verdy via Unicode
According to Ethnologue, the Eastern Magar language (mgp) is written in two
scripts: Devanagari and "Akkha".

But the "Akkha" script does not seem to have any ISO 15924 code.

The Ethnologue currently assigns a private-use code (Qabl) for this script.

Was the addition delayed due to lack of evidence (even if this language is
official in Nepal and India)?

Did the editors of Ethnologue submit an addition request for that script
(e.g. for the code "Akkh" or "Akha")?

Or is it considered unified with another script, which could explain why it
is not coded? If this is a variant it could have its own code (like
Nastaliq in Arabic). Or maybe this is just a subset of another
(Sino-Tibetan) script?