Re: Akkha script (used by Eastern Magar language) in ISO 15924?
> On Jul 23, 2019, at 12:26 AM, Richard Wordingham via Unicode wrote:
>
> On Mon, 22 Jul 2019 17:42:37 -0700
> Anshuman Pandey via Unicode wrote:
>
>> As I pointed out in L2/11-144, the “Magar Akkha” script is an
>> appropriation of Brahmi, renamed to link it to the primordialist
>> daydreams of an ethno-linguistic community in Nepal. I have never
>> seen actual usage of the script by Magars. If things have changed
>> since 2011, I would very much welcome such information. Otherwise,
>> the so-called “Magar Akkha” is not suitable for encoding. The Brahmi
>> encoding that we have should suffice.
>
> How would mere usage qualify it as a separate script?

I apologize for using the wrong conjunction. Instead of “otherwise” I should have written “nevertheless”.

All my best,
Anshu
Re: Akkha script (used by Eastern Magar language) in ISO 15924?
As I pointed out in L2/11-144, the “Magar Akkha” script is an appropriation of Brahmi, renamed to link it to the primordialist daydreams of an ethno-linguistic community in Nepal. I have never seen actual usage of the script by Magars. If things have changed since 2011, I would very much welcome such information. Otherwise, the so-called “Magar Akkha” is not suitable for encoding. The Brahmi encoding that we have should suffice.

All my best,
Anshu

> On Jul 22, 2019, at 10:06 AM, Lorna Evans via Unicode wrote:
>
> Also: https://scriptsource.org/scr/Qabl
>
>> On Mon, Jul 22, 2019, 12:47 PM Ken Whistler via Unicode wrote:
>> See the entry for "Magar Akkha" on:
>>
>> http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
>>
>> Anshuman Pandey did preliminary research on this in 2011.
>>
>> http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
>>
>> It would be premature to assign an ISO 15924 script code, pending the
>> research to determine whether this script should be separately encoded.
>>
>> --Ken
>>
>>> On 7/22/2019 9:16 AM, Philippe Verdy via Unicode wrote:
>>> According to Ethnolog, the Eastern Magar language (mgp) is written in two
>>> scripts: Devanagari and "Akkha".
>>>
>>> But the "Akkha" script does not seem to have any ISO 15924 code.
>>>
>>> The Ethnologue currently assigns a private use code (Qabl) for this script.
>>>
>>> Was the addition delayed due to lack of evidence (even if this language is
>>> official in Nepal and India) ?
>>>
>>> Did the editors of Ethnologue submit an addition request for that script
>>> (e.g. for the code "Akkh" or "Akha" ?)
>>>
>>> Or is it considered unified with another script that could explain why it
>>> is not coded ? If this is a variant it could have its own code (like
>>> Nastaliq in Arabic). Or may be this is just a subset of another
>>> (Sino-Tibetan) script ?
Fwd: L2/18-181
> On May 16, 2018, at 3:46 PM, Doug Ewell via Unicode wrote:
>
> http://www.unicode.org/L2/L2018/18181-n4947-assamese.pdf
>
> This is a fascinating proposal to disunify the Assamese script from
> Bengali on the following bases:

‘Fascinating’ is not a term I’d use for this proposal. If folks are interested in a valid proposal for disunification of Bengali, please look at the proposal for Tirhuta.

> 1. The identity of Assamese as a script distinct from Bengali is in
> jeopardy.

This is not a technical matter. Moreover, it’s typical rhetoric used by various language communities in South Asia. Fairly standard fare for those familiar with such issues. The proposal needs to show how the two scripts differ, i.e. in conjuncts, CV ligatures, etc. The number forms are similar to those already encoded. Again, cf. Tirhuta.

> 2. Collation is different between the Assamese and Bengali languages,
> and code point order should reflect collation order.

The same issue applies to the dictionary orders for Hindi and Marathi, which differ from the conventional Sanskrit order for Devanagari. Orthographies for various languages put conjuncts and other things at the end, which are not considered atomic letters. Nothing special in this regard for Assamese and Bengali.

> 3. Keyboard design is more difficult because consonants like ক্ষ
> are encoded as conjunct forms instead of atomic characters.

Ignorant question on my part: is it difficult to use character sequences as labels for keys? I see keys for both क्ष and ज्ञ on the iOS Hindi keyboard, and त्र is tucked away under त.

> 4. The use of a single encoded script to write two languages forces
> users to use language identifiers to identify the language.

The same applies to each of the 40+ varieties of Hindi, as well as Marathi, etc. Another ignorant question: how does one identify the various languages that use Arabic and Cyrillic?

> 5. Transliteration of Assamese into a different script is problematic
> because letters have different phonological value in Assamese and
> Bengali.

Transliteration or transcription? In any case, this applies to other languages written using similar scripts: a Marathi speaker pronounces ज and ऋ differently than a Hindi speaker does.

> It will be interesting to see where this proposal goes.

Hopefully, it does not go too far. What it proposes is contrary to Unicode and redundant.

> Given that all
> or most of these issues can be claimed for English, French, German,
> Spanish, and hundreds of other languages written in the Latin script, if
> the Assamese proposal is approved we can expect similar disunification
> of the Latin script into language-specific alphabets in the future.

Fascinating. I mean, terrible.

All my best,
Anshuman
Re: 0027, 02BC, 2019, or a new character?
> On Feb 20, 2018, at 9:49 PM, James Kass via Unicode wrote:
>
> Michael Everson wrote:
>
>> Orthographic harmonization between these languages can ONLY help any
>> speaker of one to access information in any of the others. That expands
>> people’s worlds. That would be a good goal.
>
> Wouldn't dream of arguing with that. Expanding people's worlds is why
> many of us have supported Unicode.

Agreed!

> The good news is that the thread title question is moot.

Yes, now let’s please return to discussing emoji.

All my best,
Anshu
End of discussion, please — Re: Why so much emoji nonsense?
> On Feb 15, 2018, at 10:58 PM, Pierpaolo Bernardi via Unicode wrote:
>
> On Fri, Feb 16, 2018 at 4:26 AM, James Kass via Unicode wrote:
>
>> The best time to argue against the addition of emoji to Unicode would be
>> 2007 or 2008, but you'd be wasting your time travel. Trust me.
>
> But it's always a good time to argue against the addition of more
> nonsense to what we already have got.

I think it’s a good time to end this conversation. Whether ‘nonsense’ or not, emoji are here and they’re in Unicode. This conversation has itself become nonsense, d’y’all agree?

The amount of time that people have spent on this discussion could’ve been directed towards work on any one of the unencoded scripts listed at:

http://www.linguistics.berkeley.edu/sei/scripts-not-encoded.html

As many have noted during this discussion, the emoji “ship has already sailed”. I’d’ve jumped aboard sooner, but this metaphor is now also quite tired.

All my best,
Anshu
Re: Emoji for major planets at least?
Proposals for planet emoji were submitted in April 2017:

https://www.unicode.org/L2/L2017/17100-planet-emoji-seq.pdf
http://www.unicode.org/L2/L2017/17100r-planet-emoji-seq.pdf

I’m not sure what the result was.

Anshu

> On Jan 18, 2018, at 12:46 PM, Asmus Freytag (c) via Unicode wrote:
>
>> On 1/18/2018 10:01 AM, John H. Jenkins wrote:
>> Well, you can go with Venus = white planet, Mercury = grey planet, Uranus =
>> greenish planet, Neptune = bluish planet, Jupiter = striped planet.
>>
>> As you say, though, without a context, none of them convey much and Venus,
>> at least, would just be a circle.
>>
>> Plus there's the question of the context in which someone would want to send
>> little pictures of the planets. This sounds like it would be adding emoji
>> just because.
>
> "Earth" as in "a blue ball in space" is something that reached iconic status
> after the famous photo taken during the early Apollo missions. I could
> definitely see that used in a variety of possible contexts. And the
> recognition value is higher than for many recent emoji.
>
> Saturn, with its rings (even though it's no longer the only one known with
> rings) also is iconic and highly recognizable. I lack imagination as to when
> someone would want to use it in communication, but I have the same issue with
> quite a few recent emoji, some of which are far less iconic or recognizable.
> I think it does lend itself to describe a "non-earth" type planet, or even
> the generic idea of a planet (as opposed to a star/sun).
>
> Mars and Venus have tons of connotations, which could be expressed by using
> an emoji (as opposed to the astrological symbol for each), but only Mars is
> reasonably recognizable without lots of pre-established context. That red
> color.
>
> In a detailed enough rendering, Jupiter, as a shaded "ball" with stripes and
> red dot would more recognizable than any of the remaining planets (on par or
> better with many recent emoji), but I see even less scope for using it
> metaphorically or in extended contexts.
>
> If someone were to make a proposal, I would suggest to them to limit it to
> these four and to provide more of a suggestion as to how these might show up
> in use.
>
> A./
>>
>>> On Jan 18, 2018, at 10:44 AM, Asmus Freytag via Unicode wrote:
>>> On 1/18/2018 6:55 AM, Shriramana Sharma via Unicode wrote:
>>>> Hello people. We have sun, earth and moon emoji (3 for the earth and
>>>> more for the moon's phases). But we don't have emoji for the rest of
>>>> the planets. We have astrological symbols for all the planets and a
>>>> few non-existent imaginary "planets" as well. Given this, would it be
>>>> impractical to encode proper emoji characters for the rest of the
>>>> planets, at least the major ones whose physical characteristics are
>>>> well known and identifiable? I mean for example identifying Sedna and
>>>> Quaoar (https://en.wikipedia.org/wiki/File:EightTNOs.png) is probably
>>>> not going to be practical for all those other than astronomy buffs but
>>>> the physical shapes of the major planets are known to all high school
>>>> students…
>>> Earth = blue planet (with clouds)
>>>
>>> Mars = red planet
>>>
>>> Saturn = planet with rings
>>>
>>> I don't think any of the other ones are identifiable in a context-free
>>> setting, unless you draw a "big planet with red dot" for Jupiter.
>>>
>>> Earth would have to be depicted in a way that doesn't focus on
>>> "hemispheres", or you miss the idea of it as "planet".
>>>
>>> A./
The need for a basic register of emoji submissions
There is a need for a basic register of proposals that have been submitted to the Emoji Subcommittee.

Currently, emoji proposals are posted to the UTC register after they have been reviewed by the ESC as being actionable by the UTC. For proposals that make the cut, some time can pass between the date of submission and the date they are posted. For proposals that are deemed unsuitable, there is simply no public record. Consequently, there is no way to know whether a particular emoji has been proposed, either while a submitted proposal is being reviewed or after a proposal has been rejected.

The "Submitting Emoji Proposals" page at http://unicode.org/emoji/selection.html quixotically notifies the reader, using bold face, to "check the Emoji List to make sure your proposal is new": this list contains only emoji that have already been encoded.

This is a problem. There have been three instances where I have worked on emoji proposals only to later learn that they had already been proposed. And I learned that only because I check the UTC register frequently for my script encoding efforts. If there were a basic register of emoji submissions, I could have easily checked it and saved the hours I spent in drawing up documents.

The de facto rationale for not posting emoji proposals to the UTC register right away is that 'there are too many proposals that are unactionable or of insufficient quality'. But I think this rationale does not hold water. A basic task of a standards subcommittee is to maintain a list of artifacts that pertain to its function. For the ESC, these artifacts include all emoji submissions. And a list of these artifacts can easily be made available at http://unicode.org/emoji. That way, instead of being pointed to a list of already encoded emoji, prospective emoji proposal authors can be pointed to a list of emoji submissions. This basic register can be as simple as a list of names. If the ESC wishes to not post other details, that is fine. I am not asking for a Roadmap.

I see from the announcement made yesterday that the ESC now has (at least) four members. Congratulations to the new members, who I believe to be highly capable of maintaining a simple public list of emoji submissions in short order.

All my best,
Anshu
Re: Comparing Raw Values of the Age Property
I performed several operations on DerivedAge.txt a few months ago. One basic example here:

https://pandey.github.io/posts/unicode-growth-UCD-python.html

If you provide some more insight into your objective, I might be able to help. I would recommend against relying on the order of the data, and that you instead parse the individual entries to obtain the 'Age' property.

All my best,
Anshu

> On May 22, 2017, at 4:44 PM, Richard Wordingham via Unicode wrote:
>
> Given two raw values of the Age property, defined in UCD file
> DerivedAge.txt, how is a computer program supposed to compare them?
> Apart from special handling for the value "Unassigned" and its short
> alias "NA", one used to be able to compare short values against short
> values and long values against long values by simple string
> comparison. However, now we are coming to Version 10.0 of Unicode,
> this no longer works - "1.1" < "10.0" < "2.0".
>
> There are some possibilities - the values appear in order in
> PropertyValueAliases.txt and in DerivedAge.txt. However, I can find no
> relevant guarantees in UAX#44. I am looking for a solution that can be
> driven by the data files, rather than requiring human thought at every
> version release. Can one rely on the FULL STOP being the field
> divider, and can one rely on there never being any grouping characters
> in the short values? Again, I could find no guarantees.
>
> Richard.
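
The suggestion above (parse each entry and compare the Age values numerically instead of relying on file order or string comparison) might be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes FULL STOP separates the numeric fields of short values and LOW LINE those of long values such as "V10_0", which is exactly the guarantee the question says is missing from UAX #44. The function names and the SAMPLE lines are hypothetical; the sample lines merely imitate the layout of DerivedAge.txt.

```python
def parse_age(value):
    """Convert an Age value to a tuple of ints, or None for Unassigned.

    Accepts short values ("10.0") and long values ("V10_0").
    ASSUMPTION: "." (short) and "_" (long) separate the numeric fields.
    """
    value = value.strip()
    if value in ("NA", "Unassigned"):
        return None
    if value.startswith("V"):
        value = value[1:].replace("_", ".")
    return tuple(int(field) for field in value.split("."))


def age_key(value):
    """Sort key: real versions in numeric order, Unassigned last."""
    parsed = parse_age(value)
    return (1, ()) if parsed is None else (0, parsed)


def parse_derived_age_line(line):
    """Return (first, last, age) for one data line, or None for comments/blanks."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    code_points, age = (field.strip() for field in data.split(";"))
    first, _, last = code_points.partition("..")
    return int(first, 16), int(last or first, 16), age


# Illustrative lines in the layout of DerivedAge.txt (not the real file).
SAMPLE = [
    "# sample data",
    "0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
    "0860..086A    ; 10.0 #  [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA",
    "20AC          ; 2.1 #       EURO SIGN",
]

entries = [e for e in map(parse_derived_age_line, SAMPLE) if e is not None]
ages = sorted({age for _, _, age in entries}, key=age_key)
print(ages)
```

Comparing tuples of integers sidesteps the "1.1" < "10.0" < "2.0" problem, but the approach is only as reliable as the separator assumption it rests on.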
Re: Counting Devanagari Aksharas
> On Apr 20, 2017, at 8:19 PM, Richard Wordingham via Unicode wrote:
>
> On Thu, 20 Apr 2017 14:14:00 -0700
> Manish Goregaokar via Unicode wrote:
>
>> On Thu, Apr 20, 2017 at 12:14 PM, Richard Wordingham via Unicode
>> wrote:
>
>>> On Thu, 20 Apr 2017 11:17:05 -0700
>>> Manish Goregaokar via Unicode wrote:
>
>>>> I'm of the opinion that Unicode should start considering devanagari
>>>> (and possibly other indic) consonant clusters as single extended
>>>> grapheme clusters.
>
>>> You won't like it if cursor movement granularity is reduced to one
>>> extended grapheme cluster. I'm grateful that Emacs allows me to
>
>> I mean, we do the same for Hangul.
>
> Hangul is generally a maximum of three characters, which is about the
> border of tolerance. I find it irritating to have to completely retype
> Thai grapheme clusters of consonant, vowel and tone mark. There were
> loud protests from the Thais when preposed vowels were added to the
> Thai grapheme cluster and implementations then responded, and Unicode
> quickly removed them. Now imagine you're typing Vedic Sanskrit, with its
> clusters and pitch indicators.

I tried typing Vedic Sanskrit, and it seems to work:

http://pandey.pythonanywhere.com/devsyll

Haven't tried the orthographic oddity of the Nepali case in question. Above my pay grade.

If you access the above link on an iOS device you'll see tofu and missing characters. Apple's Devanagari font needs to be fixed.

- AP
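
For readers who want to experiment with the idea discussed above (treating a virama-joined consonant cluster plus its marks as one unit), here is a deliberately naive Python sketch. It is not the algorithm behind the linked tool and not the UAX #29 rules; it is an assumption-laden simplification that ignores ZWJ/ZWNJ and the Nepali case mentioned above, and covers only the main Devanagari consonant and independent-vowel ranges. All names are hypothetical.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def is_consonant(ch):
    # Main consonant range plus the nukta-composed consonants (simplified).
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_independent_vowel(ch):
    # Independent vowels in the main range plus vocalic RR and LL (simplified).
    return "\u0904" <= ch <= "\u0914" or ch in "\u0960\u0961"


def split_aksharas(text):
    """Split text into naive orthographic syllables.

    A consonant that directly follows a virama joins the current cluster;
    any other consonant or independent vowel starts a new one; combining
    marks (vowel signs, virama, Vedic accents) attach to the current
    cluster; every other character becomes a cluster of its own.
    """
    clusters = []
    for ch in text:
        if is_consonant(ch) and clusters and clusters[-1].endswith(VIRAMA):
            clusters[-1] += ch
        elif is_consonant(ch) or is_independent_vowel(ch):
            clusters.append(ch)
        elif clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# KA + VIRAMA + SSA, TA + VIRAMA + RA + VOWEL SIGN I, YA
word = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
aksharas = split_aksharas(word)
print(len(aksharas), aksharas)
```

With clusters like these as the editing unit, deleting one would remove the whole conjunct and its marks at once, which is the cursor-granularity concern raised in the quoted discussion.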