Go romanize! Re: Counting Devanagari Aksharas

Naena Guru via Unicode Mon, 24 Apr 2017 08:30:30 -0700

Quote by Richard:
Unless this implies a spelling reform for many languages, I'd like to
see how this works for the Tai Tham script.  I'm not happy with the
Romanisation I use to work round hostile rendering engines.  (My
scheme is only documented in variable hack_ss02 in the last script
blocks of http://wrdingam.co.uk/lanna/denderer_test.htm.)  For example,
there are several different ways of writing what one might naively
record as "ontarAy".


MY RESPONSE:
Richard, I stuck to the two specifications (Unicode and Font) and Sanskrit 
grammar. The akSara has two aspects: its sound (zabða, phoneme) and its shape 
(letter, ruupa). Reduce the writing system to its consonants, vowels etc. 
(zabða) and assign SBCS letters/codes to them (ruupa). SBCS provides the best 
technical facilities for any language. (This is why more than 130 languages 
now romanize despite Unicode.) Use English letters for similar sounds in the 
native speech. Now, treat all combinations as ligatures. For example, the 'po' 
sound in Indic has the p consonant with a sign ahead plus a sign after. For the 
font, there is no difference between the way it makes the combination 'ä', 
which has a sign above, and the Indic one, which has a sign on either side. 
Recall that long ago, 
Unicode stopped defining fixed ligatures and asked the font makers to define 
them in the PUA.
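
As a rough illustration of "treat all combinations as ligatures", here is a
minimal Python sketch of greedy longest-match substitution, which is loosely
how a font's ligature lookup behaves. The table entries and the glyph names
(kombuva, pa, aela-pilla, al-lakuna) are assumptions chosen for illustration;
they are not taken from any actual font.

```python
# Hypothetical ligature table: romanized key sequence -> glyph names.
# The entries are illustrative only, not the contents of a real font.
LIGATURES = {
    "po": ["kombuva", "pa", "aela-pilla"],  # sign ahead + base + sign after
    "pa": ["pa"],                           # base letter alone
    "p": ["pa", "al-lakuna"],               # bare consonant with vowel killer
}


def shape(text, table=LIGATURES):
    """Replace the longest matching key sequence at each position."""
    longest = max(len(key) for key in table)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                glyphs.extend(table[chunk])
                i += n
                break
        else:
            # No entry matched: pass the character through unchanged.
            glyphs.append(text[i])
            i += 1
    return glyphs
```

In this sketch the two-sided 'po' is just one more table entry, handled by
the same lookup as any other sequence.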

Spelling and speech:
There is indeed a confusion about writing and reading in Hindi, as I have 
observed. As in English and Tamil, words in Hindi tend to end with a consonant. 
So, there is this habit among Hindi speakers to drop the final vowel, mostly 
'a', from words that actually end with it. For example, the famous name 
Jayantha (miserable mine too, haha! = jayanþa as Romanized) is pronounced 
Jayanth by Hindi speakers. It is a Sanskrit word. Sanskrit and languages like 
Sinhala have vowel endings and are traditionally spoken as such.

The dictionary is a commercial invention. When Caxton brought lead types to 
England, the French-speaking, Latin-flaunting elites did not care about the 
poor natives. Earlier, invading Romans forced them to drop Fuþark and adopt the 
22-letter Latin alphabet. So, they improvised: they struck a line across d and 
made ð (Eth), added a sign to 'a' and made æ (Ash), and continued using Thorn 
(þ) by rounding the loop. Lead type printing hit English for the second time, 
ruining it as spelling standardization began. Dictionaries sold. THE POWERFUL 
CAN RUIN PEOPLE'S PROPERTY TO MAKE MONEY, BECAUSE THEY CAN. Unicode 
enthusiasts, take heed!

Looking at the word you gave, ontarAy, it looks to me like an Anglicized form. 
If I am to make a guess, its ending is as in ontarAyi. Is it said something 
like own-the-raa-yi? (danger?) If I am right, this is a good example of the 
decline of a writing system owing to bad, uncaring application of technology. 
We are in the Digital Age, and we need not compromise any more. In fact, we can 
fix the errors and decadence introduced by past technologies.


RICHARD:
That sounds like a letter-assembly system.

MY RESPONSE:
Nothing assembled there, my friend.



On 4/24/2017 12:38 PM, Richard Wordingham via Unicode wrote:

On Mon, 24 Apr 2017 00:36:26 +0530
Naena Guru via Unicode <[email protected]> wrote:

The Unicode approach to Sanskrit and all Indic is flawed. Indic
scripts should not be treated as letter-assembly systems.

Sanskrit vyaakaraNa (grammar) explains the phonemes as the atoms of
the speech. Each writing system then assigns a shape to the
phonetically precise phoneme.

The most technically and grammatically proper solution for Indic is
first to ROMANIZE the group of writing systems at the level of
phonemes. That is, assign romanized shapes to vowels, consonants,
prenasals, post-vowel phonemes (anusvara and visarjaniiya with its
allophones) etc. This approach is similar to how European languages
picked up Latin, improvised the script and even use the Simples and
Capitals repertoire. Romanizing immediately makes typing easier and
eliminates the sometimes embarrassing ambiguity of Anglicizing -- you
type phonetically on key layouts close to QWERTY. (Only four
positions are different in the Romanized Sinhala layout.)

If we drop the capitalizing rules and utilize caps to indicate the
'other' forms of a common letter, we get an intuitively typed system
for each language, and readable too. When this is done carefully,
comparing phoneme sets of the languages, we can reach a common set of
Latin-derived SINGLE-BYTE letters completely covering all phonemes of
all Indic.
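
To make the idea concrete, here is a minimal Python sketch of a
case-sensitive phoneme inventory and a tokenizer over it. The inventory
below is hypothetical (doubled letters for long vowels, capitals for the
'other' form of a consonant, as in the spelling vyaakaraNa); the actual
letter assignments of the scheme are not given in this message.

```python
# Hypothetical phoneme inventory; every unit is plain single-byte Latin text.
VOWELS = ["aa", "ii", "uu", "a", "i", "u", "e", "o"]
CONSONANTS = ["k", "g", "c", "j", "t", "T", "d", "D", "n", "N",
              "p", "b", "m", "y", "r", "l", "v", "s", "h"]

# Longest units first, so that "aa" wins over "a".
UNITS = sorted(VOWELS + CONSONANTS, key=len, reverse=True)


def tokenize(word):
    """Split a romanized word into phoneme units from the inventory."""
    phonemes = []
    i = 0
    while i < len(word):
        unit = next((u for u in UNITS if word.startswith(u, i)), None)
        if unit is None:
            raise ValueError(f"no phoneme for {word[i]!r} at position {i}")
        phonemes.append(unit)
        i += len(unit)
    return phonemes


print(tokenize("vyaakaraNa"))
# ['v', 'y', 'aa', 'k', 'a', 'r', 'a', 'N', 'a']
```

Because the matching is case-sensitive, 'n' and 'N' stay distinct phonemes
while each still occupies one byte.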

Unless this implies a spelling reform for many languages, I'd like to
see how this works for the Tai Tham script.  I'm not happy with the
Romanisation I use to work round hostile rendering engines.  (My
scheme is only documented in variable hack_ss02 in the last script
blocks of http://wrdingam.co.uk/lanna/denderer_test.htm.)  For example,
there are several different ways of writing what one might naively
record as "ontarAy".

Next, each native script can be obtained by making orthographic smart
fonts that display the SBCS codes in the respective shapes of the
native scripts.

That sounds like a letter-assembly system.

So how does your scheme help one split words into orthographic
syllables?
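
For reference, one plausible way to split such romanized text into
orthographic syllables is sketched below in Python. It assumes a
hypothetical scheme in which the lowercase letters a, e, i, o, u are the
only vowels and every other letter is a consonant; an akshara is then a run
of consonants followed by a vowel, and a trailing consonant run stands on
its own. Post-vowel marks such as anusvara and visarga are not handled, and
nothing here is taken from the scheme under discussion.

```python
import re

VOWELS = "aeiou"  # assumption: only these letters are vowels

# Zero or more consonant letters plus a vowel run, or a bare consonant run.
AKSHARA = re.compile(rf"[^{VOWELS}\W\d_]*[{VOWELS}]+|[^{VOWELS}\W\d_]+")


def split_aksharas(word):
    """Return the orthographic syllables of a romanized word."""
    return AKSHARA.findall(word)


print(split_aksharas("vyaakaraNa"))  # ['vyaa', 'ka', 'ra', 'Na']
print(split_aksharas("jayant"))      # ['ja', 'ya', 'nt']
```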

I have successfully romanized Sinhala and revived the full repertoire
of Sinhala + Sanskrit orthography, losing nothing. Sinhala script is
perhaps the most complex of all Indic because it is used to write
both Sanskrit and Pali.

What complication does Pali impose on top of Sanskrit?  As far as I'm
aware, it just needs one extra letter, usually called LLA, which you
will already have if 'Sanskrit' includes Vedic Sanskrit.

See this: http://ahangama.com/ (It's all SBCS underneath).
Test here: http://ahangama.com/edit.htm

All I get for these are blank pages.  Perhaps there's an unreported
communication failure in the network.

Richard.
