On Wednesday 22 June 2005 09:08, Thomas Milo wrote:
> What's the objection? It would be just as transparent as your solution.
> Anyway, I like your approach. If it is to find any acceptance, there needs
> to be canonical equivalence with legacy encoding according to this formula:
>
> TANWEEN = <vowel><small noon>
>         = conventional tanween
> TAMWEEM = <vowel><small meem>
> IDGHAM  = <vowel><idgham code>
>
> Note that this is different - and better - than Meor's and my earlier
> suggestion to retain full tanween followed by a modulation mark.
>
I like this idea. However, as you've probably guessed by now, I don't
like mixing up what is pure text, in the sense that it changes the
meaning of the words, with what indicates pronunciation. Therefore I
would modify this so that IDGHAAM, IKHFAA and IQLAAB (TAMWEEM) are
indicated by subsequent codepoints:

TAMWEEM/IQLAAB = <vowel><small nuun><iqlaab>   (was using small meem)
IDGHAAM        = <vowel><small nuun><idghaam>  (was using shadda on subsequent letter)
IKHFAA         = <vowel><small nuun><ikhfaa>   (was sequential blahblah)

and, although it is arguably redundant, I would add

IDHHAAR        = <vowel><small nuun><idhhaar>

Likewise I would change the nuun with iqlaab, ikhfaa etc. from

NUUN + IQLAAB  = <nuun><small meem>

to

NUUN + IQLAAB  = <nuun><sukuun><iqlaab>

and so on for the others.

This has great benefits for searching: the tajweed codes can be treated
as whitespace, and all vowels and sukuuns are easily identified.

I would also change the name from <small nuun> to <tanween>, so as to
move away from glyph-based naming conventions. Adopting good labels
helps clearer thinking.

wassalaam
abdulahq
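
P.S. To make the searching point concrete, here is a rough Python sketch
of what "treat the tajweed codes as whitespace" could look like. It is
only an illustration under stated assumptions: none of the tajweed codes
or the <tanween> mark have assigned codepoints in this proposal, so the
Private Use Area values below are hypothetical placeholders, and the
function names are made up for the example. Only fatha, sukuun, nuun and
baa are real Unicode characters.

```python
# Hypothetical placeholder codepoints (Private Use Area). The proposal
# does not assign real values; these exist only for illustration.
TANWEEN = "\uE000"   # <tanween>, the proposed rename of <small nuun>
IQLAAB = "\uE001"    # <iqlaab>
IDGHAAM = "\uE002"   # <idghaam>
IKHFAA = "\uE003"    # <ikhfaa>
IDHHAAR = "\uE004"   # <idhhaar>

# Pronunciation-only marks that a search should skip over.
TAJWEED_CODES = {IQLAAB, IDGHAAM, IKHFAA, IDHHAAR}

# Real Unicode characters used in the examples.
FATHA = "\u064E"     # ARABIC FATHA
SUKUUN = "\u0652"    # ARABIC SUKUN
NUUN = "\u0646"      # ARABIC LETTER NOON
BAA = "\u0628"       # ARABIC LETTER BEH


def strip_tajweed(text: str) -> str:
    """Drop the tajweed codes, leaving letters, vowels and sukuuns intact."""
    return "".join(ch for ch in text if ch not in TAJWEED_CODES)


def tajweed_insensitive_find(haystack: str, needle: str) -> int:
    """Find needle in haystack, ignoring tajweed codes on both sides.

    The returned index refers to the stripped haystack, not the original.
    """
    return strip_tajweed(haystack).find(strip_tajweed(needle))


# <vowel><tanween><iqlaab> reduces to plain <vowel><tanween>.
with_iqlaab = BAA + FATHA + TANWEEN + IQLAAB
plain_tanween = BAA + FATHA + TANWEEN

# <nuun><sukuun><iqlaab> followed by baa still matches <nuun><sukuun><baa>.
nuun_iqlaab_baa = NUUN + SUKUUN + IQLAAB + BAA
```

Because the vowel and the sukuun are always written explicitly, the same
stripped form results whichever tajweed code follows them, which is what
makes the plain-text search work.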

