Tom,

Since I never really studied the grammar of any language in depth, it is very difficult for me to follow your discussion. I think I sort of get it, but I am not too sure.
Anyway, let me try to understand you.

> >> 1. this funny alif-in-dotless-yeh-clothing (Quranic and contemporary);
> >> 2. a dotless-yeh *form* that has no meaning and is used solely as a seat
> >>    of hamza/small alef/etc. (Quranic and contemporary)
> >> 3. a true yeh that sometimes loses its spots (Quranic and occasionally
> >>    contemporary);
> >> 4. a true yeh that always keeps its dots (contemporary usage)
>
> Nrs 1 and 2 are one and the same grapheme. (Unicode character
> Yeh_Hamza-Above is a of course a non-existent ligature in this approach).
> Nrs 3 and 4 flavours are of the same grapheme.
>
> >> This is why I think the best approach would be to encode all four of
> >> these cases with the same yeh codepoint.
>
> I fully agree. 1/2 and 3/4 are all a single grapheme. They look like ducks,
> the quack like a ducks, they are b-y ducks. And like the hamza, the twindots
> are also to be encoded as a separate character

So, you are suggesting that everything be encoded with a single code point, probably something like U+0649, and that the two dots, the hamza and the small alefs then be encoded separately, right?

Could you explain what the benefit of this approach is? My first impression is that it will make things more difficult, especially for searching. For example, if I were to search for a normal yeh, I would need to include the dots in my search for the initial and medial forms. But for the standalone and final forms that would not be necessary, right?

Regards.
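
P.S. To make the searching concern a bit more concrete, here is a rough Python sketch of what I imagine a search would have to do under such a scheme. It is only an illustration under assumptions: U+0649 is my guess at the single base code point, and TWIN_DOTS is a purely hypothetical stand-in (a private-use code point) for the separately encoded two dots, since I do not know what code point the proposal would actually use. The function name is mine as well.

```python
import re

# Assumed base letter: U+0649 (ARABIC LETTER ALEF MAKSURA, the dotless yeh shape).
DOTLESS_YEH = "\u0649"
# Hypothetical: a private-use code point standing in for a separately
# encoded "two dots" mark. This is NOT a real Unicode assignment.
TWIN_DOTS = "\uE000"

# A yeh with or without its dots: base letter, optionally followed by the mark.
YEH_ANY = re.compile(DOTLESS_YEH + TWIN_DOTS + "?")
# Only a yeh that carries its dots: base letter followed by the mark.
YEH_DOTTED = re.compile(DOTLESS_YEH + TWIN_DOTS)


def count_yeh(text, dotted_only=False):
    """Count yeh occurrences in text under the assumed decomposed encoding."""
    pattern = YEH_DOTTED if dotted_only else YEH_ANY
    return len(pattern.findall(text))


# Two made-up words: one with a dotted yeh in the middle, one ending in a dotless yeh.
sample = "\u0628" + DOTLESS_YEH + TWIN_DOTS + "\u062A" + " " + "\u0641" + DOTLESS_YEH
print(count_yeh(sample))                    # every yeh, dotted or not
print(count_yeh(sample, dotted_only=True))  # only the dotted ones
```

So the search pattern has to know about the optional mark in every case, instead of simply comparing one code point. That is the extra complexity I am worried about.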

