[farsiweb]Re: FARSI HEH WITH HAMZA
> From: Roozbeh Pournader [EMAIL PROTECTED]
> Date: Sat, 15 Jun 2002 03:07:21 +0430 (IRST)
>
> On Fri, 14 Jun 2002 [EMAIL PROTECTED] wrote:
> [...]
>> Mr. Pournader had mentioned that he would summarize the discussion points,
>
> Unfortunately I can't do that, since the discussions did not converge.

I am not sure what Roozbeh means by saying that the discussion did not converge. I think I can bring it to a conclusion quite easily, as follows.

This discussion began when I asked why the glyph U+06C0 was rejected from the Persian IT standard. Two general objections were raised to its use. The first was that hamzeh changes its shape in Farsi composition. The second was that U+06C0, as encoded in the Unicode standard, does not decompose correctly into its Farsi equivalents of heh + hamzeh above.

As regards the first objection, there are two points: (1) whether hamzeh changes its shape, and (2) whether that has any bearing on the discussion. I think I have successfully argued (1) that hamzeh does not change its shape, and (2) that even if it did, that would be irrelevant. Hamzeh is just a shape. It has no intrinsic value or significance of its own; it assumes whatever significance we choose to give it by convention. Whether or not it changes its shape is irrelevant to the question of implementing heh + hamzeh in the IT standard.

As regards the second objection, that U+06C0 is not compatible with Farsi usage: that is a valid argument against using this particular character in the Farsi standard, so it should not be used. That argument is settled.

The question, therefore, is not whether we should use U+06C0 in the Farsi standard, but whether it is desirable to have an independent glyph in the Unicode standard with a unique code point value (call it U+) which DOES correctly decompose into its Farsi equivalents of heh + hamzeh above.
I believe that the answer to this question is a definite yes, for exactly the same reasons that it is desirable to have alef + hamzeh above, or vav + hamzeh above, encoded as independent glyphs. Why are these shapes encoded in Unicode as independent glyphs with unique code point values? Why don't people just enter them as two separate characters on the keyboard? There are at least two good reasons, as I have discussed before: (1) the difficulty of representing them correctly in font systems, and (2) they are used so frequently that it is more efficient and economical to treat them as single characters which can be entered with one keystroke rather than two.

Exactly the same argument holds for heh + hamzeh above. This glyph is used in Farsi so frequently (much more frequently than alef + hamzeh or vav + hamzeh, for example) that it should be treated as a single character which can be entered with a single keystroke rather than two, and the same difficulty of representation in font systems applies to it. It is therefore equally desirable that it should be recognised in the Unicode charts as an independent glyph with its own unique code value U+.

The question therefore boils down to whether it is practical, and technically feasible, to encode such a glyph in the Unicode charts. I do not claim to be an expert on the Unicode standard, but as far as I understand the subject there is no obstacle, so it should be implemented.

As far as I know, the committee who drafted the Persian IT standard have never attempted to have it implemented in the Unicode standard. The question is why they have not attempted it, and whether they have any objections to it. If they simply forgot at the time, or if it did not occur to them, it is not too late to do so now.
If they have some other objection to its implementation, I would like to know what that objection is. The recommendation in the Persian IT standard that the glyph can be entered with two keystrokes may be the easiest solution, but it is neither the most logical, nor the most sensible, nor the most professional solution.

CONCLUSION -- I will here briefly summarise the main points of the discussion as follows:

1. It is desirable that the glyph heh + hamzeh above should be recognised in the Unicode standard as a single shape with a unique code point value, so that it can be treated as a single character both in fonts and on the Farsi keyboard.

2. The Unicode glyph U+06C0 is not suitable for that purpose because its decomposition in the Unicode standard is not compatible with Farsi usage; therefore it cannot be used.

3. It is therefore desirable that the drafters of the Persian IT standard recommend to the Unicode Consortium the implementation of a unique glyph in the Unicode charts which does decompose correctly into its Farsi equivalents of heh + hamzeh above.

4. There
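The decomposition objection in point 2 can be checked against the Unicode Character Database. The following is only a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the helper names `canonical_decomposition` and `composes_under_nfc` are made up for this illustration, and the exact data depends on the Unicode version bundled with the interpreter. It compares U+06C0 with alef + hamzeh above (U+0623) and vav + hamzeh above (U+0624), and then asks whether heh (U+0647) followed by combining hamza above (U+0654) is folded into a single code point by NFC normalization.

```python
import unicodedata


def canonical_decomposition(ch):
    """Return the canonical decomposition of one character as U+XXXX strings."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings start with a tag such as "<isolated>"; skip those.
    if not fields or fields[0].startswith("<"):
        return []
    return ["U+" + f for f in fields]


def composes_under_nfc(sequence):
    """True if NFC normalization folds the sequence into a single code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1


ALEF, VAV, HEH, AE = "\u0627", "\u0648", "\u0647", "\u06D5"
HAMZA_ABOVE = "\u0654"  # combining hamza above

if __name__ == "__main__":
    # Precomposed characters and what they decompose into.
    for ch in ("\u0623", "\u0624", "\u06C0"):
        print("U+%04X -> %s" % (ord(ch), canonical_decomposition(ch)))
    # Base letter + combining hamza above: does a precomposed form exist?
    for base in (ALEF, VAV, AE, HEH):
        print("U+%04X + U+0654 composes: %s"
              % (ord(base), composes_under_nfc(base + HAMZA_ABOVE)))
```

With the character data I am aware of, U+06C0 decomposes to U+06D5 (Arabic letter ae) + U+0654 rather than to U+0647 (heh) + U+0654, which is the incompatibility being discussed, while heh + hamza above is left as two code points by NFC.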
[farsiweb]Farsi heh + hamzeh
I have a more radical solution to the problem of implementing the Farsi heh + hamzeh in the Persian IT standard.

In the Unicode system, a given shape or glyph can be encoded with more than one code value. For example, the number {7} in the Farsi character set and in the Arabic character set look identical, but have different code numbers assigned to them (06F7 and 0667 respectively).

The same treatment can be given to heh + hamzeh. If the current implementation of this glyph in Unicode is not compatible with Farsi usage, the best answer is to recommend to the Unicode Consortium the adoption of another glyph, or the implementation of the same glyph with a different code value and with the added attribute of being analyzable into the correct Unicode components which are compatible with the Farsi script.

This can then be safely added to the Persian IT standard, software developers and font designers everywhere will know exactly how to implement the glyph, and there will be no reason for any confusion over the issue.

I hope that the FarsiWeb development team are listening, and will give this suggestion serious consideration.

Abi

___
FarsiWeb mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/farsiweb
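The digit example in the message above can be illustrated with a short sketch, again assuming Python 3 and the standard `unicodedata` module; the `describe` helper is a hypothetical name used only here. It shows that the two sevens are distinct code points that carry the same numeric value.

```python
import unicodedata

PERSIAN_SEVEN = "\u06F7"  # used in Farsi text
ARABIC_SEVEN = "\u0667"   # used in Arabic text


def describe(ch):
    """Return (code point, Unicode name, digit value) for one digit character."""
    return ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.digit(ch))


if __name__ == "__main__":
    for ch in (PERSIAN_SEVEN, ARABIC_SEVEN):
        print(describe(ch))
    # Different code points, same numeric value.
    print(PERSIAN_SEVEN != ARABIC_SEVEN,
          unicodedata.digit(PERSIAN_SEVEN) == unicodedata.digit(ARABIC_SEVEN))
```

The same pattern (same appearance, separate code point, script-specific properties) is what the message proposes for heh + hamzeh above; whether the Unicode Consortium would accept such a proposal is not something this sketch can show.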
[farsiweb]Re: Farsi heh + hamzeh
> From: Ali Khanban [EMAIL PROTECTED]
> To: Abi Lover [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: [PersianComputing] Re: [farsiweb] Farsi heh + hamzeh above
> Date: Sat, 25 May 2002 16:12:52 +0100
>
> I do not agree with your conclusion. Again I insist that there is a letter called hamza in Arabic, and it is used in Farsi as you described. But the fact is that this is one letter with different shapes according to its sound and its place in the word. In mo'men we can't say that vav+hamza sounds as o + stop, because there is an o which is not written, as we always omit them. And because of that o, the letter hamza is written on a base shaped as vav.

I disagree! Hamzeh does not take different shapes. It has only one shape, and it always represents the glottal stop (or plosive). Vav + hamzeh is not simply a different shape of hamzeh. It represents o + stop. The letter vav in Farsi sometimes represents the sound o (as in khod), and sometimes the sound u or oo (as in bood). It is not correct to say that in a word like mo'men the o is not written. The vav represents the o. What about mas'ul? Where does the oo sound come from here? Try writing it with a dandaneh!

> hamza is a letter with different shapes. These shapes are not always the same in Farsi and Arabic. For example, in Farsi we use a shape dandaneh for hamza in pangu'an which would be a shape like vav if we had used the Arabic style.

These are just different ways of spelling a foreign word in the Arabic script. It does not mean anything.

The point of all this discussion, as far as I am concerned, is not to get bogged down in the intricacies of phonetics. The point is that, as C Bobroff has pointed out, "There is a relationship in Persian between 'yeh' and 'hamzeh'." (I think he means 'heh' not 'yeh'.) That relationship can and should be recognized in Farsi fonts and keyboards.

In the ParsNegar word processor (which is nothing special, and full of bugs!) they have had the good sense to recognize the need for this particular shape. They have implemented it both in their font as a ligature and in their software, so you can enter it with a single keystroke instead of two. That is what I am getting at. That is the correct approach, and it should be adopted by official Farsi keyboard and font standards.

Abi
[farsiweb]Farsi heh + hamzeh above
I don't agree with Khanban's reasons for not using the letter form heh + hamzeh above. The same reasons could be given for not using vav + hamzeh above. For example, {mas'ul} could also be written as {mas ool} (with alef instead of hamzeh), and {so'a^l} could also be written as {so aal} (with alef-madd instead of hamzeh); and it is quite possible that in the distant future people will start writing them that way.

However, if it is true that the Unicode standard encodes this shape in a way that is not compatible with Farsi, then that would be a justifiable reason for not adopting it in the standard. But in that case the standard should explain why it cannot be adopted. It should also explain (especially for the benefit of software developers for whom Farsi is not the native language) that this shape is commonly used in Farsi, and that there is nothing to stop font developers and application developers from supporting this shape as a ligature, provided that it is properly implemented so that it can be correctly parsed into its appropriate Unicode equivalents.

I have also noticed that the latest ISRI standard for a Farsi keyboard layout does not support this shape either. It supports vav + hamzeh above, and even some obscure Arabic characters which are hardly ever used in Farsi, such as the Arabic round T and Arabic yeh with two dots below, but not heh + hamzeh above, which is used extensively in Farsi. There is no justification for this. The purpose of such standards should not be to tell people how to write Farsi. People decide how to write Farsi. The standard should encode and standardize what people write.

A keyboard layout is not dependent on Unicode encodings. Since this letter form is used extensively in Farsi, it should be possible to enter it with a single keystroke, as is the case with vav + hamzeh above, instead of having to type two keystrokes; and font and software developers should be guided to support it as a ligature in their fonts and applications.

Abi