Hi Mohammad,

Mohammad Norouzi wrote:
> [Hoss wrote:]
>> ...are there Persian characters with a category type of SPACE_SEPARATOR,
>> LINE_SEPARATOR, or PARAGRAPH_SEPARATOR ?
>
> How can I know that?

The Unicode standard's general category codes[1] for these are:

    SPACE SEPARATOR:     Zs
    LINE SEPARATOR:      Zl
    PARAGRAPH SEPARATOR: Zp

From <http://www.unicode.org/Public/4.0-Update/PropList-4.0.0.txt>, the only
characters with these categories are:

0020          ; White_Space # Zs       SPACE
00A0          ; White_Space # Zs       NO-BREAK SPACE
1680          ; White_Space # Zs       OGHAM SPACE MARK
180E          ; White_Space # Zs       MONGOLIAN VOWEL SEPARATOR
2000..200A    ; White_Space # Zs       EN QUAD..HAIR SPACE
200B          ; Other_Default_Ignorable_Code_Point # Zs       ZERO WIDTH SPACE
2028          ; White_Space # Zl       LINE SEPARATOR
2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
202F          ; White_Space # Zs       NARROW NO-BREAK SPACE
205F          ; White_Space # Zs       MEDIUM MATHEMATICAL SPACE
3000          ; White_Space # Zs       IDEOGRAPHIC SPACE

Modern Persian uses Arabic orthography with four additional letters[2] --
peh, tcheh, jeh, and gaf -- all of which are included in the Unicode basic
Arabic character set.

The Arabic Unicode character ranges are:

[U+0600 - U+06FF] <http://www.unicode.org/charts/PDF/U0600.pdf>
[U+0750 - U+077F] <http://www.unicode.org/charts/PDF/U0750.pdf>
[U+FB50 - U+FDFF] <http://www.unicode.org/charts/PDF/UFB50.pdf>
[U+FE70 - U+FEFF] <http://www.unicode.org/charts/PDF/UFE70.pdf>

None of the separator characters listed above falls in any of these ranges,
so the intersection of the sets { all Arabic characters } and
{ all Unicode whitespace characters } is the null set. Thus, it appears,
there are no Arabic-specific (and hence Persian-specific) whitespace
characters in the Unicode standard.

Steve

[1] Unicode 4.0.0 Character Database - Property value codes:
    <http://www.unicode.org/Public/4.0-Update/UCD-4.0.0.html#Property_Values>
[2] http://en.wikipedia.org/wiki/Persian_alphabet

-- 
Steve Rowe
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
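
P.S. If you would rather check this mechanically than by reading the charts, here is a minimal sketch in Python using the standard unicodedata module. The function name arabic_separators and the range table are my own, not from any library, and I have taken the third range to run to U+FDFF, the end of the Arabic Presentation Forms-A block. Note that Python ships a newer Unicode database than 4.0.0 (for example, it reports U+180E and U+200B as Cf rather than Zs), so treat this as a sanity check rather than a replay of the 4.0.0 data; the difference does not touch the Arabic ranges.

```python
import unicodedata

# Arabic code point ranges (inclusive). The third range is assumed to
# extend to U+FDFF, the end of the Arabic Presentation Forms-A block.
ARABIC_RANGES = [
    (0x0600, 0x06FF),
    (0x0750, 0x077F),
    (0xFB50, 0xFDFF),
    (0xFE70, 0xFEFF),
]

# General category codes for space, line, and paragraph separators.
SEPARATOR_CATEGORIES = {"Zs", "Zl", "Zp"}


def arabic_separators(ranges=ARABIC_RANGES):
    """Return the code points in `ranges` whose category is Zs, Zl, or Zp."""
    return [
        cp
        for low, high in ranges
        for cp in range(low, high + 1)
        if unicodedata.category(chr(cp)) in SEPARATOR_CATEGORIES
    ]


if __name__ == "__main__":
    hits = arabic_separators()
    if hits:
        for cp in hits:
            print(f"U+{cp:04X} {unicodedata.category(chr(cp))}")
    else:
        print("No separator characters found in the Arabic ranges.")
```

In Java, the same test would compare Character.getType(codePoint) against Character.SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR, which is where the names in Hoss's question come from.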