New beta release of Persian Fonts

2003-12-27 Thread Roozbeh Pournader
A new beta version of the FarsiWeb project's Persian fonts is now available.
It includes various improvements to the original Koodak beta, and
adds the new fonts Roya (Normal and Bold), Titr, and Homa. To the best of
our knowledge, these are the first Unicode-compliant versions of
these fonts, and the first set of fonts ever to conform to the Iranian
national standard ISIRI 6219. The fonts currently support the Persian
and Arabic languages.

This is a beta release and contains known bugs, so use the fonts at your
own risk. But we appreciate bug reports or requests for enhancements.
Bug reports should be sent to the email address [EMAIL PROTECTED].
(Please note that we cannot answer all the emails sent to the address.)

You can download your copy from:

http://www.farsiweb.info/font/farsifonts-0.2.zip

The fonts conform to the Unicode, ISIRI 6219, and OpenType
standards as far as possible (they will be made to conform to the Adobe
glyph naming standard in a later release). Support will improve in
newer versions, which will contain more fonts, more glyphs, and most
importantly, support for small sizes for selected fonts, which will make
them suitable for web use.

You can freely share and distribute all of the fonts, if you don't sell
them directly, or change or rename them. Some of these fonts are
licensed under the GNU General Public License
(http://www.gnu.org/licenses/licenses.html#GPL), which will give you
even more rights as a user and developer. For more details, see the
license field of the fonts themselves, and the file COPYING or
COPYING.txt in the zip file. Other kinds of licensing are also available
from Sharif FarsiWeb Inc, which can be contacted at [EMAIL PROTECTED].

The FarsiFonts project is sponsored by the High Council of Informatics
of Iran (http://www.shci.ir/) and Sharif University of Technology
(http://www.sharif.edu/). We wish to thank them for supporting standards
and free software.

Finally, I wish to thank Behnam Esfahbod, Elnaz Sarbar, and Behdad
Esfahbod for their work on the fonts. This would have been impossible
without their labors.

Roozbeh Pournader,
Sharif FarsiWeb Inc


___
FarsiWeb mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/farsiweb


Re: farsi. farsi! farsi? farsi:

2003-12-12 Thread Roozbeh Pournader
On Fri, 2003-12-12 at 01:25, Behdad Esfahbod wrote:
 FARSI.
 Yes.
 ;-)
 
 Disclaimer 1: This is not a Persian vs Farsi war message.
 Disclaimer 2: CC to FarsiWeb list is just informational.
 Disclaimer 3: The attached code is not in Public Domain.
 Disclaimer 4: This is a long boring message.  Your own risk.
 
 
 Two long years ago, on such a day as today, perhaps in the
 same wee hours of the morning but in Tehran time, I was
 polishing and wrapping up a piece of code that has been
 called farsi since then.
 
 The story goes back even further.  It should have been in late 2000
 that Roozbeh Pournader wrote some C code to convert Unicode
 Persian text to some legacy character set called iransystem.  As
 a requirement for that, he wrote the joining code that was later
 used by me in farsi.
 
 In late 2001, my major work on FriBidi was done, so it was
 time to use what I had been doing.  I took Roozbeh's code, cleaned
 it up, plugged in FriBidi, and it was what you get as farsi/fjoining/.
 I wrote some more code to fill the gap in the console to handle
 harakats, and called it farsi/fconsole, and finally grabbed the
 source code from script(1), hacked a few lines, and called it
 farsi/fcon.  With the help of font tools I borrowed from another
 project and a keyboard driver I wrote, I had finally finished my
 pet called farsi, which was doing more for me than Akka was able to
 do (for me as a Persian).
 
 Since then the code got some clean up and some features added,
 but nothing else changed, even the user base itself that was
 limited to me, myself, and behdad.  The package named farsi was
 still waiting for me (and Roozbeh) to resolve the copyright
 status and get released, while I lost my interest in bidi console
 and it went down into my 10GB archive of last (lost) files.
 
 Fortunately I did three small releases of the code, first on a
 local list called 'farsidev' that does not exist today anymore
 (and I cannot remember even.  Just wrote it in my ChangeLog in
 the package);  next in a list in Hebrew community, and last in
 ArabEyes.  It seems that the last one is the only one that has
 survived history.
 
 This is the history about farsi in five paragraphs.  I also
 hacked a Red Hat 7.2 to enable Persian on console.  I later took
 some notes of what I did, and implemented it on another machine
 from my notes.  The notes are in farsiredhat directory in archive
 attached to this mail.  Note that they are pretty old.  Many
 things have changed these days.
 
 For the past few days I have been known as the biggest blocker of
 the whole ArabEyes project ;-).  So I first answer the questions
 I was asked about farsi, and then go through the files in
 attached archive.
 
 Muhammad Alkarouri wrote:
 
  Thanks Behdad for your reply. I would like to know,
  though, what is the expected timeframe of including
  joining in fribidi.
 
 2005.  No more, no less ;-).
 Seriously, this winter.
 
  Another question for all:
  - do you know any problem that affects using farsi
  besides bidi before joining and shaping codes, and
  some may be next stage points like interaction with
  gpm and ncurses programs?
 
 In the future, ncurses should implement its own bidi/shaping.
 But before that, both ncurses and gpm need to get some stable
 Unicode support.  I am supposed to have a look at Unicode support
 in ncurses after I'm satisfied with GNOME (FriBidi, Pango, GTK+,
 AbiWord), but most probably it's not before 2005.
 
  If there aren't I will base any future work on this
  code rather than the akka original.
 
 :D.
 
 Nadim Shakili wrote:
 
  A couple of questions though,
 
   1. Can we take this conversation to Arabeyes' developer
  mailing-list ?  I'm sure we'll want to refer back to
  all these points in the future.
 
 Sure.
 
   2. Can we come up with an alternate name to this package.
  Akka 2.0 (with no mention of the previous work or credits) ?
  suggestions ?  Behdad, its your baby, so its your call.
 
 Well, farsi is not such a bad name as long as it's used in
 English written text ;-).  Ok, it has proved to be a bad name.
 Perhaps 'farsi' is a good name, but again in written context.
 BTW, you should not need that word in English; one should always
 use Persian to refer to the language.
 
 Second, it's the Free World (as in Free Beer) of Free Software
 (as in Freedom) ;-).  Feel Free to Fu^^Hack the code.  (Free as
 in Freedom, not as in Beer.  Don't forget my Beer).
 
 Akka 2.0, may make up a good name.  I too prefer not binding a
 new name to the same functionality.  Perhaps we would want to
 give some hints and credit to pre-2.0 Akka.  Roozbeh?
 
 I'm fine with Akka, if on your website and the main README file,
 you write it this way:
 Akka (aka farsi)
 
 Another idea comes to my mind, about popping another name.  Just
 take the middle and call it 'baghdad'?  ;-).
 
 
   3. Can we, once 1 & 2 above are agreed upon, release this code
  so that its archived somewhere.  From what I remember

Re: farsi. farsi! farsi? farsi:

2003-12-12 Thread Roozbeh Pournader
 Nadim proposed something along 'beacon', as in 'bicon', as in
 'bidi con{sole,dom}'.  I like both.  'bicon' goes more with
 'fribidi's, but as we converted freebidi to fribidi, we do bicon
 to beacon too.  I'm with beacon then.

No objections.

roozbeh



Re: FarsiWeb Digest, Vol 7, Issue 1

2003-12-06 Thread Roozbeh Pournader
On Sat, 2003-12-06 at 20:07, Sam Baran wrote:
 I opt for Farsi rather than Persian for the dominant
 language, used in the present Iran and the colonies
 abroad.  

It's not important what you or I opt for. That is an important point.

 Iran is a nation of multi-language cultures: Fars,
 Khuzi, Baluch, Turkman, Kurd, Armani, Asuri, Yahudi,
 Gilaani, Azari, Lor, and others.  The Iranian
 nationalities have their own historically/
 independently evolved languages. Some of them are
 remnants of the natives prior to the arrival of
 Aryans, 3000 years ago.  Then, Farsi has its various
 dialects: Khoraasaani, Esfaahaani, Yazdi, Qazvini,
 Shiraazi, Tehraani, and others.  Again, Tehraani has
 its own sub-dialects: Hasan-aabaadi, Chaale-meydaani,
 Shahre-Reyi (Raazi), Shemruni, etc.

I can't see how that is related to the debate.

 Moreover, Farsi is
 the name used by the Iranian government. 

No, Persian is the name used by the Iranian government.

 All literature in Iran and Iranian colonies abroad, are labeled under Farsi.

No they are not. Let me take a random book from my bookshelf... Ok, this
is vaazhe-naame-ye riaazi o aamaar, ingilisi-faarsi, faarsi-ingilisi
published by anjoman-e riaazi-e iraan and gorooh-e riaazi o aamaar-e
markaz-e nashr-e daaneshgaahi, published in 1370 (Solar Islamic year),
ISBN 964-01-0599-6. The English title uses Persian instead of Farsi:
Dictionary of Mathematics and Statistics, English-Persian,
Persian-English.

Well, I just checked my whole library at the office. There is no single
book that is labeled Farsi around, all use Persian when they refer
to the name of the language on the title.

 Therefore, Farsi is more descriptive of the main
 language used in the Iranian plateau.

I don't see that claim proved.

 Incidentally, Afghanistan is a multi-language country with secondary dialects.  A
 person from Herat has hard time conversing with one
 from Kandahar or Kabul: Urdu, Dari, Pushtu, etc.

Well, I was from Tehran, and I didn't have a hard time communicating in
Kabul. Some people even considered me a Herati from my accent.

Also, you should note that Pashto and Urdu are completely different
languages from Persian (which Dari is a dialect of). They have different
grammar, different orthography, and different vocabulary.

 Bejan Baran 

Just curious: Is your name Sam or Bejan? Your signature is
contradicting your From: line.

roozbeh




Re: FarsiWeb Digest, Vol 7, Issue 1

2003-12-06 Thread Roozbeh Pournader
On Sun, 2003-12-07 at 02:34, C Bobroff wrote:
 And why didn't anyone answer Saber's PersianComputing questions last week?
 They were quite reasonable and on topic and even related to Persian
 computing!

Perhaps no one knew the answer. Really.

roozbeh




Re: What the hell is this Yeh and Keheh problem?

2003-11-17 Thread Roozbeh Pournader
On Sun, 2003-11-16 at 02:03, C Bobroff wrote:
 I believe in principle, the search engines are to consider Persian and
 Arabic Yeh to be the same.

They should consider it to be weakly equivalent. That's the term. Same
as they do for, say, capital A and small a, or a-umlaut (ä) and
normal a.
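To make the idea of weak equivalence concrete, here is a minimal Python sketch. It is not how Google or any other engine actually works; the table FOLD and the functions search_key and matches are hypothetical names, and folding Alef Maksura into Yeh is an assumption that suits Persian text but would be wrong for Arabic.

```python
# Hypothetical folding table (an assumption, not any engine's real table):
# map Arabic-specific code points to the Persian ones so that both
# spellings produce the same search key.
FOLD = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA -> FARSI YEH (assumption)
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def search_key(text: str) -> str:
    """Return a key under which weakly equivalent spellings collide."""
    # casefold() handles the 'A' versus 'a' kind of equivalence;
    # translate() handles the Yeh and Kaf variants.
    return text.casefold().translate(FOLD)


def matches(query: str, text: str, exact: bool = False) -> bool:
    """Substring search; exact=True turns the equivalence table off."""
    if exact:
        return query in text
    return search_key(query) in search_key(text)
```

With exact=False, a query typed with Arabic Yeh finds text stored with Persian Yeh and vice versa; exact=True is the kind of opt-out a user would need to tell the two apart.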

 Yet they do not.  Why?  Are the tables they are consulting faulty? 
 Are they consulting the wrong tables?  Does no one even know this is a
 problem?  Who handles this so they can fix the problem?

Google does, for example. MSN also does, if they have a Persian search.
As for contacting them, feel free to do so. I had done the same once
with all the authority I could use from the Unicode Consortium, to no
avail (in other words, I got no reply or action).

 And what if we WANT the search engine
 to distinguish between the Persian and Arabic?

You provide an option to the engine, mentioning that it shouldn't use
its equivalent tables. But can you do that with A and a in Google?


 something like the first line in the Divan of Hafez:
 alaa yaa ayyuhaa saaqi ader ka'san wa  naawelhaa
 Is that lang=fa or lang=ar?

I agree that it's a hard question. Really depends on how you are going
to write the saaqi part. Since it's pronounced /i:/, it should be
written as dotted Yeh in the Arabic language. If you're writing it with
a dotless Yeh, it should be Persian transliteration of Arabic text. Now,
how do you mark an English transliteration of Arabic text? With en or
ar? Of course you'll use en. So in that case, you should use fa.

 I wonder why you say &#1740; instead of U+06CC?? :) :)

He's a real person, not a computer programmer! Real people prefer
decimal to hexadecimal, I guess. But AmirBehzad, it's an inconvenience
to refer to Unicode characters by their decimal code. If you want to use
HTML escapes, please use the &#x06cc; format instead of &#1740;.
Both are unreadable to a casual reader, but the first is readable by a
specialist without using a scientific calculator.
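For anyone who wants to check that the decimal and hexadecimal escapes name the same character, here is a small Python sketch. The helper names hex_escape and decimal_escape are made up for this example; html.unescape is from the Python standard library.

```python
import html


def hex_escape(ch: str) -> str:
    """Hexadecimal HTML escape, which mirrors the U+XXXX notation."""
    return f"&#x{ord(ch):04x};"


def decimal_escape(ch: str) -> str:
    """Decimal HTML escape for the same character."""
    return f"&#{ord(ch)};"


farsi_yeh = "\u06CC"  # ARABIC LETTER FARSI YEH
print(hex_escape(farsi_yeh))      # &#x06cc;
print(decimal_escape(farsi_yeh))  # &#1740;

# Both escapes decode to the same character.
assert html.unescape("&#1740;") == html.unescape("&#x06cc;") == farsi_yeh
```

The hexadecimal form can be read straight off the code chart (U+06CC), which is the point being made above.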

 I am slowly starting to think your idea is indeed the solution to the Yeh
 and Kaf problem.  I hope the more technically astute people will
 also wake up and give you some feedback.

The solution? The solution is of course fixing every piece of software
immediately. But I agree that AmirBehzad's is actually a nice idea: to
detect what the browser supports properly (possibly using some
JavaScript, browser sniffing, and other tricks) and then serve the
browser what it can display.

It works fine for display purposes, but there are scenarios in which it is
not sufficient. Let's say a user is using IE5 on Win98, and she has the
Persian Yeh bug. AmirBehzad's script serves her Arabic Yeh in medial and
initial forms. She sees everything fine. But then, she wants to search
the (already-retrieved) web page using the Find menu on her browser
(which has not implemented any such Yeh equivalence). The result: she
can't find the Arabic Yeh (or the Persian one).
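The scenario can be reproduced in a few lines of Python. AmirBehzad's actual script is not shown in this thread, so this is only a sketch of the substitution it is described as doing; the function for_buggy_browser and the character class used to decide when a Yeh is in initial or medial form are simplifying assumptions.

```python
import re

PERSIAN_YEH = "\u06CC"  # ARABIC LETTER FARSI YEH
ARABIC_YEH = "\u064A"   # ARABIC LETTER YEH

# Simplifying assumption: a Yeh is in initial or medial form when the next
# character is a letter it joins to. That set is approximated here by
# U+0622..U+064A plus Peh, Tcheh, Jeh, Keheh, Gaf, and Farsi Yeh.
# Diacritics between letters and other Arabic-script letters are ignored.
JOINS_TO_PREVIOUS = "\u0622-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC"
_INITIAL_OR_MEDIAL_YEH = re.compile(f"{PERSIAN_YEH}(?=[{JOINS_TO_PREVIOUS}])")


def for_buggy_browser(text: str) -> str:
    """Swap Persian Yeh for Arabic Yeh where it joins to the next letter."""
    return _INITIAL_OR_MEDIAL_YEH.sub(ARABIC_YEH, text)


iran = "\u0627\u06CC\u0631\u0627\u0646"   # 'Iran': Yeh in initial form
farsi = "\u0641\u0627\u0631\u0633\u06CC"  # 'Farsi': Yeh in final form

served = for_buggy_browser(iran)
assert ARABIC_YEH in served               # displays correctly on the buggy browser
assert iran not in served                 # but a Find for the original spelling fails
assert for_buggy_browser(farsi) == farsi  # a final Yeh is left alone
```

The page now contains a mix of the two Yehs, so a plain substring Find succeeds only if the user happens to type the same code point the script chose for that position.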

Another alternative story: Let's say the writer of the page would like to
say: "Don't use Arabic Yehs like 'ي', use Persian Yehs like 'ی'."
You'll agree that he will be scared when the software does weird
things to his text.

The best solution is updating the software and the fonts. And nagging
the developers of the software if that doesn't fix the problem. And
writing your own software if that didn't work either. Or learning to
write software if you don't know how. Or forgetting it all if it's not
worth the effort.

 (RoozBEH, are you almost done cleaning out your Inbox??)

I'm doing it now. Next shot in 90 days.

 Perhaps the script could also check if the win9x user has IE6 in
 addition and if so, let them see Persian.

I agree.

 I would like to request that you make a simple webpage and post it
 somewhere for newbies to copy and paste. 
 It would be nice if you put a
 little alternating Persian and English content so people see how to switch
 between the two. An exterior .CSS file that is 100% compliant with
 directions for copying for one's own use would be so nice.  For test
 purposes, the Persian content should include some tricky things like
 parentheses, diacritics (tashdid, sokun, zir, zabar, pish, etc),
 zero-width-joiner, zero-width-non-joiner, heh+hamza, and something
 requiring mouseovers (or some such feature requiring the browser to
 calculate where the word is on the page.) After making everything as
 standard and compliant as possible, also put in your script, and most
 important, directions for how to copy and explanation for why it is there,
 I think this would be the best.

Very good recommendations.

roozbeh




Re: Persian Country List

2003-10-25 Thread Roozbeh Pournader
On Fri, 2003-10-24 at 21:10, Sina Ahmadian wrote:
 After searching almost all the net for a persian list of countries, I didn't 
 find anything! Some websites use the english country list while others leave 
 the users free to enter their country name. I remember I've seen a persian 
 country list somewhere online, but don't know exactly where!
 Please reply if you have such a list or know where I can find it.

Hmm, lemme see... This is what the countries are called in Iran:

http://oss.software.ibm.com/cgi-bin/icu/lx/en/utf-8/?_=fa_IRSHOWCountries=1#Countries

And this is for Afghanistan:

http://oss.software.ibm.com/cgi-bin/icu/lx/en/utf-8/?_=fa_AFSHOWCountries=1#Countries

(For some reason, the list of Afghan names is not complete. Use the
Iranian ones if one is lacking.)

Please tell me if you find any errors, as this is becoming widely
deployed, especially in IBM and Apple software.

roozbeh




Re: SOLVED: Button translation

2003-10-15 Thread Roozbeh Pournader
On Tue, 2003-10-14 at 11:04, Behdad Esfahbod wrote:

 Button Phrasing. Write button labels as imperative verbs, for
 example Save, Print. This allows users to select an action with
 less hesitation. An active phrase also fits best with the
 button's role in initiating actions, as contrasted with a more
 passive phrase. For example Find and Log In are better buttons
 than Yes and OK.

Isn't this only about *English* button labels?

roozbeh




Re: Translation

2003-10-05 Thread Roozbeh Pournader
On Sun, 2003-10-05 at 00:28, Peyman wrote:
 Persian has one of the most productive word formation systems.

I would appreciate seeing some statistics to back that up, like you have
done with the verbs. Do you have any?

roozbeh




Re: Translation

2003-10-05 Thread Roozbeh Pournader
On Sat, 2003-10-04 at 12:29, Behdad Esfahbod wrote:
 Well, you're probably right, but then the suffixes are going to
 lose all their meaning as a suffix.  After a while there would be
 no common sense between words ending with ak... (and yes, there
 would be no suffix, some new words).

Guess what? The suffixes have already lost their meanings. This same
-ak is a good example. You want language control and mathematical
semantics, which is more than impossible with a language like Persian,
IMHO.

 -gaan is not anything special, it's just aan for plural,
 joined to a word ending with hah-e naamalfooz.  Just like
 saadegaan.  So it means datas.  But again AFAIK data and
 daade are both plurals.  Don't know about paadegaan.

I am sorry, I am not talking about *that* -gaan. I'm *only* talking
about the -gaan in paadegaan.

  That's an abbreviation: FTP = gharaardaad-e enteghaal-e parvande:
  gheyn, alef, pe. If you have problems with abbreviations, don't
  use them.
 
 And write gharaardaad-e enteghaal-e parvande everywhere?

Write FTP if you like, or whatever you prefer. Go with fetepe if you
like that. Or call it chiz ;-)

You're definitely not bound by any of the requirements of the Academy.

  This is the translation of the Redo menu, not the action of
  redo-ing. I agree that it's not that good, but I've not seen many good
  ones. Your suggestion?
 
 az no reminds me of reset in forms.  dobaare and tekraar
 may have the same meaning as az no, but do it better, again
 IMHO.

At last something I can pass. I'll ask the guys.

 * scroll - navardidan!
 
  The problem? Your suggestion?
 
 navardidan is completely another word, isn't it?  It does not hold
 the feeling of rolling in a single direction, and it contains a
 sense of a challenge, that cannot be ignored.  My suggestion?
 Good question.

OK, from my Moaser Persian Dictionary: [adabi] dar mohit, mantaghe, yaa
masiri harekat kardan va az noghte-i be noghte-ye digar-e aan raftan.

I can't see the sense of challenge there. I agree that it's not
scrolling exactly, but what translation has the exact senses of its
original word? Time will give all the senses to it.

 * output (device) - khorooji
   (Isn't khorooji also a noun in Persian?)
 
  It's *only* a noun in Persian, as far as I can tell. I'm not getting
  what you mean. Would you explain? From what I get, is that they are
  translating the output of a program as boroon-daad, but an output
  device as dastgaah-e khorooji.
 
 Exactly.

What is the problem then?

 The problem is that, they are misusing their power to decide for
 the language!

They have been asked to do so. We need an authority for the language.
American English has Merriam-Webster, British English has Oxford, German
has Duden, and French has its Academie. They are trying their best to
provide authority. As far as I can tell, they are coming to a point of
good output.

Well, I could never ever think about defending the Academy, but I'm
doing that. Why? First, because they're having some good-enough output
(which, well, you don't agree to, which I can understand). But second,
because I've seen the anarchy out here, everyone considering
himself/herself the authority, without even consulting the references.
Haven't you? Aren't we on the same front exactly because of that?

 You and I could have decided on many
 technical matters, and spread them all around the world by coding
 them here and there.  But we have never done that, so as not to decide
 for others.

We have never done that, since we know our work is not authoritative
enough. Because it has not been the result of a consensus of experts.
Academy's output is partially the result of such a consensus.

 Better to propose words and wait some 5 or 10
 years, and decide if they can be settled.  rayane has settled
 down.  But the way they do it, they force many bodies to follow
 their word.

Well, these are not exactly *new* words. The words have been around and
used for a long time by some translators. ISI's word list (masterminded
by Dr Mashayekh) is the main source for these, as is Mohammadifar's
Computer Dictionary (published by Moaser), and the entrepreneurial works
of Dr Rohani-Rankoohi and Dr Badi', all of whom are members of the
computer terms committee (with a few other people). I can't say they
haven't seen all the references: they have. I've talked to all of them (but
Dr Rohani-Rankoohi) on different matters, and I know they don't move an
inch in these waters without contacting every reference they can find on
the matter.

It's easy to start calling them fossils, as we young people love to
do, and close their dossier so easily, but we need to separate real work
from just inventing random words (like some people we know have been
doing). I really believe you should provide feedback to the Academy, and
see what reasons they have.

I'm meeting Dr Mashayekh (the head of the computer terms committee) to
talk on exactly the same subject this Wednesday morning. I promise I
forward 

Re: FarsiWeb Digest, Vol 4, Issue 9

2003-09-30 Thread Roozbeh Pournader
On Tue, 2003-09-30 at 16:38, Behnam wrote:
 But in the meantime, do you know where is the small Alef for
 putting on the Final Yeh (in hattaa for example) or Farsi Hamza
 (Yeh-e-raabet) that we put on the final Heh in this standard layout? I
 couldn't find them anywhere.

They are not in ISIRI 2901:1994. But a new version of the national
keyboard standard (due in 1382) will include them, and many other
required characters. See the archives of this list for more information,
including the exact layout.

 I'm out of luck here. I have Microsoft Office for Macintosh but it doesn't
 support right to left languages, nor Unicode. These things are reserved for
 their own platform!

MS Office for Mac is very bad with regard to right-to-left languages.
OpenOffice 1.1 may be a solution, but it's not integrated well enough
with the Mac OS interface.

roozbeh




Re: ARABIC FARSI YEH

2003-09-29 Thread Roozbeh Pournader
On Mon, 2003-09-29 at 18:11, Behnam wrote:

 From what I see, there is no Farsi keyboard layout (on PC or Mac) that uses
 the real Farsi Yeh (U+06CC) they use either U+0649 the Arabic Alef Maksurah
 (for Mac and some PCs) or U+064A the Arabic Yeh (for some other PCs, with
 chopped off dots in the Arabic glyph in the font!)

Windows 2000 and Windows XP both use U+06CC in their 'Farsi' keyboard
layouts. I don't know about current versions of Mac OS (which has a
keyboard named 'Persian'), but Panther (Mac OS 10.3) will include a
Persian keyboard layout based on ISIRI 2901:1994.

 To my knowledge, most Farsi fonts don't have real Farsi Yeh. Some do, but
 it's actually not being used by current Farsi keyboards on either platform.

You're somehow right. But this is something that is being fixed
everywhere, although not quickly.

roozbeh




For the World's A B C's, He Makes 1's and 0's

2003-09-25 Thread Roozbeh Pournader
An article from the New York Times:

http://www.nytimes.com/2003/09/25/technology/circuits/25code.html

roozbeh



Major Enhancements to the Unicode Standard: Enabling International Domain Names, Expanding Worldwide Accessibility, and Reducing the Digital Divide

2003-08-27 Thread Roozbeh Pournader
FYI.

-Forwarded Message-
From: Magda Danish (Unicode) [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Major Enhancements to the Unicode Standard: Enabling International Domain 
Names, Expanding Worldwide Accessibility, and Reducing the Digital Divide
Date: Wed, 27 Aug 2003 06:11:25 -0700

Major Enhancements to the Unicode Standard: 
Enabling International Domain Names, Expanding Worldwide Accessibility,
and Reducing the Digital Divide 

Mountain View, CA, August 27, 2003 -- The Unicode Consortium and
Addison-Wesley announce publication of Version 4.0 of the Unicode
Standard. Unicode is the fundamental specification for the
representation of text, at the core of all modern software, programming
languages, and standards, including Windows, Java, C#, Perl, XML, HTML,
DB2, Oracle, and many others.


Unicode is also central to the new internationalized domain names, which
allow everyone in the world to have URLs in their own languages. This is
yet another case where Unicode opens the door to more of the world's
different cultures, helping to break down the digital divide.

Version 4.0 strengthens Unicode support for worldwide communication,
software availability, and publishing. The text has been extensively
rewritten, and incorporates specifications that were previously only
available as separate documents. The clarified specification of
conformance requirements incorporates the most highly developed
character encoding model in existence, encompassing the wide variety of
types of characters needed by the world's languages, and permitting
compatibility with all modern computer architectures.

Record-breaking character content 

Version 4.0 encodes over 96,000 characters, twice as many as Version
3.0, and includes two record-breaking collections of encoded characters.
The largest encoded character collection for Chinese characters in the
history of computing has doubled in size yet again to encompass over
2000 years of Chinese, Japanese, Korean, and Vietnamese literary usage,
including all the main classical dictionaries of these languages.
Version 4.0 also encodes the largest set of characters for mathematical
and technical publishing in existence. The character repertoires of
Version 4.0 and International Standard ISO/IEC 10646 are fully
synchronized.

Reducing the digital divide 

To meet the needs of all linguistic communities, the Unicode Standard
and associated standards are continually being extended, not only in
terms of the addition of characters, but also in specifying *how* those
characters work, such as:

- how text sorts or matches in different languages 
- how text behaves for East Asian languages (e.g. vertically) or in
Middle Eastern languages (from right to left) 
- how text should upper- or lowercase 
- how text breaks into lines or words 
- how text behaves in Regular Expressions (a key tool used in a vast
number of web servers) 

Small linguistic communities all over the world have the opportunity to
get mainstream software working right out of the box, instead of waiting
years for special adaptations that may never come.

For more information on the scripts encoded in the Unicode Standard, see
http://www.unicode.org/versions/Unicode4.0.0/

Version 4.0 is published by Addison-Wesley (ISBN 0-321-18578-1), and is
available from the Unicode Consortium or through the book trade. The
text and code charts of Version 4.0 are also available on the
Consortium's Web site www.unicode.org.

About the Unicode Consortium 

The Unicode Consortium is a non-profit organization founded to develop,
extend and promote use of the Unicode Standard, which specifies the
representation of text in modern software products and standards. 

Members of the Consortium are a broad spectrum of corporations and
organizations in the computer and information technology industry. Full
members are: Adobe Systems, Apple Computer, Basis Technology, Government
of India (Ministry of Information Technology), Government of Pakistan
(National Language Authority), HP, IBM, Justsystem, Microsoft, Oracle,
PeopleSoft, RLG, SAP, Sun Microsystems, and Sybase.

Membership in the Unicode Consortium is open to organizations and
individuals anywhere in the world who support the Unicode Standard and
wish to assist in its extension and implementation.

For additional information on Unicode, contact the Unicode Consortium,
650-693-3921 

About Addison-Wesley 

Addison-Wesley (www.awprofessional.com) is the leading publisher of
quality computer science and engineering books and software for
technical professionals, developed and authored by the world's leading
technology experts.  It is a unit of Pearson Technology Group, the
world's largest provider of consumer and professional computer,
information technology, engineering and reference content.  Pearson
Technology Group is an operating 

Re: Unicode in new IRNA site

2003-08-26 Thread Roozbeh Pournader
On Tue, 2003-08-26 at 09:50, C Bobroff wrote:
 Just don't tell me the calligraphers have also joined the band-wagon and
 are now putting the dots!

Fortunately they're not.

You know, everybody who's caring and sane enough to proofread, makes
sure these don't appear on paper (or sometimes on the computer screen),
but again, not all of these people care what it is that's stored in
their computers.

roozbeh




Re: Unicode in new IRNA site

2003-08-25 Thread Roozbeh Pournader
On Sun, 2003-08-24 at 22:16, C Bobroff wrote:
 But then once you do notice,
 then you REALLY notice. Those two little dots get under your skin and it
 starts to fester. You start to ONLY see the dots and the rest of the
 content becomes a blur. Insanity is imminent. The only solution: Immediate
 surgical removal of the dots using the latest Search/Replace technology.

They do get under the skin, yes, but it's a little worse than that if
you start seeing the two little dots everyday on announcements,
printouts, ads, ... It will get to your {maghz-e ostekhaan}.

roozbeh




Re: Unicode in new IRNA site

2003-08-22 Thread Roozbeh Pournader
On Fri, 2003-08-22 at 21:49, C Bobroff wrote:
 By the way, this is yet another reason I offer up thanks to Roozbeh,
 Behnam (and others) for the new Persian keyboard as the Arabic Yeh is
 only a convenient Shift key away and makes for much more fruitful Google
 searches!

Well, thank the original designers of ISIRI 2901:1994. They did it eight
years ago without even knowing the problem it's solving for you today.

roozbeh




Re: Unicode in new IRNA site

2003-08-22 Thread Roozbeh Pournader
On Sat, 2003-08-23 at 03:31, C Bobroff wrote:

 OK, I took up your challenge and just emailed Yahoo. Don't blame me when
 we discover they've replaced Arabic Yeh with CYRILLIC LETTER YA U+042F!

I don't like it. I wasn't challenging you. 'Was challenging the silent
lurkers.

roozbeh




[Fwd: Unicode 4.0 is online!]

2003-08-14 Thread Roozbeh Pournader
FYI.

-Forwarded Message-
From: Kenneth Whistler [EMAIL PROTECTED]
Subject: Unicode 4.0 is online!
Date: Mon, 11 Aug 2003 17:32:15 -0700

Unicadetti,

The day you have been waiting for has arrived. All of
the pdf for the online version of Unicode 4.0 has been
generated and is now available.

Tune your browsers to:

http://www.unicode.org/versions/Unicode4.0.0/

All of the preliminary chapter postings have been replaced
with the final book content as it is currently being printed
by Addison-Wesley.

In particular, you might want to admire the greatly improved
General Index to the book, brought to you courtesy of
Joe Becker and Joan Aliprand. Not only is the index vastly
expanded and completely rethought, the pdf version has completely
active links to take you back to the text that is indexed.

O.k., now there are no excuses that the standard is not
available. ;-)

--Ken (for the rest of the editorial gang -- particularly
   Eric Muller and Julie Allen, who did the actual
   work for this online posting)
   
   
P.S. We are looking into the pdf generation glitch which
has the pages coming up at too small a size, so that you
have to resize them before reading. When we sort this
out, we'll repost with pages which default to an appropriate
size on loading.




Re: [PersianComputing] Koodak font: alpha release

2003-08-10 Thread Roozbeh Pournader
On Thu, 2003-08-07 at 18:45, C Bobroff wrote:
 There is already a font called Koodak. Won't users (and their computers)
 have a problem when they THINK they are seeing this font but it's really
 the old one? It won't occur to them to download the new one.

The point is, this is exactly *that* Koodak, only improved with
regard to Unicode compatibility. FarsiWeb is only fixing already existing
fonts, not providing new designs.

As for user (and computer ;)) education, that's not our expertise. We'll
consider any specific advice.

roozbeh




Fwd: Unicode in XML and other Markup Languages

2003-06-20 Thread Roozbeh Pournader
FYI.

-Forwarded Message-
From: Susan Lesch [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: W3C Weekly News - 19 June 2003
Date: 19 Jun 2003 12:05:30 -0700

Unicode in XML and other Markup Languages Republished

   Updated for The Unicode Standard, Version 4.0, Unicode in XML and
   other Markup Languages has been republished as a Unicode Technical
   Report and a W3C Note. These guidelines cover the use of Unicode with
   markup languages such as XML. They are published jointly by the
   Unicode Technical Committee and the W3C Internationalization Working
   Group and Interest Group. Read about the W3C Internationalization
   Activity.

http://www.w3.org/TR/2003/NOTE-unicode-xml-20030613/
http://www.unicode.org/
http://www.w3.org/International/




RE: [farsiweb] New keyboard layout for Windows

2003-06-14 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, Linguasoft wrote:

 The question remains why you provide direct keyboard input for
 combining hamza & madda. Are there any letter combinations other than
 with alef/ya/waw that can be created via combination?

Yes. Heh.
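
As a rough cross-check of that answer, one can ask the Unicode character database which Arabic-block base letters have a precomposed form that decomposes to the base plus COMBINING HAMZA ABOVE (U+0654). This is a sketch using Python's standard `unicodedata` module; the helper name is invented here, and the existence of a precomposed form is only a proxy for what is actually written in practice.

```python
import unicodedata

HAMZA_ABOVE = "0654"  # COMBINING HAMZA ABOVE, as written in decomposition data

def bases_composing_with_hamza_above():
    """Return code points of Arabic-block base letters that have a
    precomposed character decomposing to <base, U+0654>."""
    bases = []
    for cp in range(0x0600, 0x0700):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Canonical decompositions have exactly two fields and no <tag>
        if len(parts) == 2 and parts[1] == HAMZA_ABOVE:
            bases.append(int(parts[0], 16))
    return bases

for base in bases_composing_with_hamza_above():
    print(f"U+{base:04X}", unicodedata.name(chr(base)))
```

Besides Alef, Waw, and Yeh, the list printed by this loop includes Heh-type letters, which matches the short answer above.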

 (I've seen accents added in handwriting for Pashto and even Dari!)

Well, I've seen a whole Dari book typeset with a Pashto typewriter and
then an additional slash added to each and every Gaf by hand.

roozbeh



Re: [PersianComputing] RE: [farsiweb] New keyboard layout for Windows

2003-06-14 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, C Bobroff wrote:

 For whatever they're worth, they're here as PDF files:
 
 http://www.loc.gov/catdir/cpso/roman.html

That only mentions there is only one Kurdish letter not already in 
Persian. But we know a lot of accent marks are used, while the above 
reference only mentions one of the cases in an extra note.

I won't consult it.

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-13 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Behnam Esfahbod wrote:

 As Roozbeh suggested, we can put these 3 characters in the new layout, but
 my opinion is that we don't, because they SHOULD NOT be used in Persian texts,
 and we have other local shapes for these characters.

No, we don't have local shapes for these. These characters are usually used for
their *legal* value. We don't have that notion of Trade Mark or Registered
Trade Mark here, and there is also no need in Iranian law to put a
Copyright symbol anywhere.

Certain publishers, like kaanoon-e parvaresh-e fekri, have invented
a Copyright-like symbol and have been using it sometimes, in the shape of an
isolated Hah (he-ye jim-i) inside a circle. But again, it is not
standardized, and it has no meaning in any legal circle.

(And if anyone is wondering if we need to have that character in Unicode,
the answer is no, we don't. The reason is left as an exercise to the 
reader!)

roozbeh



[farsiweb] Re: [PersianComputing] Persian Keyboard Layout Preview

2003-06-13 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, C Bobroff wrote:

 What is the character on alt+control+d?

It's the Arabic Alef Maksura. It is for cases where you just need a dandaane
in the middle of a word. Almost always Koranic quotes.

 Or maybe that's supposed to be the tatweel??

No, Tatweel is at Shift+-.

 And forgive my ignorance but when do you use subscript alif? I've only
 used it in Pakistani contexts.

We use it under Yeh sometimes, to mention that this Yeh has an /i:/ sound,
and not an /ej/ sound. The usage is usually educational or Koranic.

roozbeh



Re: [farsiweb] Re: Persian input with US/European keyboard

2003-06-13 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Nigel Greenwood wrote:

 If you mean that it's easier to type Persian on a Persian _physical_ keyboard 
 with the Iranian layout, I'm sure you're right.  PerScript is designed for use 
 on US/European QWERTY keyboards, where the keys are actually marked Q, 
 W, E, R, T, Y, etc.  

Well, on this keyboard I'm typing on now, the keys are actually marked Q, W,
E, R, T, Y, etc. The same is true of many other keyboards in our office,
or the whole of Sharif University. In other words: No Persian labels.

So, in yet other words: I'm typing on a US keyboard, and I prefer my Beh
to be on F instead of on B. So do many others.

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-13 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, C Bobroff wrote:

 If you don't redefine your concept of easy,

Well, honestly the way it is now in MS software (or even Linux) is not
good enough even for experts. IMO, all OS-es should come automatically
with all languages enabled, or, at the minimum, come with an automatic
update the first time the user opens a document in, say, Chinese or
switches to a, say, Pashto keyboard layout.

 people are going to say it's too hard to bother with this script and
 that's why they advocate romanizing Persian.

I don't mind, if they can do that properly, that is, if they suggest a sane
mechanism for latinizing Persian. The point is: nobody has ever come up
with a real suggestion, one that considers all the involved details. They
usually just publish a table and stop there. I may even jump on the train
if they come up with a reference dictionary and software to convert the
older documents.

 Do you know just to enable FA input on a Windows machine is asking too
 much for newbies?

It is. That is the reason the newbies should have these automatically
installed for them when they buy the machine. Or they should employ
someone to do that for them!!! The golden rule is: If you are a newbie,
know it, don't nag to others that you have a right to be ignorant, and ask
or pay for expert advice. That's what is already happening in the law
world, or the automechanic world, or ...

 I was even joking with someone at MS that a first-time user should be able
 to sit down at the computer and say, "Please activate Persian" and
 automatically FA will be enabled, Word will fire up, nastaliq font 
   ^
No, no Nastaliq font. It's not the default for Persian anymore. People 
have a hard time reading Nastaliq for anything longer than a few words.

 at reasonable fontsize selected and RTL/right-aligned mode on and
 on-screen keyboard at your service!

 Even this probably won't be sufficient...

It won't be. The system should start the Persian support at the first
moment the user starts talking Persian to the microphone.

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-13 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Linguasoft wrote:

 These are *combining* Maddah, Hamza Above, and Hamza Below.
 Isn't that what I called deadkeys in another context? (Had no time to
 look into SC Unipad so far to see how exactly they function...)

There is a difference. Dead keys are typed before the base letter. These
are typed after the base letter.
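
The "after the base letter" order is also the order the characters are stored in. A minimal sketch with Python's standard `unicodedata` module: the combining Maddah follows Alef in the string, and canonical composition (NFC) turns the pair into the single precomposed letter. The variable names are just for this example.

```python
import unicodedata

ALEF = "\u0627"
MADDAH_ABOVE = "\u0653"  # combining mark, typed and stored after its base

typed = ALEF + MADDAH_ABOVE                   # key order on a "live key" layout
composed = unicodedata.normalize("NFC", typed)

print(len(typed), len(composed))
print(unicodedata.name(composed))

# The reverse order (mark first, as a dead key would be pressed) does not
# compose, because a combining mark attaches to the character before it.
reversed_order = unicodedata.normalize("NFC", MADDAH_ABOVE + ALEF)
print(reversed_order == composed)
```

A dead key, by contrast, is a keyboard-driver mechanism: the driver waits for the next key and emits one character, so the mark-first order never reaches the text.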

 smart quotes: I see your point, but please see my point too. There are
 people editing bilingual technical manuals (like me) where certain types
 of quotes are mandatory, even for languages like Persian that normally
 prefer another type of quotes (guillemets).

I understand. You have special requirements. But unfortunately, I have no 
clue how to get this fixed.

 I may help you with information from ALA-LC (American Library
 Association/Library of Congress) containing exact lists of characters,
 alongside with standard transliterations, for all languages you are
 interested in.

Well, there are some problems here. We want, say, modern Baluchi script as
written in Iran. LoC will probably provide us with every Arabic letter
that has ever been used in any Baluchi. And they may even include some by
mistake.

Let me give you an example. There is a certain character in Unicode, a Hah
with two vertical dots over it, and it was mentioned as being a Pashto
letter. We found that it's not used in modern Pashto at all. Unicode
experts said that it comes from the librarians, so it must have been used
in older orthographies. Next time we were in Kabul we contacted all the
experts, and found many older letters that were not in Unicode, but not a
single piece of evidence for this particular letter. No expert had ever seen
it. Guess what? It was possibly a mistake by some librarian somewhere, or a
letter just a single author had used.

We don't want these in our set.

 I have no information about frequency, except that for Kurdish, I might
 be able to generate a frequency list from some electronic texts that I
 compiled in the past. (There may be more recent webpages as well but I
 will have to look around.)

I wouldn't trust web pages, since they were made using limited
technology. But we'd appreciate your Kurdish list.

roozbeh



RE: [farsiweb] Re: [PersianComputing] Persian Keyboard Layout Preview

2003-06-13 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, Linguasoft wrote:

 But doesn't ALEF MAKSURA appear mostly at the end of words, i.e. in its
 final or isolated forms?

It does, but that is the Arabic. A normal Persian Yeh is used in Persian
contexts. For example, both words ali and kobraa should be written
(and encoded in Unicode) using a Persian Yeh.

 What's more, in Arabic, when you add a personal suffix (etc.) to ALEF
 MAKSURA, it will assume medial/initial forms *with dots*...

This is probably a bug in your software, or just it being old. Alef
Maksura was only a right-joining character until Unicode 3.0 (like Reh, 
Dal, ...), and it got changed to a dual-joining character in Unicode 
3.0.1.

 I wonder where you have drawn the border line between Unicode characters
 that are used only in Koranic texts, and other symbols such as
 cantillation marks or calligraphic elements such as U+FDF4, U+FDFA,
 U+FDFB, etc. (these Unicode values are given for reference only, not
 because I advocate making Arabic presentation forms available via direct
 keyboard input).

We don't draw any line. We have just put some of them on the main keys and
the others on AltGr based on frequency and other concerns, based on what a
Persian typist usually sees in her day's pile of work.

 Traditionally, there have been special calligraphic fonts for all these
 add-on characters but they weren't easy to handle. I wonder whether it
 would not make sense to design a special (extended) keyboard for them,
 which may go hand-in-hand with the creation of suitable OT fonts. Are
 there any efforts made in this direction?

Not any that I know of. (Although I don't even know if this is a good
thing to do.)

 Lastly, a question related to the SHIFT+8 key: It's presently ASTERISK
 (U+002A), but wouldn't it be more appropriate for Farsi context to use
 this position for the ARABIC FIVE POINTED STAR (U+066D) symbol, and move
 the ASTERISK somewhere else, e.g. to ALT+8? Strangely, the ARABIC FIVE
 POINTED STAR symbol has *six* points in Arial Unicode MS and *eight*
 points in Tahoma. How comes? :-)

Ok, let's start with a little bit of history: the whole reason there is a
five-pointed Arabic star is that some hardline Muslims believed that *any*
six-pointed star resembles the Jewish *Star of David*, and so insisted on
using a five-pointed one. It was not only them. Actually the same had
happened with Jews and a much more frequent symbol, the *plus sign*
itself. Some hardline Jews considered that a cross, and thus a symbol of
Christianity. So, even these days, some Israeli school books are
published with another addition symbol, one that looks like a normal plus
sign with the bottom line omitted, something like an "is perpendicular to"
symbol:

    |
  --+--

In Iran, typographers almost always use the six-pointed star to, say,
separate unrelated paragraphs (where usually three are used). Thus,
fortunately because of the lack of such hardliners (or them being unaware
of this concept), we have the more standard six-pointed one in the typeset
books and on the layout.

An exhausted roozbeh



RE: [farsiweb] Re: [PersianComputing] Persian Keyboard Layout Preview

2003-06-13 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, C Bobroff wrote:

 An exhausted but euphoric Roozbeh?

Not euphoric. Not at all. I just feel talkative. The really good word for
what I am now is tired. I need a lot of alcohol, and then a lot of sleep.  
A good Persian word is mozmahel.

 Admit it, you're enjoying every minute!

The visa won't get ready until Monday morning either. So I'm getting more 
frustrated, and I stick more to work. The whole reason I came to office 
today was to read possible emails on what happened with the visa.

And now I'm watching another movie, am postponing a long list of things I 
have to do in order to get the money poured into the project, I'm not 
answering phones, or even my cellphone, ...

I just feel I need to answer the questions, or you, Behnam, and Peter will
go into a deep loop of discussing something based on a wrong assumption.  
I just feel the ultimate necessity to answer.

roozbeh



RE: [farsiweb] Re: [PersianComputing] Persian Keyboard Layout Preview

2003-06-13 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, C Bobroff wrote:

 No, I've alerted all the embassies of the world not to issue you any more
 visas for conferences.

It should have been you then :))

 Look how much we all have profited from the fruits of your visa
 frustrations of the past few days--

 a very nice keyboard, installation instructions + documentation

The layout is the outcome of a meeting. I was just a member. If you mean
the software, it took about half an hour or a little more because of the
nice MS tool for its creation.

Oh, while we're at it, would you tell your MS friends to allow a ZWNJ on
Shift+Space with their tool? I went through many tricks to get it done,
but the keyboard compiler always catches me at the final second.

 and so many questions answered! Your sacrifice is GREATLY appreciated!

That is something now :)

roozbeh



RE: [farsiweb] Re: [PersianComputing] Persian Keyboard Layout Preview

2003-06-13 Thread Roozbeh Pournader
On Fri, 13 Jun 2003, C Bobroff wrote:

 Yes, that's what I meant and it took YOU half an hour but would have taken
 me and the silent lurkers weeks or possibly never so thank you.

I don't believe it. Full stop.

 And did I hear you say, nice MS tool? Hmmm

It's a nice tool. But it's a shame that it's not shipped with MS Windows,
and it's a shame that it came so late.

 But the NICE TOOL doesn't recognize ZWJ or ZWNJ to be spaces.  (The space
 bar is only for spacing characters.) Maybe your friends at Unicode haven't
 properly labelled it so the NICE TOOL can tell what it is??

The question is: Why is the space bar only for spacing characters? Who 
requires that? And why? (I can name a few pieces of software for Windows 
that can help you assign a ZWNJ to Shift+space and don't nag. So this is 
not a Windows *requirement*.)

My friends at Unicode are labeling it as a control character, which it
really is. Your friends at Microsoft should also allow control characters
on the space bar, or tell us why they can't.
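
For reference, the labels in question can be read straight from the Unicode character database. The sketch below uses Python's standard `unicodedata` module to print the General Category of the ordinary space and of ZWNJ/ZWJ; that the keyboard tool filters on a property like this is only a guess, and the dictionary and function names are made up here.

```python
import unicodedata

CANDIDATES = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00A0",
    "ZERO WIDTH NON-JOINER": "\u200C",
    "ZERO WIDTH JOINER": "\u200D",
}

def general_categories(chars):
    """Map each label to the Unicode General Category of its character."""
    return {label: unicodedata.category(ch) for label, ch in chars.items()}

for label, cat in general_categories(CANDIDATES).items():
    print(f"{label}: {cat}")
```

The spaces come out as Zs (space separator), while ZWNJ and ZWJ come out as Cf (format), the category Unicode uses for invisible controls of this kind.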

roozbeh



Re: [farsiweb] IE crash code! (fwd)

2003-06-12 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Behnam Esfahbod wrote:

 take a look at this mail.
 
 you can find the example at:
 - http://esfahbod.info/proj/web/test/ie/crash.html
 
 i'd tested the page with IE6 SP1 (latest microsoft update), and it crashed 
 too!

I'm sorry, this is completely off-topic. Please stop sending completely
unrelated messages to the list. Find other places for messages on general
Windows security or whatever.

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-12 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Linguasoft wrote:

 Thanks for your efforts to provide us with an experimental version of
 the new standard keyboard layout for Persian !

You're welcome Peter. But please don't propagate it much, since that may
be changed.

 I tried the keyboard in Word2000/Win2000, using Arial Unicode MS which
 displays all glyphs that can be generated via the keyboard except Riyal
 sign and Subscript alef.

Rial sign is Unicode 3.2. Subscript Alef is Unicode 4.0. Microsoft has not
had enough time to implement them for you.

 I am not quite sure in which context standalone versions of maddah,
 hamzah above and hamzah below are used, but assume they are there
 because they are in the Unicode standard.

They are not the standalone ones. These are *combining* Maddah, Hamza
Above, and Hamza Below. But since these were only encoded in Unicode 3.0,
and Windows/Office 2000 only handles pre-2.1 characters properly, they
appear as standalone ones to you. Try the keyboard with SC Unipad, for
example, and you'll see.

 Standard shortcuts of Word for ©, ®, and ™ also work with the Persian
 keyboard.

Interesting news. I didn't know about them at all.

 What does not work is Word's AutoCorrect option for smart quotes, i.e.
 neither quotation mark (U+0022) nor apostrophe (U+0027) are converted
 into their smart equivalents; I wonder if this feature is
 keyboard-(dll)-related but if it is, I suggest to implement it as well
 as many users, especially in bilingual context, may want to use
 typographically correct English quotation marks.

We have no clue how to fix that even if that is a desired effect. For me,
that would be undesired. The quotation mark and the apostrophe (and a few
others) were only added to help manual entry of rich text (like TeX, XML,
and HTML) in a text editor without having the need to switch the layout
very often.

 How would you input ZSNJ, and RTL/LTR markers with the new keyboard?
 (These special characters aren't mentioned in keyboard.png as well.)

What is ZSNJ? If you mean ZWNJ, it is Shift+B.

I also don't know what you mean by RTL and LTR markers. If you mean the
Bidirectional control characters, they are at AltGr+9,0,I,O,P,[,].
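
For readers who have not met them, the bidirectional control characters can be listed with their names and Bidi classes using Python's standard `unicodedata` module. This sketch assumes the seven keys correspond to the seven controls below; which key produces which character is not stated in this message, so no mapping is attempted.

```python
import unicodedata

# Assumed set: the two marks plus the embedding/override controls
BIDI_CONTROLS = [
    "\u200E", "\u200F",            # marks
    "\u202A", "\u202B", "\u202C",  # embeddings and their terminator
    "\u202D", "\u202E",            # overrides
]

def describe(ch):
    """Return (code point, character name, Bidi class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))

for ch in BIDI_CONTROLS:
    print(*describe(ch))
```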

 I also wonder whether there is any accepted standard to show alef
 maqsura on keytops. In keyboard.png, you use an initial shape of ya
 without dots which may be misleading; how about using isolated ya
 without dots but with a superscript alef (I remember this was a keytop
 inscription on a keyboard for an Arabic/Persian typesetting machine that
 I used many years ago...)

In a Persian context, you should use Alef Maksura (a.k.a. the dotless
Arabic Yeh) only in initial and medial forms. This is required when one
needs to type a few Koranic quotes.

An isolated Yeh without dots with a superscript Alef over it should be
typed using the Persian Yeh and then a superscript Alef (D, Shift-V).

 Are there any decisions as to support other regional languages such as
 Kurdish or Azeri?

The general attitude in the committee is to support those languages if
enough information is provided. We need to know the exact list of
characters each needs, and their estimated frequency/importance.

 If the ultimate goal is to support several languages using an extended
 Arabic glyphset via one and the same keyboard, my feeling is that some
 Shift or Alt key positions may have better been reserved for special
 characters of these languages, or defined as deadkeys to create certain
 accented characters (as in case of the US International keyboard).

We are not trying an extended Arabic set. But we'd love a layout that is
able to support all major minority languages of Iran, although optimized
for Persian. We may even try to create layouts optimized for them if we
can find the expertise.

But we are not interested in Pashto, Sindhi, or Urdu at all, while we are
very interested in Azeri, Kurdish, Baluchi, ...

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-12 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, C Bobroff wrote:

 In a textbook, you might want to say, This here is a maddah.  In the
 past, I wanted to show what a superscript alif compared to fatha looks
 like and was not able to

You should put them either over a space, or a Tatweel (U+0640, the base
line extender that looks like a '_').
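
In terms of stored text, that advice amounts to a two-character string: the carrier first, then the mark. A minimal Python sketch, with a made-up helper name; how well the result displays still depends on the font and rendering engine.

```python
import unicodedata

TATWEEL = "\u0640"
SPACE = "\u0020"

def show_mark(mark: str, carrier: str = TATWEEL) -> str:
    """Return a displayable string: a carrier followed by a combining mark."""
    if unicodedata.category(mark) != "Mn":
        raise ValueError("expected a nonspacing combining mark")
    return carrier + mark

superscript_alef = show_mark("\u0670")   # over a Tatweel
fatha = show_mark("\u064E", SPACE)       # over a plain space
maddah = show_mark("\u0653")

for sample in (superscript_alef, fatha, maddah):
    print(sample, [f"U+{ord(c):04X}" for c in sample])
```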

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-12 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, Linguasoft wrote:

 No other keyboard I know for extended Arabic languages provides keytop
 positions for standalone versions of maddah, hamzah above and hamzah
 below, although it might make sense to use these keys as deadkeys to
 type compounded glyphs alef-madda, alef-hamza, waw-hamza, etc., in order
 to have keytop positions that are presently occupied by these compounds
 free for other characters or symbols.

That is a limitation of the software you are using. These are combining 
ones.

 For comparison, European keyboards or the US-International keyboard also
 do not include standalone versions of all accents, and use many keys
 (accent keys and others) with a deadkey function to generate accented
 characters.

Just for the record, I oppose any deadkey mechanism for any Arabic
script keyboard layout. The notion is rather complicated, and is only 
familiar to the Europeans. Asian people are used to live keys instead, the 
ones that appear after the letter.

roozbeh



RE: [farsiweb] New keyboard layout for Windows

2003-06-12 Thread Roozbeh Pournader
On Thu, 12 Jun 2003, C Bobroff wrote:

 Just over a space is fine but the font should be able to render it and the
 fontmakers don't always know what all people may want to type.

That's some other matter.

 If the fontmakers see it's a character on the keyboard, they might make
 an isolated form.

There is no need for an isolated form. The rendering engine (the program 
that puts the glyphs in the font on the screen next to each other) is 
supposed to render that.

 Then if the user can type anything and everything desired, great stuff
 can be written in Persian and we can stop this jpeg/gif/latin
 transliteration business!

We are also trying to get there. Only the details in the path we choose
are a little different.

 Best to make it as easy as possible to type everything!

Depends on how you define easy. Try!

roozbeh



[farsiweb] Testing, please ignore

2003-05-31 Thread Roozbeh Pournader

This is a test to check that the lists are now up and running. Please
ignore.

roozbeh



[farsiweb]Announcement: New Unicode Savvy Logo (fwd)

2003-05-27 Thread Roozbeh Pournader

FYI.

-- Forwarded message --
Date: Tue, 27 May 2003 13:49:24 -0700
From: Magda Danish (Unicode) [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Announcement: New Unicode Savvy Logo

Dear Unicoders,

Very often the Unicode Consortium has received requests from webmasters
who wished to indicate with a logo or banner that their site supports or
uses Unicode. For such purposes we have developed two logos that can be
freely displayed on web sites. You can use a Unicode Savvy logo to
indicate that a page (or collection of pages) is encoded in Unicode. 
To learn more and to obtain an image of these logos, please refer to
http://www.unicode.org/consortium/unisavvy.html.

Thank you and best regards,

Magda Danish
Administrative Director
The Unicode Consortium
650-693-3921




Re: [farsiweb]Kurdish Language

2003-04-05 Thread Roozbeh Pournader
On Fri, 4 Apr 2003, Shervin Afshar wrote:

I believe that Kurdi language has not a written
 form and it uses farsi script.

No, you're wrong. It indeed has a written form and has some special
letters only used in Kurdish. I can't point to a specific resource (I am 
indeed searching for experts), but I have seen Kurdish books.

roozbeh



[farsiweb]Unicode Advertisement

2003-04-05 Thread Roozbeh Pournader

Hong Kong [Special Administrative Region] government is advocating
ISO 10646 (a.k.a. Unicode) by creating flash animations:

http://www.info.gov.hk/digital21/eng/images/cli/iso.swf

Funny!

roozbeh



[farsiweb]Unicode character names (was Re: Unicode Advertisement)

2003-04-05 Thread Roozbeh Pournader
On Sat, 5 Apr 2003, C Bobroff wrote:

 If unicode is so scrupulously attentive to details of standardization,
 why is the naming scheme so haphazard?

Because of very tight constraints set by ISO, and a requirement of ISO
that the names stay the same forever, even if mistakes are found in them.
Standards need to guarantee stability to some degree in order to be
implemented, and character names looked like one of the promising cases.

 The names of the Arabic letters are based on their Arabic *colloquial*
 names (yeh instead of ya, heh instead of ha).

The naming system is here because of the need for backward compatibility.  
Actually, ISO defines two characters in two different standard character
sets to be the same thing if their name is exactly the same. So, all the
character names got inherited automatically from ISO 8859 series of
standards. ISO 8859-6 (Arabic/Latin) used those names (I don't know why),
so ISO 10646 and Unicode inherited them directly.

 For example, Arabic letter Farsi Yeh. And the use of Farsi hasn't
 been fixed after all the learned debates??

Once again, the letter was named like that in some old ISO standard about
extended Arabic letters, and the name stuck.

ISO and the Unicode Consortium both use "Persian" when they refer to the
name of the language. "Farsi" is sometimes used in parentheses, to tell
those who don't know about the politics involved that these are
the same thing.
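
The frozen names discussed above are easy to inspect, since they ship with most programming languages. A small Python sketch using the standard `unicodedata` module; the choice of sample characters is ours.

```python
import unicodedata

SAMPLES = ("\u064A", "\u06CC", "\u0647", "\u0649")

for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Names are unique identifiers, so the lookup also works in reverse.
farsi_yeh = unicodedata.lookup("ARABIC LETTER FARSI YEH")
print(f"U+{ord(farsi_yeh):04X}")
```

Because the names can never change, a program written against them keeps working across Unicode versions, whatever one thinks of "YEH" or "FARSI".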

roozbeh



Re: [farsiweb]Using persian in the website

2003-03-25 Thread Roozbeh Pournader
On Tue, 25 Mar 2003, M wrote:

 The problem is that I am a anti-microsoft prophet :-)

Oh, you want Linux software?

Red Hat 8.0's GNOME editor (gedit) and KDE editors (Kate and Kedit?)
support Persian editing (with a few bugs), and so does yudit
(http://www.yudit.org/). Latest vim also supports Unicode, but with
Left-to-right Persian/Arabic.

roozbeh



Re: [farsiweb]YEH problem decision [OFFTOPIC]

2003-03-19 Thread Roozbeh Pournader
On Tue, 18 Mar 2003, Masoud Sharbiani wrote:

 You can't revoke a license just by not providing the software anymore.

 Yes you can. That was exactly what happened to OEone, where FSF 
 prohibited us from providing bash, glibc (and therefore all of our 
 distro, since we used glibc) because of a breach we did from GPL (it is 
 now resolved).

FSF people didn't revoke a license themselves. The license automatically
got revoked because of the breach. This is mentioned in clause 4 of GPL:

  4. You may not copy, modify, sublicense, or distribute the Program 
  except as expressly provided under this License. Any attempt otherwise 
  to copy, modify, sublicense or distribute the Program is void, and will 
  automatically terminate your rights under this License. However, parties 
  who have received copies, or rights, from you under this License will 
  not have their licenses terminated so long as such parties remain in full 
  compliance.

 So, in practice what happens is that MSFT's lawyers would contact the
 owner of that SF project, and tell them to 'cease and desist' since
 still they own the distributed material, and they can change terms of
 service, much like a leased car, or apartment. Just like that.

I can't agree. Not if the SF guys follow the license (OEone had not
followed the license). But I agree that the SF guys may stop distributing
the fonts if MSFT starts to threaten them a little.

roozbeh



Re: [farsiweb]YEH problem decision

2003-03-14 Thread Roozbeh Pournader
On Fri, 14 Mar 2003, C Bobroff wrote:

 Speaking of Windows, I have heard unsubstantiated reports that even with
 WinXP, the yeh (06CC--on the d key) works great in some editing
 programs such as Word but even on the same computer, fails to work in
 other programs.

That will be correct if you replace "in other programs" with "when using
other fonts". The Yeh problem is font-specific, and if I recall correctly,
some of Windows XP's Arabic fonts still have that bug.

 Also, it's true that the fonts have been corrected concerning the yeh.
 However, the people in the Microsoft Typography dept who have removed the
 free font downloads don't seem to know that there is no way to legally
 obtain the corrected fonts (unless they upgrade.)

They may not know, but there *is* a way to legally obtain the corrected
fonts. For an example, please see:

http://corefonts.sourceforge.net/

 Just curious: What if you are searching for an Arabic  word but you are
 inputting your search in FA mode (maybe you are more familiar with the
 keyboard layout, etc.)  Since some of the yeh forms in Arabic and Persian
 are identical (in appearance) but have different unicode values, will this
 not adversely affect your search?

This will affect your search: it will also find all the other Yehs. In
case this is not desired, you should be able to turn off this magic as an
option, like the case-sensitivity option in many search boxes.
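
The optional Yeh folding described above could be sketched as follows. This
is only an illustration, not how any particular search box works: the
function name find, the fold_yeh flag, and the decision to fold ALEF
MAKSURA along with ARABIC YEH are all assumptions made for the example.

```python
# Hypothetical helper: map the Yeh look-alikes to one code point before
# comparing, so a query typed in FA mode also matches Arabic spellings.
YEH_FOLD = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA -> FARSI YEH (assumption)
})

def find(haystack, needle, fold_yeh=True):
    """Return the index of needle in haystack, or -1 if it is absent."""
    if fold_yeh:
        # The folding is one code point to one code point, so indexes
        # into the folded text are also valid for the original text.
        haystack = haystack.translate(YEH_FOLD)
        needle = needle.translate(YEH_FOLD)
    return haystack.find(needle)

text = "\u0639\u0644\u064A"   # a word stored with ARABIC LETTER YEH
query = "\u0639\u0644\u06CC"  # the same word typed with FARSI YEH

print(find(text, query))                  # folding on: found at index 0
print(find(text, query, fold_yeh=False))  # folding off: -1
```

Passing fold_yeh=False corresponds to switching the magic off, much like
ticking a "match case" box.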

roozbeh



[farsiweb]Re: kbdfa.dll in XP

2003-02-16 Thread Roozbeh Pournader

Dear Mr Rezaei,

We don't support Windows XP. You can test it for yourself to see if it
works for you (some people could make it work with XP and some others
couldn't). About the p == ~ thing, FarsiWeb's KBDFA.DLL is based on
Iranian national standard ISIRI 2901. You can read ISIRI 2901 at:

http://www.isiri.org/std/2901.htm

-- 
The FarsiWeb Project Group
http://www.farsiweb.info/
mailto:[EMAIL PROTECTED]

On Sun, 16 Feb 2003, mehdi rezaei wrote:

 I downloaded KBDFA.DLL from FarsiWeb.
 
 Can I type Farsi (with the Farsi keyboard, p == ~) in Windows XP?
 
 (Can this DLL also be used in Windows XP?)





[farsiweb]Test

2003-01-27 Thread Roozbeh Pournader

This is just for testing purposes. Please ignore this.

roozbeh




[farsiweb]ISIRI 6219:2002

2002-11-19 Thread Roozbeh Pournader

We just got our hands on the published copy of the Iranian National
Standard

  ISIRI 6219:2002  Information Technology -- Persian Information 
  Interchange and Display Mechanism, using Unicode

It is dated November 2002, and is viii+33 pages long.

The standard is mainly a set of guidelines for encoding Persian texts in
Unicode, and tries to resolve some of Unicode's ambiguities in handling the
Persian language and the Arabic script locally.

An unofficial online version (which is exactly the standard minus its
cover) is available from:

   http://prdownloads.sourceforge.net/farsitools/finalversion.pdf?download

A paper copy may be acquired at the price of 4125 IRR from:

  Institute of Standards and Industrial Research of Iran
  PO Box 31585-163
  Karaj, IRAN
  Fax: +98 (261) 280-7045

Roozbeh Pournader,
for the FarsiWeb Project Group




Re: [farsiweb]Re: unicode fields in database

2002-11-12 Thread Roozbeh Pournader
On Sun, 10 Nov 2002, C Bobroff wrote:

 Instead of someone new asking every 3 months "how can I sort Persian?",

The whole problem is that people post questions without looking at the
archives or searching Google. They treat the Internet as an oracle
instead of a knowledge base.

 why doesn't one of you who is technically astute please compile all these
 sorting patches and put them in one place with a description of bugs and
 features of  each and some step-by-step instructions on how to implement
 them?

Sure, that would be good. I can put any such HOWTO on the FarsiWeb web
site, provided it matches the style of our other HOWTOs there and doesn't
recommend anything insane (read: very non-standard).

 The younger generation is going to get accustomed to looking at the
 end of the alphabet for certain  letters and not be sure whether heh or
 vav comes first so better hurry!

You're exaggerating. That won't happen in Iran. The younger generation
is clearly taught the main order. It's the secondary order (Hamzas,
Teh Marbuta, ...) that is still under debate.

roozbeh




Re: [farsiweb]unicode fields in database

2002-11-10 Thread Roozbeh Pournader
On Sat, 9 Nov 2002, FK wrote:

 All these problems with the Arabic script make me a believer in
 changing our alphabet to a Latin-based one and getting rid of all these
 unnecessary headaches :-)

The whole point is: all other languages have similar problems, including
those written in Latin. The main difference is usually that there is not
enough of a market for Persian support in software.

Just some examples: Swedish sorts 'v' and 'w' as one letter. Turkish and
Azeri have different capitalization rules than the other languages, since
they have two different 'i's. Slovak treats 'ch' as a single letter sorted
between 'h' and 'i'. Hungarian sorts 'ccs' as if it were 'cscs'. French
needs a ligated 'oe' letter, which does not even exist in ISO-8859-1, ...

IMHO, the Latin script will not solve the problems; it just adds another
dimension of complexity.

roozbeh




Re: [farsiweb]unicode fields in database

2002-11-09 Thread Roozbeh Pournader
On Fri, 8 Nov 2002, Nasiri2 wrote:

 Do you use Persian unicode fields in any database? How do you sort these
 fields while the letters Gaf,Cheh,Peh, and Zheh are not in correct
 order?

Your database or your programming language should provide the sorting
mechanism, or you should implement sorting yourself (as a database engine
plug-in if it supports it, or in the programming language if it doesn't).
We already have solutions for C, PHP, and PostgreSQL, but all are
Linux-only, or possibly Cygwin (port of GNU tools to Windows).
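
For the "implement sorting yourself" route, a minimal sketch in Python might
look like the following. It is not one of the C, PHP, or PostgreSQL
solutions mentioned above. The names PERSIAN_ORDER and persian_key are made
up for the example, and it covers only the 32 primary letters; anything
else (Hamza forms, Teh Marbuta, Arabic Yeh and Kaf, digits, ...) is simply
placed after them by code point, which is an arbitrary choice.

```python
# The 32 letters of the Persian alphabet in dictionary order, from ALEF
# to FARSI YEH (PEH, TCHEH, JEH and GAF are in their Persian positions).
PERSIAN_ORDER = (
    "\u0627\u0628\u067E\u062A\u062B\u062C\u0686\u062D"
    "\u062E\u062F\u0630\u0631\u0632\u0698\u0633\u0634"
    "\u0635\u0636\u0637\u0638\u0639\u063A\u0641\u0642"
    "\u06A9\u06AF\u0644\u0645\u0646\u0648\u0647\u06CC"
)
RANK = {ch: i for i, ch in enumerate(PERSIAN_ORDER)}

def persian_key(word):
    """Sort key: rank of each letter; unlisted characters sort last."""
    return [(RANK.get(ch, len(RANK)), ord(ch)) for ch in word]

words = [
    "\u06AF\u0644",        # starts with GAF
    "\u0644\u0628",        # starts with LAM
    "\u067E\u062F\u0631",  # starts with PEH
    "\u062A\u0627\u0628",  # starts with TEH
]

by_code_point = sorted(words)                # TEH, LAM, PEH, GAF
by_alphabet = sorted(words, key=persian_key) # PEH, TEH, GAF, LAM
print(by_code_point)
print(by_alphabet)
```

The first list shows the complaint in the question: sorted by raw code
point, Peh and Gaf land after Lam. The second uses the key function and
puts them where a Persian dictionary would.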

 How do you represent these fields in HTML or ASP pages
 (converting them to HTML entity code dynamically, saving their entity
 codes, or whatever else)?

As far as I can tell, HTML doesn't have any notion of fields. If you 
mean the letters Peh, Gaf, Farsi Yeh, etc, they can be referenced in HTML 
pages as "&#x" + hexcode + ";". Farsi Yeh becomes &#x06cc;, for example.
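
If the references are to be generated dynamically, a small sketch such as
this one could do it. The function name to_char_refs is an assumption made
for the example; html.unescape is from the Python standard library and is
used here only to check the round trip.

```python
import html

def to_char_refs(text):
    """Replace each non-ASCII character by a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:04x};".format(ord(ch))
        for ch in text
    )

farsi_yeh = "\u06CC"
ref = to_char_refs(farsi_yeh)

print(ref)                              # &#x06cc;
print(html.unescape(ref) == farsi_yeh)  # True: a parser maps it back
```

Python can also produce the decimal form directly with
text.encode("ascii", "xmlcharrefreplace"), which browsers treat the same
way.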

roozbeh




Re: [farsiweb]SOS

2002-11-09 Thread Roozbeh Pournader
On Fri, 8 Nov 2002, Mohsen Pahlevanzadeh wrote:

 But I don't know how to type Farsi in KDE or GNOME.

Use Red Hat 8.0's KDE and GNOME. GNOME's 'gedit' supports Persian, as do
some of the KDE text editors (I can't remember which). The Persian
keyboard layout is also included in Red Hat 8.0 if you install it, but you 
may need to install fonts of your own, which you may get from your Windows 
fonts directory, or download some from:

http://corefonts.sourceforge.net/

I also have a post about enabling the Persian keyboard in Red Hat 7.3's 
KDE at:

http://lists.sharif.edu/pipermail/farsikde/2002-June/000369.html

But I don't know if exactly the same procedure will work with Red Hat 8.0.

roozbeh





Re: [farsiweb]basic question

2002-07-02 Thread Roozbeh Pournader

On Tue, 2 Jul 2002, Saber XP wrote:

 Thanks for your reply.
 Why do I see your presentation displayed with an error in the medial YEH?
 What's the problem with my font(s)? Do I have to update them?
 (see attachment)

Your fonts have bugs with Persian Yeh then. Try these:

http://www.farsiweb.info/font/parsa.zip

(Note that these are not standard compliant, they only fix the Persian Yeh
bug.)

   1- What's the difference between U+06CC and U+0649? Which one is the
   default for YEH?
 
  U+06CC is the Persian Yeh. U+0649 is the Arabic Yeh that always has two
  dots under it, even in final and isolated forms.
 But both of them are dotless in the Tahoma font; only U+064A has dots.

Oh, my mistake! U+0649 is the Arabic Alef Maksura, and it should be
dotless in all its forms (appearing as a single dandaane in medial and
initial forms). Until Windows 2000, Microsoft's OpenType engine treated
U+0649 as a right-joining letter only. I don't know exactly how Windows XP
and Office XP behave.
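
A quick way to keep the three code points apart is to ask the Unicode
character database for their names. The snippet below uses Python's
standard unicodedata module; it only prints names and says nothing about
how a given font or layout engine draws the letters.

```python
import unicodedata

# The three code points discussed in this thread.
names = {cp: unicodedata.name(chr(cp)) for cp in (0x06CC, 0x0649, 0x064A)}

for cp, name in names.items():
    print("U+{:04X} {}".format(cp, name))
# U+06CC ARABIC LETTER FARSI YEH
# U+0649 ARABIC LETTER ALEF MAKSURA
# U+064A ARABIC LETTER YEH
```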

   3- Is Tahoma the only standard font that supports Unicode?
 
  That depends on how one defines a standard font.
 I mean the fonts that are built into Windows, not ones like NESF. And
 thanks for letting me know that Courier supports Unicode.
 What about Arial, and the medial YEH problem in it?

These Microsoft fonts from Windows 2000 support the Persian letters of
Unicode: Times New Roman, Arial, Courier New, Tahoma, Traditional Arabic,
Arabic Transparent, Arial Unicode. That doesn't mean they are free of
bugs, or that their support is complete or standard.

But they are more standard than Nesf. Nesf is only a temporary hack.

 I want to know whether, apart from TrueType fonts, there are any other
 kinds of fonts on the Windows platform.

Actually, OpenType is a superset of TrueType, and the politically correct 
thing is to say OpenType from now on: TrueType is a dead specification. 
But Windows also supports bitmap fonts (.FON), which can't support Persian 
Unicode. You can also get PostScript fonts if you install Adobe ATM, but 
PostScript fonts are worse for Persian.

 you may answer this email in mailing list, I avoid sending it to mailing
 list because there is an attachment.

Did so. :)

roozbeh




Re: [farsiweb]FARSI HEH WITH YEH ABOVE (THE THING)

2002-06-18 Thread Roozbeh Pournader

On Tue, 18 Jun 2002 [EMAIL PROTECTED] wrote:

 So, I would like to comply with the current standard, but am not able to
 find U+0654 in any of the Unicode fonts that I know, which are Arial
 Unicode MS, Tahoma, and Microsoft Sans Serif. Would someone please let me
 know of a standard Unicode font that will provide me with such a char?

Unfortunately I don't have a list, but we are working on some fonts which
will definitely support it. I doubt, however, that you will be able to get
them working on Windows 2000, as its layout engine supports only
Unicode 2.0. I am working on finding a mechanism for replacing the engine
with a newer one, also from Microsoft, that supports the HAMZA ABOVE
character.

Please note that trying to be standard-compliant on a platform with broken
support, and without being able to fix it, is very hard. My recommendation
is to use whatever suits your users best, if you can't fix the
underlying platform or write a layout engine for your application. Having
a piece of *working* software is usually better than having a broken
standard-compliant one.

roozbeh




Re: [farsiweb]heh + hamzeh

2002-06-02 Thread Roozbeh Pournader

On Sun, 2 Jun 2002, C Bobroff wrote:

  Ok, it seems that we are seeing a lot of monologues here.
 I'm sure more people than just me are finding the monologues educational

I just wish to emphasize that I have seen repetitions of the same concern. 
And I can't forget some of us being referred to as dictators or the like. 
(At least as people who try to impose their ideas on others.)

Regarding the dictatorship things, I wish to emphasize that the matter of
Heh+Hamza was also discussed at the ISIRI meeting for approval of the
standard, and all of the experts agreed or got convinced. The list
includes Dr Mostafa Asi (A computational linguist also working with
Farhangestan), Mr Ebrahim Mashayekh (President of Informatics Society of
Iran), Dr Mohammad Ghodsi (Project Leader of FarsiTeX), Mr Mohammad
Azadnia (Technical manager of Persian project at Iran Communication
Research Center), Mr Arash Rezaiizadeh (one of the entrepreneurs of Windows
Farsification), Mr Arash Zeini (President of Chapar Shabdiz, the first
Iranian Free Software company, also of FarsiKDE fame), and Mr Hashemi (Gam
Electronic's Persian Expert).

All other known experts, if present in Iran, were invited, but some could
not attend: this includes people like Dr Mohammad San'ati of SinaSoft
fame, whom Behdad and I met personally after the meeting, to make sure he
does not have major objections.

I can't understand who Abi was referring to when she or he wrote "Next I
expect we will be told how to comb our hair. [...] They have nothing to
offer to the Persian IT and language discussion." Was she or he referring
to me, or to Mr Khanban? (We are both members of the technical committee
of the standard you heard a lot about.) To say the least, neither Mr
Khanban nor I have anything to hide about what we have done for the
Persian IT world: just search Google for "Khanban" or "Pournader". We both
use our real and full names, and have done everything publicly. But who is
Abi Lover?

Also, quoting Abi's exact words, she or he is against any standardization:
"There are some people [...] who think that they have a duty to lay down
rules for other people to follow." The Unicode Consortium is doing this.
ISO is doing this. W3C is doing this. Many software companies, from
Microsoft to SinaSoft, also do this, by creating things that will become de
facto standards. You are not obliged to follow standards, but you will run
into trouble if you don't: no one will be able to use your software with
other software.

 Roozbeh, can you please tell us about this normalization and why
 the mention of Persian is to be removed from this character?

Sure. I have explained the problem a number of times, and I will explain
it again:

There is a notion in Unicode called Normalization. You can read about it
at http://www.unicode.org/unicode/reports/tr15/. If you don't have the
time, here it is in brief: Unicode is not just for displaying text but
also for processing it, and it sometimes offers different alternatives
for encoding the same text, so you need some mechanism to find out that
two strings of characters are actually the same.

One example is the equivalence of U+0624 ARABIC LETTER WAW WITH HAMZA
ABOVE with the sequence U+0648, U+0654, which is ARABIC LETTER WAW,
ARABIC HAMZA ABOVE. The algorithm is intelligent enough to detect the
equivalence even if you put a FATHA between the WAW and the HAMZA, so 
"WAW WITH HAMZA ABOVE, FATHA" will be equal to both "WAW, FATHA, HAMZA" 
and "WAW, HAMZA, FATHA".
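
The WAW example can be checked with any normalization library. The sketch
below uses Python's standard unicodedata module and normalization form NFC;
the variable names are only for illustration.

```python
import unicodedata

WAW_WITH_HAMZA = "\u0624"  # ARABIC LETTER WAW WITH HAMZA ABOVE
WAW = "\u0648"             # ARABIC LETTER WAW
FATHA = "\u064E"           # ARABIC FATHA
HAMZA_ABOVE = "\u0654"     # ARABIC HAMZA ABOVE

def nfc(s):
    """Normalize s to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)

a = WAW_WITH_HAMZA + FATHA
b = WAW + FATHA + HAMZA_ABOVE
c = WAW + HAMZA_ABOVE + FATHA

# As raw code point sequences, the three strings all differ ...
print(a == b, a == c, b == c)        # False False False
# ... but after normalization they are one and the same string.
print(nfc(a) == nfc(b) == nfc(c))    # True
```

This is why software is expected to normalize before comparing: a plain
code point comparison would treat the three spellings as different text.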

This equivalence is very important for security and for the proper
functioning of software, but I won't get into the details. To say the
least, it is an important part of the two most awaited standards, both
still drafts: Internationalized Domain Names,

http://www.ietf.org/internet-drafts/draft-ietf-idn-idna-09.tx

where applications MUST do normalization before doing name lookup for 
a non-ASCII domain name, and Character Model for the World Wide Web,

http://www.w3.org/TR/charmod/

where all web authoring or web content generation software is REQUIRED to
normalize the text of a web document before putting it on the wire.

Getting back to our U+06C0 ARABIC LETTER HEH WITH YEH ABOVE, this 
letter is specified to be equal to the sequence U+06D5, U+0654, which is 
ARABIC LETTER AE, ARABIC HAMZA ABOVE. This AE thing is a letter similar to 
HEH in shape, but only used in Final and Isolated forms, something like 
U+0629 ARABIC LETTER TEH MARBUTA but without the dots. (I think that 
everyone agrees that this AE letter has no place in Persian.)
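
The decomposition of U+06C0 can be inspected in the same way. This sketch
again relies on Python's standard unicodedata module, with names chosen
only for the example; it shows that the precomposed letter is tied to AE,
not to the ordinary HEH (U+0647).

```python
import unicodedata

PRECOMPOSED = "\u06C0"     # ARABIC LETTER HEH WITH YEH ABOVE
AE_HAMZA = "\u06D5\u0654"  # ARABIC LETTER AE + ARABIC HAMZA ABOVE
HEH_HAMZA = "\u0647\u0654" # ARABIC LETTER HEH + ARABIC HAMZA ABOVE

def nfc(s):
    """Normalize s to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)

# The character database lists AE + HAMZA ABOVE as the decomposition.
print(unicodedata.decomposition(PRECOMPOSED))  # 06D5 0654

print(nfc(AE_HAMZA) == PRECOMPOSED)   # True: canonically equivalent
print(nfc(HEH_HAMZA) == PRECOMPOSED)  # False: HEH + HAMZA stays as it is
```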

Now let's consider the real situation: one likes to encode this ezaafe  
thing. He may look at the charts, and he will either choose U+06C0, or
U+0647, U+0654 (HEH, HAMZA ABOVE), based on his preference for
precomposed or decomposed forms. Let me say that you choose the first,
and I choose the second. The sad point will be that no Unicode compliant
application will be able to tell you that these string are 

Re: [farsiweb]Re: Farsi heh + hamzeh

2002-05-30 Thread Roozbeh Pournader

On Thu, 30 May 2002, Abi Lover wrote:

 There are some people in the Persian IT and linguistics debate who think 
 that they have a duty to lay down rules for other people to follow. At first 
 we were told how to write the ezafeh. Now we have been told how to write 
 the hamzeh. Next I expect we will be told how to comb our hair. Such 
 people should go and offer themselves up as a candidate for the vela^yate 
 faqih. They have nothing to offer to the Persian IT and language discussion.

Would you please be more explicit, and provide a list of names?

roozbeh




Re: [farsiweb]ask for unicode

2002-05-21 Thread Roozbeh Pournader

On Mon, 20 May 2002, mohammad mohebbi wrote:

 I store Farsi names in a field of a SQL Server table. If I sort this
 field, will it be sorted in Farsi order?

As far as I know, the answer is no. Arash Rezaiizadeh and I are trying to 
convince Microsoft to add a Persian sorting table, but until then, you 
will have to sort Persian with your own code.

roozbeh




[farsiweb] Opting out

2002-04-21 Thread Roozbeh Pournader


Just to mention that I have unsubscribed from some international
mailing lists related to I18n, so I may not be able to monitor all
discussions there. It would be great if interested people here could
subscribe to some of them and tell the related Iranian list (farsiweb,
persiancomputing, linuxiran, farsikde) if they come across something
important (or even forward the email to me, if you think I can provide a
useful comment). The current list is:

 * [EMAIL PROTECTED] (Translations for Red Hat)

 * [EMAIL PROTECTED] (I18n for development version of 
   Mandrake)

 * [EMAIL PROTECTED] (Tavultesoft Keyman for making keyboard 
   layouts)
 
 * [EMAIL PROTECTED] (MathML in Mozilla)

roozbeh




[farsiweb] Standard keyboard DLL for XP

2002-04-09 Thread Roozbeh Pournader


FYI.

-- Forwarded message --
Date: Tue, 2 Apr 2002 17:11:34 +1000
From: Saied Tahaghoghi [EMAIL PROTECTED]
To: The FarsiWeb Project [EMAIL PROTECTED]
Subject: Standard keyboard DLL for XP

[...]

I previously used your keyboard DLL for win2k;  yesterday I tried it
out on XP-Pro, with success.

The steps are pretty much identical, with the exception of Step 9:


Copy the file to C:\Windows\System32

[Confirm File Replace]
This folder already contains a file named 'kbdfa.dll'
Would you like to replace the existing file ...?

- Select Yes

[Windows File Protection]
Files that are required for Windows to run properly have been replaced
by unrecognized versions.  To maintain system stability, Windows must
restore the original versions of these files.

Insert your Windows XP Professional CD-ROM now.

- Select Cancel

[Windows File Protection]

You chose not to restore the original versions of the files.  This may
affect Windows stability.  Are you sure you want to keep these
unrecognized file versions?

- Select Yes


Yours,
Saied.
