Re: PersianComputing Digest, Vol 12, Issue 34

Eva Braiman Wed, 02 Jun 2004 07:01:54 -0700

Dear friends,

Thanks so much for the posting on Persian word processing for the Mac. I have had great success using all the tips I have gotten from this list.

For example, I was able to salvage a Farsi manuscript with the following steps:

1. Take a DOC file created on an old Windows 98 (Arabic) machine running Parsa 99 and Zarnigar 97, and copy it to the Mac (OS X 10.3.4).
2. Open the DOC file in Word (where the characters turn to gibberish) and save the file as an RTF.
3. Using Mellel, with the Persian ISIRI keyboard and B Yagut font, import the RTF file. It looks nearly perfect (a few yeh and alignment problems, but no big deal)!

Yay!

Eva Braiman

On May 26, 2004, at 3:30 AM, [EMAIL PROTECTED] wrote:

Send PersianComputing mailing list submissions to
        [EMAIL PROTECTED]
To subscribe or unsubscribe via the World Wide Web, visit
        http://lists.sharif.edu/mailman/listinfo/persiancomputing
or, via email, send a message with subject or body 'help' to
        [EMAIL PROTECTED]
You can reach the person managing the list at
        [EMAIL PROTECTED]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of PersianComputing digest..."
Today's Topics:
   1. Re: LeapYears of Iranian Calendar (Roozbeh Pournader)
   2. Re: LeapYears of Iranian Calendar (Roozbeh Pournader)
   3. RE: Miscellaneous web issues (Ehsan Akhgari)
   4. RE: Miscellaneous web issues (Ehsan Akhgari)
   5. RE: Miscellaneous web issues (Roozbeh Pournader)
   6. RE: Miscellaneous web issues (Roozbeh Pournader)
   7. Re: Mac info for Persian (C Bobroff)
   8. RE: Miscellaneous web issues (Behdad Esfahbod)
----------------------------------------------------------------------
Message: 1
Date: Tue, 25 May 2004 15:48:40 +0430
From: Roozbeh Pournader <[EMAIL PROTECTED]>
Subject: Re: LeapYears of Iranian Calendar
To: "Ordak D. Coward" <[EMAIL PROTECTED]>
Cc: PersianComputing <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain
On Tue, 2004-05-25 at 01:40, Ordak D. Coward wrote:
I downloaded and tested a few dates with the Win32 executable of
Jalali (the one at SourceForge). The bad news is that the conversion
is not correct.
The conversion is wrong for 20 March 2005, and similarly a few other
dates that should convert to 30 Esfand Year YYLP, instead all such
dates convert either to 1 Farvardin YYLP or 1 Esfand YYLP, depending
on how the date is set to 20 March 2005. The good news is that the
jalali.c source does convert such dates correctly.
Thanks for telling us. We forgot to update the MS Windows executable.
roozbeh
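
For readers who want to check dates like the one reported above, here is a minimal Python sketch of a purely arithmetic Gregorian-to-Jalali conversion built on a 33-year leap cycle. It is an illustration only: the function name, the constants, and the choice of the 33-year rule are assumptions of this sketch, not a description of what jalali.c actually does, and the rule is only an approximation of the official calendar. The sketch also assumes Gregorian dates on or after 20 March 1600.

```python
# Sketch: arithmetic Gregorian -> Jalali conversion using a 33-year cycle.
# All names and constants here are illustrative assumptions.

G_DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
J_DAYS_IN_MONTH = [31, 31, 31, 31, 31, 31, 30, 30, 30, 30, 30, 29]


def gregorian_to_jalali(g_year, g_month, g_day):
    """Return (year, month, day) in the Jalali calendar."""
    gy = g_year - 1600
    gm = g_month - 1
    gd = g_day - 1

    # Days elapsed since 1 January 1600, counting Gregorian leap days.
    g_day_no = 365 * gy + (gy + 3) // 4 - (gy + 99) // 100 + (gy + 399) // 400
    g_day_no += sum(G_DAYS_IN_MONTH[:gm])
    is_leap = (g_year % 4 == 0 and g_year % 100 != 0) or g_year % 400 == 0
    if gm > 1 and is_leap:
        g_day_no += 1  # past February in a leap year
    g_day_no += gd

    # Shift to the Jalali epoch used by this sketch (1 Farvardin 979).
    j_day_no = g_day_no - 79

    # 12053 days = one 33-year cycle containing 8 leap years.
    cycles, j_day_no = divmod(j_day_no, 12053)
    jy = 979 + 33 * cycles + 4 * (j_day_no // 1461)
    j_day_no %= 1461  # position inside a 4-year block (366 + 3 * 365 days)

    if j_day_no >= 366:
        jy += (j_day_no - 1) // 365
        j_day_no = (j_day_no - 1) % 365

    # Walk through the months; Esfand absorbs the leap day when present.
    jm = 0
    while jm < 11 and j_day_no >= J_DAYS_IN_MONTH[jm]:
        j_day_no -= J_DAYS_IN_MONTH[jm]
        jm += 1
    return jy, jm + 1, j_day_no + 1


if __name__ == "__main__":
    print(gregorian_to_jalali(2005, 3, 20))
    print(gregorian_to_jalali(2005, 3, 21))
```

With this rule, 20 March 2005 lands on the last day of Esfand rather than on the first day of a month, which is the behaviour the report above expects from a correct conversion.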
------------------------------
Message: 2
Date: Tue, 25 May 2004 15:52:11 +0430
From: Roozbeh Pournader <[EMAIL PROTECTED]>
Subject: Re: LeapYears of Iranian Calendar
To: "Ordak D. Coward" <[EMAIL PROTECTED]>
Cc: PersianComputing <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain
On Tue, 2004-05-25 at 05:03, Ordak D. Coward wrote:
FarsiWeb should prepare -- if that is in the scope of FarsiWeb's work
-- a draft of a recommended practice for implementing date conversion
involving calendars used in Iran. This document will of course change
over time, as better conversion methods are derived.
This is in the interest of FarsiWeb, but we don't have the time currently. It seems that you have done some deep looking into the subject. Why don't you write it? I'm sure you can write it from a better perspective, and both the FarsiWeb staff and the PersianComputing community can provide you with comments.
Both FarsiWeb and Connie can provide hosting or links, I'm sure.
roozbeh
------------------------------
Message: 3
Date: Tue, 25 May 2004 17:43:47 +0430
From: "Ehsan Akhgari" <[EMAIL PROTECTED]>
Subject: RE: Miscellaneous web issues
To: <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
What is notepad? A text editor? Text editors should not insert a UTF-8
BOM either. The problem is that Microsoft sometimes invents
non-standard things and then pushes it so hard that Unicode adds it to
parts of the standard (or an FAQ). "Microsoft conventions for .txt
files" in the Unicode FAQ looks sarcastic to me.
Well, maybe you're right, but I don't see how a text editor is supposed to know the encoding of a file without some kind of mark. See, HTTP transfers the character set using the Content-Type response header. In HTML, it's specified with a <meta http-equiv="Content-Type" ...> tag. In XML, the default encoding is UTF-8, and if a document is encoded in another encoding, it must be specified in the <?xml ?> PI. Plain text files have no means of identifying the character encoding, so a single text file can be interpreted as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the exact character encoding used.

The point here is that protocols which do not allow a BOM are those that provide other means of specifying the character encoding. A certain byte stream can have multiple interpretations depending on what content encoding you use to interpret it, and there must be some way to cut off this confusion.
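
To make the "multiple interpretations" point concrete, here is a small Python sketch that decodes one and the same byte string under three encodings. The sample word and the helper name `interpretations` are assumptions chosen for illustration. Only the UTF-8 reading gives back the original Persian word; the other two decode without any error and simply produce different text.

```python
# Sketch: one byte stream, three readings. Names are illustrative only.

# The word "Farsi" in Persian script, written with explicit code points.
word = "\u0641\u0627\u0631\u0633\u06CC"
data = word.encode("utf-8")  # 5 characters become 10 bytes


def interpretations(raw):
    """Decode the same bytes under several single- and multi-byte encodings."""
    return {enc: raw.decode(enc) for enc in ("utf-8", "cp1256", "latin-1")}


views = interpretations(data)
for enc, text in views.items():
    # repr() keeps the output readable even for control characters.
    print(f"{enc:8} {len(text):2} chars  {text!r}")
```

None of the three decodes raises an exception, which is why the bytes alone cannot settle the question.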
YMMV,
-------------
Ehsan Akhgari
Farda Technology (http://www.farda-tech.com/)
List Owner: [EMAIL PROTECTED]
[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]
------------------------------
Message: 4
Date: Tue, 25 May 2004 17:43:47 +0430
From: "Ehsan Akhgari" <[EMAIL PROTECTED]>
Subject: RE: Miscellaneous web issues
To: <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Thanks for the links. Seems like a very handy keyboard.
BTW, why does the
Shift-Space combination not work?
Bug in Microsoft keyboard layout creation tool. Use "Shift-B"
temporarily.
Thanks.
I've not done any work in this arena, so what I propose here might make no sense. Sorry if that's so. But, the M$ page on the keyboard layout creation tool says the tool "simplifies" the process of creating a keyboard layout. Would there be any way to assign ZWNJ to Shift+Space by coding the keyboard layout tool manually? If you can send me the C/C++ source file off-list, I'll try to investigate it further.

If not, I guess Shift+B is not that bad either. The keyboard layout rocks, even without having Shift+Space in place. :-)
-------------
Ehsan Akhgari
Farda Technology (http://www.farda-tech.com/)
List Owner: [EMAIL PROTECTED]
[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]
------------------------------
Message: 5
Date: Tue, 25 May 2004 19:25:15 +0430
From: Roozbeh Pournader <[EMAIL PROTECTED]>
Subject: RE: Miscellaneous web issues
To: Ehsan Akhgari <[EMAIL PROTECTED]>
Cc: Persian Computing <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain
On Tue, 2004-05-25 at 17:43, Ehsan Akhgari wrote:
Would there be any way to assign ZWNJ to Shift+Space by coding the keyboard layout tool manually? If you can send me the C/C++ source file off-list, I'll try to investigate it further.
There is no C/C++ source file. The source is a data file that MSKLC compiles into the DLL. If the data file contains ZWNJ on shift-space, it fails to compile. Microsoft developers confirmed that this is a bug.
roozbeh
------------------------------
Message: 6
Date: Tue, 25 May 2004 19:33:50 +0430
From: Roozbeh Pournader <[EMAIL PROTECTED]>
Subject: RE: Miscellaneous web issues
To: Ehsan Akhgari <[EMAIL PROTECTED]>
Cc: Persian Computing <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain
On Tue, 2004-05-25 at 17:43, Ehsan Akhgari wrote:
Well, maybe you're right, but I don't see how a text editor is supposed to know the encoding of a file without some kind of mark.
Did Latin-1 (an old encoding of text files for Western Europe, also called ISO 8859-1) have a mark to distinguish it from, say, CP1256 (an old MS encoding for the Arabic language)? Did ASCII have a mark? No. Text files are text files. They are not supposed to have marks to distinguish their character set.
The character set of a text file should be in the metadata (file name,
file system, environment variable, HTTP header, MIME header, ...) or it
should be auto-detected (UTF-8 is really easy to detect, since it has a
very regular mathematical pattern, UTF-16 is also easy to detect, since
it's recommended that it has a BOM), or it should be specified by the
user when he is opening a file.
Plain text files have no means of
identifying the character encoding,
That is somewhat true. Plain text files *sometimes* have no means of
identifying the character encoding *by themselves*.
so a single text file can be interpreted as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the exact character encoding used.
UTF-7 is deprecated. UTF-16 and UTF-32 *do* have BOM marks in the
standards defining them, so it's OK if they use a BOM. UTF-8 doesn't
have that. Nor does ASCII, CP1256, Latin-1, etc.
The point here is that, protocols which do not allow BOM are those who
provide other means of specifying the character encoding.
The point is that Notepad doesn't add a mark to Latin-1 or CP1256, why
should it add one to UTF-8?!
A certain byte stream can have multiple interpretations depending on what content encoding you use to interpret it, and there must be some way to cut off this confusion.
Yes, by metadata, auto-detection, or specific selection.
roozbeh
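
The auto-detection route described in this message can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete detector: the function name `guess_encoding` and the choice of CP1256 as the legacy fallback are assumptions of the sketch. It checks for a BOM first, then relies on the fact that text in a legacy 8-bit encoding almost never happens to be valid UTF-8.

```python
import codecs

# BOMs to look for. UTF-32 must come before UTF-16, because the UTF-32 LE
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, fallback="cp1256"):
    """Guess a codec name for raw bytes: BOM, then UTF-8 validity, then fallback."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        # UTF-8 has a strict lead-byte / continuation-byte pattern, so a
        # successful strict decode is strong evidence of UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Hypothetical default for legacy Arabic-script files.
        return fallback


if __name__ == "__main__":
    sample = "\u0633\u0644\u0627\u0645"  # "salam"
    for enc in ("utf-8", "utf-16", "cp1256"):
        print(enc, "->", guess_encoding(sample.encode(enc)))
```

The fallback is the weak point: when the bytes are not UTF-8 and carry no BOM, the sketch cannot tell CP1256 from Latin-1, which is where metadata or an explicit choice by the user has to take over.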
------------------------------
Message: 7
Date: Tue, 25 May 2004 10:01:56 -0700 (PDT)
From: C Bobroff <[EMAIL PROTECTED]>
Subject: Re: Mac info for Persian
To: [EMAIL PROTECTED]
Message-ID:
        <[EMAIL PROTECTED]>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Thanks to three Mac users on this list, I was able to collect some basic info on Persian Mac computing here:
http://students.washington.edu/irina/persianword/mac.html
I hope this will fill an information gap for the users as well as provide a place in English for the Apple people to see that they do have a few Persian customers who are suffering from neglect. More feedback is encouraged and I'll update as needed.

I also took the opportunity to throw in a gratuitous, not-so-nice provocation which is aimed not only to excuse the Mac browsers for their shortcomings but also to encourage someone here to make a perfect, model testpage for Persian browsing. Since [most] webmasters want their site to actually work on all computers, we have no testpage. This is what I said:

"It is important to keep in mind that there are almost no real Persian websites one can use as a test for the browser. That is because most webmasters have dumbed down their site to make it work on Win9x and also to compensate for buggy fonts and general lack of complete Persian fonts. Therefore one rarely finds ZWNJ, Hamza above Heh, Persian numbers, small vowels, Persian Yeh, Persian Kaf, etc."
-Connie
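
As a rough aid for anyone building such a test page, the following Python sketch reports which of the characters named in the quotation actually occur in a piece of text. The function name `persian_features` and the exact code points chosen for each feature are assumptions of this sketch (for example, it looks for the combining Hamza Above, U+0654, and ignores precomposed forms), and it does not check for small vowels.

```python
# Sketch: report which Persian-specific characters a text contains.
# The feature table is an illustrative assumption, not a complete list.

FEATURES = {
    "ZWNJ": {"\u200C"},                 # ZERO WIDTH NON-JOINER
    "Persian Yeh": {"\u06CC"},          # ARABIC LETTER FARSI YEH
    "Persian Kaf": {"\u06A9"},          # ARABIC LETTER KEHEH
    "Hamza above": {"\u0654"},          # ARABIC HAMZA ABOVE (combining)
    "Persian digits": {chr(c) for c in range(0x06F0, 0x06FA)},
}


def persian_features(text):
    """Map each feature name to True if any of its characters occurs in text."""
    present = set(text)
    return {name: bool(chars & present) for name, chars in FEATURES.items()}


if __name__ == "__main__":
    # A word typed with Persian Yeh and ZWNJ ...
    persian_style = "\u0645\u06CC\u200C\u0631\u0648\u0645"
    # ... and a "dumbed down" spelling with Arabic Yeh and no ZWNJ.
    arabic_style = "\u0645\u064A\u0631\u0648\u0645"
    print(persian_features(persian_style))
    print(persian_features(arabic_style))
```

A page on which every entry comes back false would be an example of the dumbed-down sites the quotation describes.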
------------------------------
Message: 8
Date: Tue, 25 May 2004 16:36:37 -0400
From: Behdad Esfahbod <[EMAIL PROTECTED]>
Subject: RE: Miscellaneous web issues
To: Roozbeh Pournader <[EMAIL PROTECTED]>
Cc: Persian Computing <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: TEXT/PLAIN; charset=US-ASCII
so a single text file can be interpreted as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the exact character encoding used.
The whole point of defining UTF-8 this way has been to replace
ASCII transparently.  So if character sets need marks to identify
them, the only one that should not need a mark and should be the
default is UTF-8.
--behdad
  behdad.org
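
The transparency property mentioned here is easy to check in Python. The sketch below uses made-up sample strings and a hypothetical helper name, `same_bytes_in_ascii_and_utf8`. It shows that pure ASCII text encodes to identical bytes in both encodings, and that non-ASCII characters use only bytes with the high bit set, so they can never be mistaken for ASCII characters.

```python
# Sketch: UTF-8 as a transparent superset of ASCII. Sample data is made up.

ascii_text = "PersianComputing Digest, Vol 12"
persian_text = "\u0641\u0627\u0631\u0633\u06CC"


def same_bytes_in_ascii_and_utf8(text):
    """True if text is pure ASCII and encodes identically in both encodings."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        return False  # not representable in ASCII at all


# Every byte of a non-ASCII character is >= 0x80 in UTF-8.
high_bit_only = all(b >= 0x80 for b in persian_text.encode("utf-8"))

# In mixed text, the ASCII part survives byte-for-byte inside the UTF-8 data.
mixed = (ascii_text + " " + persian_text).encode("utf-8")
ascii_prefix_intact = mixed.startswith(ascii_text.encode("ascii"))

if __name__ == "__main__":
    print(same_bytes_in_ascii_and_utf8(ascii_text))
    print(high_bit_only, ascii_prefix_intact)
```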
------------------------------
_______________________________________________
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
End of PersianComputing Digest, Vol 12, Issue 34
************************************************

