RE: Miscellaneous web issues

2004-12-01 Thread Ehsan Akhgari
 Roozbeh, it is a long time and I don't remember your answer to this
 email. What happened to this new dll?

AFAIK, it still hasn't been put on SourceForge.  If you're interested, I can
mail it to you off-list.

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]



___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing


RE: Miscellaneous web issues

2004-06-05 Thread Roozbeh Pournader
On Thu, 2004-06-03 at 21:08, Ehsan Akhgari wrote:
 I did this, and installed the new DLL on my system, and it works beautifully.
 It's the same keyboard layout, only Shift+Space inserts a ZWNJ instead of a
 space.  I thought I would submit it to sourceforge so that everyone can use
 the new tool.  Roozbeh, let me know if it would be okay for me to send the
 files to you to get them into the sourceforge, or if I should do something
 else.

I would appreciate it if you sent me the exact process you used and the
DLL, so we can publish it on the FarsiWeb website on SourceForge.

roozbeh




RE: Miscellaneous web issues

2004-05-25 Thread Ehsan Akhgari
 What is notepad? A text editor? Text editors should not insert a UTF-8
 BOM either. The problem is that Microsoft sometimes invents
 non-standard things and then pushes them so hard that Unicode adds them to
 parts of the standard (or an FAQ). The Microsoft conventions for .txt
 files in the Unicode FAQ look sarcastic to me.

Well, maybe you're right, but I don't see how a text editor is supposed to
know the encoding of a file without some kind of mark.  See, HTTP transfers
the character set using the Content-Type response header.  In HTML, it's
specified with a <meta http-equiv="Content-Type" ...> tag.  In XML, the
default encoding is UTF-8, and if a document is encoded in another encoding,
it must be specified in the <?xml ?> PI.  Plain text files have no means of
identifying the character encoding, so a single text file can be interpreted
as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the
exact character encoding used.

The point here is that the protocols which do not allow a BOM are those that
provide other means of specifying the character encoding.  A given byte
stream can have multiple interpretations depending on which character
encoding you use to interpret it, and there must be some way to resolve this
confusion.
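
To make the ambiguity concrete, here is a minimal Python sketch. The sample
word and the variable names are illustrative assumptions, not something from
this thread. It decodes one byte sequence under three encodings and gets
three different strings:

```python
# One byte sequence, three interpretations.
# The sample word (Persian "salaam") is an arbitrary illustration.
data = "\u0633\u0644\u0627\u0645".encode("utf-8")  # 4 letters, 2 bytes each

as_utf8 = data.decode("utf-8")      # the text that was intended
as_cp1256 = data.decode("cp1256")   # read as the Windows Arabic code page
as_latin1 = data.decode("latin-1")  # read as ISO 8859-1

# All three decodes succeed; nothing in the bytes says which one is right.
print(len(data), repr(as_utf8), repr(as_cp1256), repr(as_latin1))
```

Neither of the single-byte decodes raises an error, which is why the bytes
alone cannot settle the question for those encodings.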

YMMV,
-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Miscellaneous web issues

2004-05-25 Thread Ehsan Akhgari
  Thanks for the links.  Seems like a very handy keyboard.  BTW, why the
  Shift-Space combination does not work?

 Bug in Microsoft keyboard layout creation tool. Use Shift-B
 temporarily.

Thanks.

I've not done any work in this arena, so what I propose here might make no
sense.  Sorry if that's so.  But, the M$ page on the keyboard layout
creation tool says the tool simplifies the process of creating a keyboard
layout.  Would there be any way to assign ZWNJ to Shift+Space by coding the
keyboard layout manually?  If you can send me the C/C++ source file
off-list, I'll try to investigate it further.

If not, I guess Shift+B is not that bad either.  The keyboard layout rocks,
even without having Shift+Space in place.  :-)

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Miscellaneous web issues

2004-05-25 Thread Roozbeh Pournader
On Tue, 2004-05-25 at 17:43, Ehsan Akhgari wrote:
 Well, maybe you're right, but I don't see how a text editor is supposed to
 know the encoding of a file without some kind of mark. 

Did Latin-1 (an old encoding of text files for Western Europe, also
called ISO 8859-1) have a mark to distinguish it from, say, CP1256 (an
old MS encoding for the Arabic language)? Did ASCII have a mark? No. Text
files are text files. They are not supposed to have marks to distinguish
their character set.

The character set of a text file should be in the metadata (file name,
file system, environment variable, HTTP header, MIME header, ...) or it
should be auto-detected (UTF-8 is really easy to detect, since it has a
very regular mathematical pattern, UTF-16 is also easy to detect, since
it's recommended that it has a BOM), or it should be specified by the
user when he is opening a file.

 Plain text files have no means of
 identifying the character encoding,

That is somewhat true. Plain text files *sometimes* have no means of
identifying the character encoding *by themselves*.

 so a single text file can be interpreted
 as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the
 exact character encoding used.

UTF-7 is deprecated. UTF-16 and UTF-32 *do* have BOM marks in the
standards defining them, so it's OK if they use a BOM. UTF-8 doesn't
have that. Nor does ASCII, CP1256, Latin-1, etc.

 The point here is that, protocols which do not allow BOM are those who
 provide other means of specifying the character encoding.

The point is that Notepad doesn't add a mark to Latin-1 or CP1256, why
should it add one to UTF-8?!

 A certain byte
 stream can have multiple interpretations depending on what content encoding
 you use to interpret it, and there must be some way to cut off this
 confusion.

Yes, by metadata, auto-detection, or explicit selection.
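
The auto-detection route can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function name
guess_encoding and its return labels are hypothetical, and a real detector
would also consult metadata or ask the user when it returns "unknown".

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess an encoding from a BOM, falling back to strict UTF-8 validation."""
    # Test UTF-32 first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # UTF-8 has a strict lead-byte/continuation-byte pattern, so text in
        # a legacy 8-bit encoding rarely passes this check by accident.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"  # fall back to metadata or the user's choice
```

Note that pure ASCII input is reported as "utf-8", which is harmless because
ASCII is a subset of UTF-8.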

roozbeh




RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Tue, 2004-05-18 at 23:13, Ehsan Akhgari wrote:

 and Notepad is not an HTML editor

What is notepad? A text editor? Text editors should not insert a UTF-8
BOM either. The problem is that Microsoft sometimes invents non-standard
things and then pushes them so hard that Unicode adds them to parts of the
standard (or an FAQ). The Microsoft conventions for .txt files in the
Unicode FAQ look sarcastic to me.

roozbeh



RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Thu, 2004-05-20 at 01:48, C Bobroff wrote:
 Roozbeh, is it not time to remove the experimental from its name?

No. This has not become a national standard yet. When it becomes a
national standard (possibly changing a little at that time), we'll
remove experimental from the name.

roozbeh




RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Thu, 2004-05-20 at 16:07, Ehsan Akhgari wrote:
  You can re-live its creation here in the archives:
  http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html
 [snip]
 
 Thanks for the links.  Seems like a very handy keyboard.  BTW, why the
 Shift-Space combination does not work?

Bug in Microsoft keyboard layout creation tool. Use Shift-B
temporarily.

roozbeh




RE: Miscellaneous web issues

2004-05-20 Thread Ehsan Akhgari
 You can re-live its creation here in the archives:
 http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html
[snip]

Thanks for the links.  Seems like a very handy keyboard.  BTW, why the
Shift-Space combination does not work?

 Done! Beautiful!
 I hope the Mozilla users appreciate all this trouble.

 Thanks again for all your help!

You're welcome! :-)

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Miscellaneous web issues

2004-05-20 Thread C Bobroff
On Thu, 20 May 2004, Ehsan Akhgari wrote:

   BTW, why the
 Shift-Space combination does not work?
Because the Microsoft Keyboard Layout Creator
http://www.microsoft.com/globaldev/tools/msklc.mspx
thought the space bar is reserved for only spacing characters.
Roozbeh said he sent MS a list of such bugs. Until they fix that,
shift-b is not bad for ZWNJ.

-Connie


RE: Miscellaneous web issues

2004-05-19 Thread C Bobroff
On Wed, 19 May 2004, Ehsan Akhgari wrote:

 Interesting.  Sorry for my ignorance, but is that keyboard available
 publicly?

You can re-live its creation here in the archives:
http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html

And you can download it here:
http://prdownloads.sourceforge.net/farsitools/persiankeyboard.zip?download

A PDF file with the layout is here:
http://lists.sharif.edu/pipermail/persiancomputing/attachments/20030612/2e85a1ad/PersianKL_preview.pdf

I've also repeated the above here if you don't like ZIP files or have some
other problem.
http://students.washington.edu/irina/persianword/kb.htm

Roozbeh, is it not time to remove the experimental from its name?

 Why not?  The \u syntax allows you to represent Unicode characters in
 JavaScript.
Now I know.

 Well, on Mozilla 1.2.1, which I tested it on, if you replace ZWNJ in the
 description of the Tajik array indices with &#8204; then it seems to work
 happily.  Try giving it a test.
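
For reference, &#8204; (HTML) and \u200C (JavaScript) name the same
character, since 8204 decimal is 200C hexadecimal. A small Python sketch
confirms this and performs the replacement; the helper name
zwnj_to_reference is hypothetical and only for illustration:

```python
import html

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# 8204 (decimal) and 0x200C (hexadecimal) are the same code point.
assert ord(ZWNJ) == 0x200C == 8204
# The HTML numeric reference and the named entity both decode to ZWNJ.
assert html.unescape("&#8204;") == ZWNJ
assert html.unescape("&zwnj;") == ZWNJ


def zwnj_to_reference(text: str) -> str:
    """Replace each raw ZWNJ with the HTML numeric reference &#8204;."""
    return text.replace(ZWNJ, "&#8204;")
```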

Done! Beautiful!
I hope the Mozilla users appreciate all this trouble.

Thanks again for all your help!

-Connie



RE: Miscellaneous web issues

2004-05-18 Thread Ehsan Akhgari
 An important note: what Notepad does here is only acceptable. It's
 not even recommended. HTML 4 clearly doesn't allow a UTF-8 BOM to appear
 before the HTML tag. Notepad is supposed to be a text editor. A text
 editor shouldn't insert markup by itself. BTW, ISIRI 6219 strongly
 discourages the use of a BOM in UTF-8 files.

The problem here is that web standards (HTML for example) don't allow the
BOM, and Notepad is not an HTML editor, so there's nothing to prevent it
from adding the BOM.  Check out:

http://www.unicode.org/faq/utf_bom.html#28

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]

'I generally take life as it comes my way', said Death.





RE: Miscellaneous web issues

2004-05-18 Thread Ehsan Akhgari
 First of all, thank you very much for all the patient and lengthy
 explanations. Very nice of you to share so many tips!
 (Thanks to the others too who answered on and off list!)

Happy to help!

[snip]
 Now that 2 people have said to change ZWNJ to \u200c, I tried that but
 it didn't work. I don't think I have the right tool.

 I couldn't do it in Notepad because as I said, it's WYSIWYG in Persian
 script so if I do a global replacement and stick \u200c in the middle
 of Persian script, that's obviously not going to work (and I also
 tried it for good measure and it didn't work but there may be many
 reasons it didn't work out using Notepad.)

I don't know what you mean here.  Why doesn't it work in Notepad?  Note that
on Windows XP, you can't type ZWNJ inside the Find/Replace dialog box - you
need to copy/paste it from inside the Notepad text editor window.  Another
reason not to use Notepad.

 Then, since you recommended Frontpage, I tried that. Earlier, it had
 not even occurred to me to attempt to open a .js file in Frontpage
 (version 2000.) This time I fooled it by changing the extension from .js to
 .html and so was able to open it in html view where all the unicode
 was in numeric style. I changed all the &#8204; to \u200c but now I
 see that also has not worked.

Well, I don't know what the problem is here...

BTW, FrontPage 2003 can open the .js file (using File | Open, or drag and
drop) and render the UTF-8 characters without converting them to numeric
entities just fine.  Don't try putting them in an HTML file.  Don't know
about FrontPage 2000, though.

 I think I'm not going to use Notepad for making bidirectional arrays
 from now on! That is insane to go to such great lengths!

Yeah, it's definitely so.

 Not sure what you have in mind here, but at this point, I'll be glad
 just to make it work with ZWNJ.

In the JS code, try to replace the trailing ZWNJ-raa and ZWNJ-o with nothing
using a regex.
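
As a rough illustration of that idea (shown in Python rather than in the
page's JavaScript), the sketch below strips a trailing ZWNJ plus suffix from
a word. The exact spellings of the two suffixes are assumptions on my part
(reh+alef for "raa", waw for "o"), and the names SUFFIXES and
strip_trailing_suffix are hypothetical; adjust them to the real data.

```python
import re

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# ASSUMED spellings of the suffixes; the thread does not spell them out.
SUFFIXES = ["\u0631\u0627", "\u0648"]  # "raa" (reh + alef), "o" (waw)

# Match ZWNJ followed by one of the suffixes, only at the end of the word.
TRAILING = re.compile(
    ZWNJ + "(?:" + "|".join(re.escape(s) for s in SUFFIXES) + ")$"
)


def strip_trailing_suffix(word: str) -> str:
    """Remove a trailing ZWNJ+suffix, leaving other words untouched."""
    return TRAILING.sub("", word)
```

A word stripped this way can then be looked up in the array under its base
form.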

HTH,
-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Miscellaneous web issues

2004-05-17 Thread Roozbeh Pournader
On Mon, 2004-05-17 at 17:44, Ehsan Akhgari wrote:
 Those are the BOM marks for UTF-8.  Notepad injects them under your nose,
 and that's one of the reasons I avoid Notepad.  Frontpage text editor does
 not have that problem.
 
 A small note: what Notepad does here is *correct*, because it can instruct
 other editors about the content encoding of the file.  It just doesn't work
 with web documents, and that's expected, because Notepad has not been
 designed for creating web documents.

An important note: what Notepad does here is only acceptable. It's not
even recommended. HTML 4 clearly doesn't allow a UTF-8 BOM to appear before
the HTML tag. Notepad is supposed to be a text editor. A text editor
shouldn't insert markup by itself. BTW, ISIRI 6219 strongly discourages
the use of a BOM in UTF-8 files.
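
For anyone stuck with files that Notepad has already saved, removing the BOM
is simple. A minimal Python sketch follows; the function name strip_utf8_bom
is hypothetical:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Python's "utf-8-sig" codec does the same while decoding: it skips one
# leading BOM if present and otherwise behaves like plain UTF-8.
text = b"\xef\xbb\xbf<html>".decode("utf-8-sig")
```

Applying strip_utf8_bom to the raw bytes of a page before uploading it keeps
the three BOM bytes from appearing ahead of the HTML tag.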

roozbeh




Re: Miscellaneous web issues

2004-05-17 Thread Behnam
On 16-May-04, at 9:16 PM, C Bobroff wrote:
6. I embedded the fonts again.  Looks beautiful on WinXP/IE6 and limited
others. I presume it looks terrible on the rest. Still thinking about what
to do about that. Behnam, how's the Tajik looking on your Mac?

-Connie
___
Hi Connie,
I almost missed your direct question to me. I just noticed it in Ehsan
Akhgari's reply.
Since I wasn't sure what I was supposed to do when opening that page, I
take it that it's not working as it should. The mouse-over thing doesn't
work. I have to select a word (double-click) to see its equivalent in
Tajik (or vice versa), but when I select the word, everything seems to
work okay. The exception is the last word on the Persian side: it can't
find that word. The last word on the Tajik side has no problem. I guess
the major problem is that the mouse-over trick doesn't work, and selecting
words one by one is rather inconvenient.

I was using Safari (browser) with Panther (OS 10.3.3) on iMac.
I must add it's wonderful what you are doing there. Keep up the good 
work.

Behnam


Re: Miscellaneous web issues

2004-05-17 Thread Behdad Esfahbod
On Sun, 16 May 2004, C Bobroff wrote:

 2. When viewed on WinXP/Mozilla1.7a, the ZWNJ's completely throw off my
 mouseover javascript program. It cannot find words with ZWNJ. And look
 what happens if you mouseover the Tajik equivalent: it displays the Persian
 word ok but no ZWNJ. This problem not seen with IE. I left out all harakat
 just so it would work in Mozilla (and Macs) so I'm sorry to see this
 new problem.

I've observed a very similar bug, which is probably the same one you
describe:  a ZWNJ that JavaScript puts into the page from raw UTF-8
source is thrown away completely.  As a workaround, if you replace all
ZWNJs with \u200C in your JavaScript source, it works.
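
Since ZWNJ is invisible and awkward to type into a Find/Replace box, the
substitution can be scripted instead. Here is a minimal Python sketch; the
function name escape_zwnj is hypothetical, and it assumes every ZWNJ in the
file sits inside a JavaScript string literal, where the \u escape is
understood:

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def escape_zwnj(js_source: str) -> str:
    """Replace each raw ZWNJ with the six ASCII characters \\u200C."""
    # The doubled backslash yields one literal backslash in the output,
    # so the result contains the JavaScript escape sequence itself.
    return js_source.replace(ZWNJ, "\\u200C")
```

Reading the .js file as UTF-8, passing its text through escape_zwnj, and
writing it back leaves the visible Persian letters untouched.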

[BTW, your Herat#1 and Herat#2 MP3 files seem silent to my
player.]

--behdad
  behdad.org


Re: Miscellaneous web issues

2004-05-17 Thread Behdad Esfahbod
On Sun, 16 May 2004, C Bobroff wrote:

 1. When viewed on WinXP/IE6, look what happens when you mouseover the
 Persian words at the end (i.e. left margin) of each line. You also pick up
 the space to the right of the first word in that line. Similarly, if you
 attempt to mouseover the first word in the line and are just a little off
 the word to the right, you unfortunately will pick up the last word in the
 line.  Is this a bug or just my usual crazy coding style? This problem not
 seen with Mozilla. Also not with left to right languages.

Remove all leading and trailing spaces in your spans and it
should work.  BTW, RTL paragraphs are a must.
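
If the page has many spans, the trimming can be automated. The Python sketch
below is only an illustration: the function name trim_span_whitespace is
hypothetical, and a regex is a shortcut that assumes simple, non-nested span
markup rather than a substitute for a real HTML parser.

```python
import re


def trim_span_whitespace(markup: str) -> str:
    """Drop whitespace just inside <span ...> and just before </span>."""
    # Whitespace immediately after an opening span tag.
    markup = re.sub(r"(<span\b[^>]*>)\s+", r"\1", markup)
    # Whitespace immediately before a closing span tag.
    markup = re.sub(r"\s+(</span>)", r"\1", markup)
    return markup
```

Spaces between two spans are left alone, so the words stay separated on the
page while the mouseover areas cover only the words themselves.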

--behdad
  behdad.org