RE: Miscellaneous web issues

2004-12-01 Thread Ehsan Akhgari
> Roozbeh, it has been a long time and I don't remember your answer to this
> email. What happened to this new DLL?

AFAIK, it still hasn't been put on SourceForge.  If you're interested, I can
mail it to you off-list.

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]



___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing


Re: Miscellaneous web issues

2004-12-01 Thread Ali A. Khanban
Ehsan Akhgari wrote:
> > I would appreciate it if you sent me the exact process you used and the
> > DLL, so we can publish it on the FarsiWeb website on SourceForge.
>
> OK.  I'm sending the step-by-step process to the list, and will send you the
> relevant files off-list, so that you can put them on SourceForge.
> ...
> I'm sending to Roozbeh two files: Persian-src-1_0_3_14.zip which contains
> the modified source files, and Persian-1_0_3_14.zip which contains the DLL
> plus the installer, which I guess he'd make available through
> SourceForge.

Roozbeh, it has been a long time and I don't remember your answer to this
email. What happened to this new DLL?

Best
-ali-
--

||   Ali Asghar Khanban
|| ||Research Associate in Department of Computing
|||  Imperial College London, London SW7 2BZ, U.K.
||   Tel: +44 (020) 7594 8241 Fax: +1 (509) 694 0599
|||  [EMAIL PROTECTED]   http://www.doc.ic.ac.uk/~khanban



RE: Miscellaneous web issues

2004-06-08 Thread Ehsan Akhgari
> I would appreciate it if you sent me the exact process you used and the
> DLL, so we can publish it on the FarsiWeb website on SourceForge.

OK.  I'm sending the step-by-step process to the list, and will send you the
relevant files off-list, so that you can put them on SourceForge.

Here are the steps I took to accomplish the job:

1.  After installing the Microsoft Keyboard Layout Creator (MSKLC) tool, I
inspected its install directory, and found that it ships with a
version of the MS C/C++ compiler (cl.exe) in the directory C:\Program
Files\Microsoft Keyboard Layout Creator\bin\i386.  This convinced me that the
tool creates a C source file, and feeds that to the compiler to create the
layout DLL.

Now, I needed to know the location of the generated source file, and also
the command-line parameters passed to the compiler.

2.  To get the command-line options passed to the compiler, I wrote a
simple application which appends its command-line arguments to a log.txt
file.  This application is called shim.cpp, and is shipped in the src
package inside the shim directory.  It can simply be compiled to shim.exe
using the command "cl shim.cpp".

3.  Now, I moved all of the .exe files out of the C:\Program Files\Microsoft
Keyboard Layout Creator\bin\i386 directory, and copied shim.exe there under
each of the moved files' names.  So, now I had a cl.exe, rc.exe, link.exe, etc.
in that directory which were all actually the shim.exe program.  This enabled
me to figure out the command-line options passed to the compiler tools by
the MSKLC tool so that I could imitate them manually.

4.  I opened MSKLC, and selected the File | Load Existing Keyboard menu item to
load the "Persian experimental standard" keyboard (version 1.0.3.13) that I
had already grabbed from the sf.net repository.

5.  I selected the Project | Build DLL and Setup Package menu item to build
the DLL.  The tool invoked my shim tool in place of each of the compiler's
tools (see Step 3 above).

6.  I created the directory C:\Program Files\Microsoft Keyboard Layout
Creator\hack, and created a build.bat file there, which would execute the
compiler's tools with the command lines that MSKLC had passed to them.

7.  I copied the keyboard layout source files generated by MSKLC from the
temporary directory to the hack directory as well.

8.  I edited Persian.c, to change the shift state code for the Space key
from ' ' to 0x200C.  The patched line is line 268 in the original file
copied from the temp directory.

9.  I edited Persian.rc to change the version number from 1.0.3.13 to
1.0.3.14 so that I could tell my modified Persian.dll version from the
original FarsiWeb one.

10.  I ran build.bat, and voila!  Persian.dll version 1.0.3.14 got
built.  Then I just had to put it in place of the version 1.0.3.13 DLL in the
original FarsiWeb package.  The installer didn't need any change.  Now, I
just ran the installer to uninstall the old version and install the new
version, and I had my keyboard working with Shift+Space.
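
For readers without the src package, the shim idea from Step 2 can be
sketched roughly as follows.  This is only a minimal Python sketch of the same
idea, not the original shim.cpp; the names log_invocation and main are
illustrative assumptions, and the log file name simply follows the log.txt
mentioned above.

```python
import sys


def log_invocation(argv, log_path="log.txt"):
    """Append one line holding the program name and its arguments to log_path."""
    line = " ".join(argv)
    # Open in append mode so that successive tool invocations accumulate.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
    return line


def main():
    # Hypothetical entry point: record how this stand-in was invoked.
    log_invocation(sys.argv)
```

Calling main() at the end of such a script, and installing a copy under each
tool's name, would leave one line per invocation in log.txt.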

I'm sending Roozbeh two files: Persian-src-1_0_3_14.zip, which contains
the modified source files, and Persian-1_0_3_14.zip, which contains the DLL
plus the installer, which I guess he'll make available through
SourceForge.

I'm open to questions and comments.  Please don't hesitate to ask if you have
any.

-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-06-05 Thread Roozbeh Pournader
On Thu, 2004-06-03 at 21:08, Ehsan Akhgari wrote:
> I did this, and installed the new DLL on my system, and it works beautifully.
> It's the same keyboard layout, only Shift+Space inserts a ZWNJ instead of a
> space.  I thought I would submit it to sourceforge so that everyone can use
> the new tool.  Roozbeh, let me know if it would be okay for me to send the
> files to you to get them into the sourceforge, or if I should do something
> else.

I would appreciate it if you sent me the exact process you used and the
DLL, so we can publish it on the FarsiWeb website on SourceForge.

roozbeh




RE: Miscellaneous web issues

2004-06-03 Thread Ehsan Akhgari
> There is no C/C++ source file. The source is a data file that MSKLC
> compiles into the DLL. If the data file contains ZWNJ on shift-space,
> it fails to compile. Microsoft developers confirmed that this is a
> bug.

Well, I did a little investigation on this.

I downloaded the MSKLC (MS Keyboard Layout Creator) tool, and took a look at
it.  This tool generates C source code from the data you feed to it, and
then compiles this C code in order to generate the keyboard layout DLL.  The
bug which expects Space to only insert a space character is at the MSKLC
level.  IOW, if the generated C source code is patched correctly, and then
compiled with the same compiler switches that the MSKLC tool passes to the
compiler, ZWNJ can be successfully assigned to the Shift+Space combination.

I did this, and installed the new DLL on my system, and it works beautifully.
It's the same keyboard layout, only Shift+Space inserts a ZWNJ instead of a
space.  I thought I would submit it to SourceForge so that everyone can use
the new tool.  Roozbeh, let me know if it would be okay for me to send the
files to you to get them onto SourceForge, or if I should do something
else.


-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-25 Thread Behdad Esfahbod

> > so a single text file can be interpreted
> > as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the
> > exact character encoding used.

The whole point of defining UTF-8 this way has been to replace
ASCII transparently.  So if character sets need marks to identify
them, the only one that should not need a mark and should be the
default is UTF-8.
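
The transparency point can be checked directly.  A minimal Python sketch (the
sample text is arbitrary): pure ASCII text yields the same bytes under both
encodings, while a non-ASCII character such as ZWNJ becomes a sequence of
bytes that all have the high bit set.

```python
text = "PersianComputing mailing list"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# Pure ASCII text has exactly the same bytes in both encodings.
same_bytes = ascii_bytes == utf8_bytes

# ZWNJ (U+200C) is encoded as three bytes, each with its high bit set,
# so none of them can be mistaken for an ASCII character.
zwnj_bytes = "\u200c".encode("utf-8")
high_bits_set = all(b >= 0x80 for b in zwnj_bytes)
```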


--behdad
  behdad.org


RE: Miscellaneous web issues

2004-05-25 Thread Roozbeh Pournader
On Tue, 2004-05-25 at 17:43, Ehsan Akhgari wrote:
> Well, maybe you're right, but I don't see how a text editor is supposed to
> know the encoding of a file without some kind of mark. 

Did Latin-1 (an old encoding of text files for Western Europe, also
called ISO 8859-1) have a mark to distinguish it from, say, CP1256 (an
old MS encoding for the Arabic language)? Did ASCII have a mark? No. Text
files are text files. They are not supposed to have marks to distinguish
their character set.

The character set of a text file should be in the metadata (file name,
file system, environment variable, HTTP header, MIME header, ...) or it
should be auto-detected (UTF-8 is really easy to detect, since it has a
very regular mathematical pattern, UTF-16 is also easy to detect, since
it's recommended that it has a BOM), or it should be specified by the
user when he is opening a file.
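
The auto-detection option might look roughly like this.  It is a minimal
Python sketch, not a complete detector: the function name guess_encoding and
the "unknown" fallback are assumptions made for illustration.  It checks for a
BOM first and then relies on the fact that a strict UTF-8 decode rejects most
byte sequences produced by legacy encodings.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess a text encoding from raw bytes: BOM first, then strict UTF-8."""
    # Check the UTF-32 BOMs before the UTF-16 ones, because the UTF-32 LE BOM
    # begins with the same two bytes as the UTF-16 LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")  # strict mode: raises on invalid UTF-8 sequences
        return "utf-8"
    except UnicodeDecodeError:
        # Not UTF-8: the answer has to come from metadata or from the user.
        return "unknown"
```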

> Plain text files have no means of
> identifying the character encoding,

That is somewhat true. Plain text files *sometimes* have no means of
identifying the character encoding *by themselves*.

> so a single text file can be interpreted
> as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the
> exact character encoding used.

UTF-7 is deprecated. UTF-16 and UTF-32 *do* have BOM marks in the
standards defining them, so it's OK if they use a BOM. UTF-8 doesn't
have that. Nor does ASCII, CP1256, Latin-1, etc.

> The point here is that protocols which do not allow a BOM are those that
> provide other means of specifying the character encoding.

The point is that Notepad doesn't add a mark to Latin-1 or CP1256, why
should it add one to UTF-8?!

> A certain byte
> stream can have multiple interpretations depending on what content encoding
> you use to interpret it, and there must be some way to cut off this
> confusion.

Yes: by metadata, auto-detection, or explicit selection.

roozbeh




RE: Miscellaneous web issues

2004-05-25 Thread Roozbeh Pournader
On Tue, 2004-05-25 at 17:43, Ehsan Akhgari wrote:
> Would there be any way to assign ZWNJ to Shift+Space by coding the
> keyboard layout tool manually?  If you can send me the C/C++ source file
> off-list, I'll try to investigate it further.

There is no C/C++ source file. The source is a data file that MSKLC
compiles into the DLL. If the data file contains ZWNJ on shift-space, it
fails to compile. Microsoft developers confirmed that this is a bug.

roozbeh




RE: Miscellaneous web issues

2004-05-25 Thread Ehsan Akhgari
> > Thanks for the links.  Seems like a very handy keyboard.  BTW, why does
> > the Shift-Space combination not work?
>
> Bug in Microsoft keyboard layout creation tool. Use "Shift-B"
> temporarily.

Thanks.

I've not done any work in this arena, so what I propose here might make no
sense.  Sorry if that's so.  But, the M$ page on the keyboard layout
creation tool says the tool "simplifies" the process of creating a keyboard
layout.  Would there be any way to assign ZWNJ to Shift+Space by coding the
keyboard layout tool manually?  If you can send me the C/C++ source file
off-list, I'll try to investigate it further.

If not, I guess Shift+B is not that bad either.  The keyboard layout rocks,
even without having Shift+Space in place.  :-)

-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-25 Thread Ehsan Akhgari
> What is notepad? A text editor? Text editors should not insert a UTF-8
> BOM either. The problem is that Microsoft sometimes invents
> non-standard things and then pushes it so hard that Unicode adds it to
> parts of the standard (or an FAQ). "Microsoft conventions for .txt
> files" in the Unicode FAQ looks sarcastic to me.

Well, maybe you're right, but I don't see how a text editor is supposed to
know the encoding of a file without some kind of mark.  See, HTTP transfers
the character set using the Content-Type response header.  In HTML, it's
specified with a <meta> tag.  In XML, the
default encoding is UTF-8, and if a document is encoded in another encoding,
it must be specified in the <?xml ?> PI.  Plain text files have no means of
identifying the character encoding, so a single text file can be interpreted
as UTF-7, UTF-8, UTF-16, UTF-32, etc. if there's nothing to declare the
exact character encoding used.

The point here is that protocols which do not allow a BOM are those that
provide other means of specifying the character encoding.  A given byte
stream can have multiple interpretations depending on which character encoding
you use to interpret it, and there must be some way to clear up this
confusion.
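
As a small illustration of that ambiguity, here is a minimal Python sketch
(the sample word is arbitrary) that decodes one byte string three ways.  Only
the first result is the intended text; the other two decode without error but
produce different, garbled strings.

```python
# The Persian word "salaam", written with escapes and encoded as UTF-8.
raw = "\u0633\u0644\u0627\u0645".encode("utf-8")

as_utf8 = raw.decode("utf-8")      # the intended four-letter word
as_latin1 = raw.decode("latin-1")  # same bytes read as Latin-1: garbled
as_cp1256 = raw.decode("cp1256")   # same bytes read as CP1256: garbled differently
```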

YMMV,
-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-24 Thread Behdad Esfahbod
On Mon, 24 May 2004, Roozbeh Pournader wrote:

> On Tue, 2004-05-18 at 23:13, Ehsan Akhgari wrote:
>
> > and Notepad is not an HTML editor
>
> What is notepad? A text editor? Text editors should not insert a UTF-8
> BOM either. The problem is that Microsoft sometimes invents non-standard
> things and then pushes it so hard that Unicode adds it to parts of the
> standard (or an FAQ). "Microsoft conventions for .txt files" in the
> Unicode FAQ looks sarcastic to me.

You know, Microsoft needed that little trick for the transition
from legacy character sets to UTF-8.

> roozbeh

--behdad
  behdad.org


RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Thu, 2004-05-20 at 16:07, Ehsan Akhgari wrote:
> > You can re-live its creation here in the archives:
> > http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html
> [snip]
> 
> Thanks for the links.  Seems like a very handy keyboard.  BTW, why does the
> Shift-Space combination not work?

Bug in Microsoft keyboard layout creation tool. Use "Shift-B"
temporarily.

roozbeh




RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Thu, 2004-05-20 at 01:48, C Bobroff wrote:
> Roozbeh, is it not time to remove the "experimental" from its name?

No. This has not become a national standard yet. When it becomes a
national standard (possibly changing a little at that time), we'll
remove "experimental" from the name.

roozbeh




RE: Miscellaneous web issues

2004-05-24 Thread Roozbeh Pournader
On Tue, 2004-05-18 at 23:13, Ehsan Akhgari wrote:

> and Notepad is not an HTML editor

What is notepad? A text editor? Text editors should not insert a UTF-8
BOM either. The problem is that Microsoft sometimes invents non-standard
things and then pushes it so hard that Unicode adds it to parts of the
standard (or an FAQ). "Microsoft conventions for .txt files" in the
Unicode FAQ looks sarcastic to me.

roozbeh



RE: Miscellaneous web issues

2004-05-20 Thread C Bobroff
On Thu, 20 May 2004, Ehsan Akhgari wrote:

>   BTW, why does the
> Shift-Space combination not work?
Because the Microsoft Keyboard Layout Creator
http://www.microsoft.com/globaldev/tools/msklc.mspx
thought the space bar is reserved for only spacing characters.
Roozbeh said he sent MS a list of such bugs. Until they fix that,
shift-b is not bad for ZWNJ.

-Connie


RE: Miscellaneous web issues

2004-05-20 Thread Ehsan Akhgari
> You can re-live its creation here in the archives:
> http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html
[snip]

Thanks for the links.  Seems like a very handy keyboard.  BTW, why does the
Shift-Space combination not work?

> Done! Beautiful!
> I hope the Mozilla users appreciate all this trouble.
>
> Thanks again for all your help!

You're welcome! :-)

-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-19 Thread C Bobroff
On Wed, 19 May 2004, Ehsan Akhgari wrote:

> Interesting.  Sorry for my ignorance, but is that keyboard available
> publicly?

You can re-live its creation here in the archives:
http://lists.sharif.edu/pipermail/persiancomputing/2003-June/000538.html

And you can download it here:
http://prdownloads.sourceforge.net/farsitools/persiankeyboard.zip?download

A PDF file with the layout is here:
http://lists.sharif.edu/pipermail/persiancomputing/attachments/20030612/2e85a1ad/PersianKL_preview.pdf

I've also repeated the above here if you don't like ZIP files or have some
other problem.
http://students.washington.edu/irina/persianword/kb.htm

Roozbeh, is it not time to remove the "experimental" from its name?

> Why not?  The \u syntax allows you to represent Unicode characters in
> JavaScript.
Now I know.

> Well, on Mozilla 1.2.1, which I tested it on, if you replace ZWNJ in the
> description of the Tajik array indices with ‌ then it seems to work
> happily.  Try giving it a test.

Done! Beautiful!
I hope the Mozilla users appreciate all this trouble.

Thanks again for all your help!

-Connie



RE: Miscellaneous web issues

2004-05-19 Thread Ehsan Akhgari
> It appears taking a break is the best cure. Some progress:

Yes.  It certainly is.  Good to hear the problem's solved.

[snip]
> Find/Replace of [the invisible] ZWNJ in Notepad is no problem because I
> have the Persian Experimental Keyboard and ZWNJ is right on Shift-b.
> Although I can't actually SEE that I've typed ZWNJ in the Find box, it
> really is there. So now in my .js array, I have a few Persian words
> with \u200c right in the middle of the Persian script.

Interesting.  Sorry for my ignorance, but is that keyboard available
publicly?

> It doesn't seem like the browsers should be able to handle that but
> now I see it's not a problem.

Why not?  The \u syntax allows you to represent Unicode characters in
JavaScript.

> Only thing I have to
> remember is to re-open the Notepad file in a non-WYSIWYG editor and
> delete that BOM creature.
>
> Mozilla is now able to "find" my words containing ZWNJ which was the
> whole point of this exercise.
>
> One small problem still remains: in Mozilla, if you click on any Tajik
> word, it shows you the Persian counterpart in the popup.
> But Mozilla is not able to display the ZWNJ so that is ignored.
> I'm not sure what to do to solve this.

Well, on Mozilla 1.2.1, which I tested it on, if you replace ZWNJ in the
description of the Tajik array indices with ‌ then it seems to work
happily.  Try giving it a test.

-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-18 Thread C Bobroff
It appears taking a break is the best cure. Some progress:

On Tue, 18 May 2004, Ehsan Akhgari wrote:

>  Why doesn't it work in Notepad?

You're right. It DOES work in Notepad and it had worked the very first
time I'd tried to replace ZWNJ with \u200c. The reason I didn't know it had
been a success is really anti-climactic: Although I'd cleared my cache on
IE, I had not checked the little optional box for "Delete all offline
files." Only because some others have been emailing feedback concerning
the content did I realize that the page was operational on other
computers. The old .js file must fall into the "offline file" category
because that did the trick. Also, although my eyes were telling me it had
worked, I'd assumed the Find/Replace process had deposited some invisible
junk characters screwing up the direction. An imagined problem!

>Note that
> on Windows XP, you can't type ZWNJ inside the Find/Replace dialog box - you
> need to copy/paste it from inside the Notepad text editor window.  Another
> reason why not to use Notepad.

Find/Replace of [the invisible] ZWNJ in Notepad is no problem because
I have the Persian Experimental Keyboard and ZWNJ is
right on Shift-b. Although I can't actually SEE that I've typed ZWNJ
in the Find box, it really is there. So now in my .js array, I have a few
Persian words with \u200c right in the middle of the Persian script.
It doesn't seem like the browsers should be able to handle that but
now I see it's not a problem. Only thing I have to remember is to re-open
the Notepad file in a non-WYSIWYG editor and delete that BOM creature.
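
Deleting the BOM by hand after every save could be automated.  The following
is a minimal Python sketch, assuming the file is otherwise valid UTF-8; the
function names are illustrative and not part of any tool mentioned in this
thread.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path):
    """Rewrite the file at path without its UTF-8 BOM; return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned != data:
        with open(path, "wb") as f:
            f.write(cleaned)
        return True
    return False
```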

Mozilla is now able to "find" my words containing ZWNJ which
was the whole point of this exercise.

One small problem still remains: in Mozilla, if you click on any Tajik
word, it shows you the Persian counterpart in the popup.
But Mozilla is not able to display the ZWNJ so that is ignored.
I'm not sure what to do to solve this.

> BTW, FrontPage 2003 can open the .js file (using File | Open, or drag and
> drop) and render the UTF-8 characters without converting them to numeric
> entities just fine.

ok. That's definitely a 2003 improvement.

> In the JS code, try to replace the trailing ZWNJ-raa and ZWNJ-o with nothing
> using a regex.
I'll look into this possibility.

>
> HTH,
Most definitely. Thanks!
-Connie



RE: Miscellaneous web issues

2004-05-18 Thread Ehsan Akhgari
> First of all, thank you very much for all the patient and lengthy
> explanations. Very nice of you to share so many tips!
> (Thanks to the others too who answered on and off list!)

Happy to help!

[snip]
> Now that 2 people have said to change ZWNJ to \u200c, I tried that but
> it didn't work. I don't think I have the right tool.
>
> I couldn't do it in Notepad because as I said, it's WYSIWYG in Persian
> script so if I do a global replacement and stick \u200c in the middle
> of Persian script, that's obviously not going to work (and I also
> tried it for good measure and it didn't work but there may be many
> reasons it didn't work out using Notepad.)

I don't know what you mean here.  Why doesn't it work in Notepad?  Note that
on Windows XP, you can't type ZWNJ inside the Find/Replace dialog box - you
need to copy/paste it from inside the Notepad text editor window.  Another
reason not to use Notepad.

> Then, since you recommended Frontpage, I tried that. Earlier, it had
> not even occurred to me to attempt to open a .js file in Frontpage
> (version 2000.) This time I fooled it by changing the extension from .js
> to .html and so was able to open it in html view where all the unicode
> was in numeric style. I changed all the &#8204; to \u200c but now I
> see that also has not worked.

Well, I don't know what the problem is here...

BTW, FrontPage 2003 can open the .js file (using File | Open, or drag and
drop) and render the UTF-8 characters without converting them to numeric
entities just fine.  Don't try putting them in an HTML file.  Don't know
about FrontPage 2000, though.

> I think I'm not going to use Notepad for making bidirectional arrays
> from now on! That is insane to go to such great lengths!

Yeah, it's definitely so.

> Not sure what you have in mind here, but at this point, I'll be glad
> just to make it work with ZWNJ.

In the JS code, try to replace the trailing ZWNJ-raa and ZWNJ-o with nothing
using a regex.
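
The regex idea might be sketched as follows.  The thread proposes doing this in
the JS code; this sketch uses Python only to show the pattern.  It assumes that
"ZWNJ-raa" and "ZWNJ-o" mean a ZWNJ followed by the letters reh+alef
(U+0631 U+0627) or by waw (U+0648) at the end of a word; that reading, and the
names used here, are assumptions rather than anything confirmed in this thread.

```python
import re

ZWNJ = "\u200c"

# Assumed suffixes: "raa" (U+0631 U+0627) or "o" (U+0648), each preceded by
# ZWNJ and anchored to the end of the word.
TRAILING_SUFFIX = re.compile(ZWNJ + "(?:\u0631\u0627|\u0648)$")


def strip_trailing_suffix(word: str) -> str:
    """Remove a trailing ZWNJ+raa or ZWNJ+o from word, if present."""
    return TRAILING_SUFFIX.sub("", word)
```

A ZWNJ in the middle of a word is left alone, because the pattern only matches
at the end.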

HTH,
-
Ehsan Akhgari



RE: Miscellaneous web issues

2004-05-18 Thread Ehsan Akhgari
> An important note: what Notepad does here is only "acceptable". It's
> not even recommended. HTML 4 clearly doesn't allow a UTF-8 BOM to appear
> before the HTML tag. Notepad is supposed to be a text editor. A text
> editor shouldn't insert markup by itself. BTW, ISIRI 6219 strongly
> discourages the use of a BOM in UTF-8 files.

The problem here is that web protocols (HTML for example) don't allow the
BOM, and Notepad is not an HTML editor, so there's nothing to prevent it
from adding the BOM.  Check out:

http://www.unicode.org/faq/utf_bom.html#28

-
Ehsan Akhgari


'I generally take life as it comes my way', said Death.





RE: Miscellaneous web issues

2004-05-17 Thread C Bobroff
On Mon, 17 May 2004, Ehsan Akhgari wrote:

First of all, thank you very much for all the patient and lengthy
explanations. Very nice of you to share so many tips!
(Thanks to the others too who answered on and off list!)

> parent block tag's direction is ltr.  If you apply the direction to a block
> element (such as , , etc.) then this problem would be solved.

Changing  to  solved the problem in 5 seconds. Easy!

> Yeah, I saw this behavior on WinXP/Mozilla1.2.1.  It seems like Mozilla
> doesn't like the the UTF-8 encoded ZWNJ characters.  I solved half of the
> problem by replacing ZWNJ with ‌

Now that 2 people have said to change ZWNJ to \u200c, I tried that but it
didn't work. I don't think I have the right tool.

I couldn't do it in Notepad because as I said, it's WYSIWYG in Persian
script so if I do a global replacement and stick \u200c in the middle
of Persian script, that's obviously not going to work (and I also tried it
for good measure and it didn't work but there may be many reasons it
didn't work out using Notepad.)

Then, since you recommended Frontpage, I tried that. Earlier, it had not
even occurred to me to attempt to open a .js file in Frontpage (version
2000.) This time I fooled it by changing the extension from .js to .html
and so was able to open it in html view where all the unicode was in
numeric style. I changed all the &#8204; to \u200c but now I see that also
has not worked.

Guess it's time to take a break!

> 3.  If you want to insert text in the middle of a block, never go to that
> location by a mouse click.  You may end up inserting the new text in the
> wrong place.  What I do is go to the beginning of the line (or somewhere in
> the English parts of the text), and move myself using the arrow keys on the
> keyboard.
I think I'm not going to use Notepad for making bidirectional arrays
from now on! That is insane to go to such great lengths!

> 4.  Never leave Word Wrap on.  Notepad has known problems with it when you
> try to save the file.
I didn't know this at all.


> Those are the BOM marks for UTF-8.
Ok, so the little critters have a name and purpose!

> You can leave them as they are, and handle them in the JavaScript code (trim
> them off of the end of the Tajik words maybe.)

Not sure what you have in mind here, but at this point, I'll be glad just
to make it work with ZWNJ.


> A big (IMO) problem with font embedding is that if users save the document
> on their HD (using IE of course) then the fonts will be gone.  Not a
> professional image, if you ask me.

ok! Another reason against it! I'm keeping track of these points...

Again, thanks for your helpful responses.

-Connie


Re: Miscellaneous web issues

2004-05-17 Thread Behdad Esfahbod
On Sun, 16 May 2004, C Bobroff wrote:

> 1. When viewed on WinXP/IE6, look what happens when you mouseover the
> Persian words at the end (i.e. left margin) of each line. You also pick up
> the space to the right of the first word in that line. Similarly, if you
> attempt to mouseover the first word in the line and are just a little off
> the word to the right, you unfortunately will pick up the last word in the
> line.  Is this a bug or just my usual crazy coding style? This problem not
> seen with Mozilla. Also not with left to right languages.

Remove all leading and trailing spaces in your spans and it
should work.  BTW, RTL paragraphs are a must.

--behdad
  behdad.org


Re: Miscellaneous web issues

2004-05-17 Thread Behdad Esfahbod
On Sun, 16 May 2004, C Bobroff wrote:

> 2. When viewed on WinXP/Mozilla1.7a, the ZWNJ's completely throw off my
> mouseover javascript program. It "can not find" words with ZWNJ. And look
> what happens if you mouseover the Tajik equivalent: it displays the Persian
> word ok but no ZWNJ. This problem not seen with IE. I left out all harakat
> just so it would work in Mozilla (and Macs) so I'm sorry to see this
> new problem.

I've observed a very similar bug that should be the same as what
you explain:  ZWNJ put by JavaScript in UTF-8 format in the page
is completely thrown away.  As a solution, if you replace all
ZWNJs with \u200C in your JavaScript source, it works.
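
That replacement can be done mechanically rather than in an editor's
Find/Replace box.  A minimal Python sketch, assuming the ZWNJs occur inside
JavaScript string literals (where the \u200C escape is valid); the function
name escape_zwnj is illustrative.

```python
ZWNJ = "\u200c"


def escape_zwnj(js_source: str) -> str:
    """Replace each literal ZWNJ with the six-character escape \\u200C."""
    # "\\u200C" is a backslash followed by u200C, i.e. the text a JavaScript
    # parser reads as an escape for U+200C inside a string literal.
    return js_source.replace(ZWNJ, "\\u200C")
```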

[BTW, your Herat#1 and Herat#2 MP3 files seem silent to my
player.]

--behdad
  behdad.org


Re: Miscellaneous web issues

2004-05-17 Thread Behnam
On 16-May-04, at 9:16 PM, C Bobroff wrote:
> 6. I embedded the fonts again.  Looks beautiful on WinXP/IE6 and limited
> others. I presume it looks terrible on the rest. Still thinking about what
> to do about that. Behnam, how's the Tajik looking on your Mac?
>
> -Connie

Hi Connie,
I almost missed your direct inquiry to me. I just noticed it in the
reply of Ehsan Akhgari.
Considering I wasn't sure what I was supposed to do when opening that 
page, I take it it's not working as it should. The mouse-over thing 
doesn't work. I have to select the word (double click) to see its 
equivalent in Tajik (or vice versa) but when I select the word, 
everything seems to work okay. The exception is the last word on 
Persian side. It can't find the word. The last word in Tajik side has 
no problem. I guess the major problem is that mouse-over trick doesn't 
work and selecting one by one is rather inconvenient.

I was using Safari (browser) with Panther (OS 10.3.3) on iMac.
I must add it's wonderful what you are doing there. Keep up the good 
work.

Behnam


RE: Miscellaneous web issues

2004-05-17 Thread Roozbeh Pournader
On Mon, 2004-05-17 at 17:44, Ehsan Akhgari wrote:
> Those are the BOM marks for UTF-8.  Notepad injects them under your nose,
> and that's one of the reasons I avoid Notepad.  Frontpage text editor does
> not have that problem.
> 
> A small note: what Notepad does here is *correct*, because it can instruct
> other editors about the content encoding of the file.  It just doesn't work
> with web documents, and that's expected, because Notepad has not been
> designed for creating web documents.

An important note: what Notepad does here is only "acceptable". It's not
even recommended. HTML 4 clearly doesn't allow a UTF-8 BOM to appear before
the HTML tag. Notepad is supposed to be a text editor. A text editor
shouldn't insert markup by itself. BTW, ISIRI 6219 strongly discourages
the use of a BOM in UTF-8 files.

roozbeh




RE: Miscellaneous web issues

2004-05-17 Thread Ehsan Akhgari
Hi Connie,

> 1. When viewed on WinXP/IE6, look what happens when you mouseover the
> Persian words at the end (i.e. left margin) of each line. You also
> pick up the space to the right of the first word in that line.
> Similarly, if you attempt to mouseover the first word in the line and
> are just a little off the word to the right, you unfortunately will
> pick up the last word in the line.  Is this a bug or just my usual
> crazy coding style? This problem not seen with Mozilla. Also not with
> left to right languages.

The problem is that you're applying the right-to-left direction to an
inline element.  If you make an inline element rtl, IE assumes that you
have some RTL text in the middle of an LTR block, because its parent
block element's direction is ltr.  If you apply the direction to a
block-level element instead, this problem should be solved.
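
A rough sketch of that idea, assuming each word sits in its own inline <span> and the whole line is wrapped in a block-level <p> (both tag choices, the `w0`, `w1`, ... ids, and the helper name `renderRtlLine` are assumptions for illustration, not taken from the actual page):

```javascript
// Hypothetical helper: each word gets its own inline <span> for mouseover
// handling, while the direction is declared once on the enclosing block <p>.
// Words are assumed to be plain text with no markup in them.
function renderRtlLine(words) {
  const spans = words.map(function (word, i) {
    return '<span id="w' + i + '">' + word + '</span>';
  });
  return '<p dir="rtl">' + spans.join(' ') + '</p>';
}
```

The spans carry no dir attribute of their own; they simply inherit the direction of the paragraph.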

> 2. When viewed on WinXP/Mozilla1.7a, the ZWNJ's completely throw off
> my mouseover javascript program. It "can not find"
> words with ZWNJ. And look what happens if you mouseover the Tajik
> equivalent: it displays the Persian word ok but no ZWNJ.
> This problem not seen with IE. I left out all harakat just so it would
> work in Mozilla (and Macs) so I'm sorry to see this new problem.

Yeah, I saw this behavior on WinXP/Mozilla1.2.1.  It seems like Mozilla
doesn't like the UTF-8 encoded ZWNJ characters.  I solved half of the
problem by replacing ZWNJ with a character reference (such as &#8204;) in
the definitions for the Tajik words.  I tried the same for the other
ZWNJs, but it still doesn't work (at least, the script doesn't trim off
the ZWNJ from the word).  Maybe you can try to fully encode your Unicode
text using numerical entities to see if that works.  However, that would
cause a maintenance nightmare for you without an HTML editor.
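
If maintaining the entities by hand is the worry, a small conversion step could generate them from the readable text.  This is only a sketch; the function name `toNumericEntities` is made up here, and whether fully encoded text actually cures the Mozilla problem is untested:

```javascript
// Replace every non-ASCII character with a decimal numeric character
// reference, leaving plain ASCII untouched.
function toNumericEntities(text) {
  let out = '';
  for (const ch of text) {            // iterates by code point, not UTF-16 unit
    const cp = ch.codePointAt(0);
    out += cp > 0x7F ? '&#' + cp + ';' : ch;
  }
  return out;
}
```

With this, the source can stay in readable Persian and only the published copy carries the entities.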

> 3. To make the javascript arrays, I had to put a Persian word running
> from right-to-left inside brackets [ ] running from left to right and
> this I did in MS Notepad. Somehow, whether I copied or pasted or if I
> switched language in the process some sort of invisible characters
> would be added or else the brackets would end up like this: [ [ with
> the 2nd one running in the opposite direction.  I had to keep re-doing
> and re-doing this, almost going crazy in the process. Maybe if
> brackets [ ] were in the Persian font, it would have been easier. I
> don't know. Punctuation in bidirectional situations is troublesome.

This is only a visual problem in my experience.  The key to successfully
coding a Persian/English mixture in Notepad is keeping the following in
mind:

1.  Only switch to Persian keyboard when you want to type in Persian.  Type
all the non-alphabet characters using the English keyboard (including
numbers).
2.  Ignore what you see on the screen as much as you can.  If you see weird
stuff, sometimes saving your work and re-opening Notepad helps.
3.  If you want to insert text in the middle of a block, never go to that
location by a mouse click.  You may end up inserting the new text in the
wrong place.  What I do is go to the beginning of the line (or somewhere in
the English parts of the text) and move using the arrow keys on the
keyboard.
4.  Never leave Word Wrap on.  Notepad has known problems with it when you
try to save the file.
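
If you suspect invisible characters really did get into the file, a small check like the one below can list them.  It is only a sketch: the table covers a handful of common invisible code points rather than a complete list, and the names `INVISIBLE` and `findInvisible` are made up for illustration:

```javascript
// Code points that are invisible in most editors but can change matching
// or bidirectional layout.
const INVISIBLE = {
  0x200C: 'ZWNJ', 0x200D: 'ZWJ', 0x200E: 'LRM', 0x200F: 'RLM',
  0x202A: 'LRE', 0x202B: 'RLE', 0x202C: 'PDF', 0x202D: 'LRO', 0x202E: 'RLO',
  0xFEFF: 'BOM'
};

// Return the position and name of every listed character found in the text.
function findInvisible(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const name = INVISIBLE[text.charCodeAt(i)];
    if (name) hits.push({ index: i, name: name });
  }
  return hits;
}
```

An empty result for a line of array definitions would suggest the odd-looking brackets are only a display problem.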

But Notepad is a toy IMO.  Before you start losing your hair too quickly,
get yourself a real editor.  My favorite is the Frontpage text editor (I
never use its WYSIWYG, but the multi-lingual support in the editor is
decent.)  I use Frontpage 2003.  DreamWeaver's text editor is not bad
either, but I have had problems editing Persian text with it.

> 4. Notepad also deposits 3 junk characters at the top, an i with 2
> dots, 2 right-angle brackets and an upside-down question mark, however
> you can't
> *see* them while in Notepad so you have to open the file in another
> text program to delete them.  These 3 junk characters prevent the
> webpage from working in certain browsers.  Notepad is otherwise great
> because the latest one is WYSIWYG and makes Persian data-entry easy
> and these 3 junk chars are a small price to pay for that luxury.
> However, I'm open to a better tool.

Those are the BOM marks for UTF-8.  Notepad injects them under your nose,
and that's one of the reasons I avoid Notepad.  Frontpage text editor does
not have that problem.

A small note: what Notepad does here is *correct*, because it can instruct
other editors about the content encoding of the file.  It just doesn't work
with web documents, and that's expected, because Notepad has not been
designed for creating web documents.
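
For reference, the UTF-8 BOM is the three bytes EF BB BF, which is exactly what shows up as an i with two dots, a pair of right-angle brackets and an upside-down question mark when read as Latin-1.  A minimal sketch of removing it from a file's bytes before publishing (the function names are made up; it takes any array-like of byte values):

```javascript
// The UTF-8 encoding of U+FEFF is the three bytes EF BB BF.
const UTF8_BOM = [0xEF, 0xBB, 0xBF];

function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === UTF8_BOM[0] &&
    bytes[1] === UTF8_BOM[1] &&
    bytes[2] === UTF8_BOM[2];
}

// Return the bytes without a leading BOM; the input is left unmodified.
function stripUtf8Bom(bytes) {
  return hasUtf8Bom(bytes) ? bytes.slice(3) : bytes;
}
```

Only a BOM at the very start of the data is removed; anything later in the file is left alone.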

> 5.  Since "-raa" and "-o" are considered 2 separate words in Persian
> script but hook up to the previous word in Tajik script, I had to
> employ the ZWNJ just to have a one-to-one correspondence between
> languages for the purposes of this project. I was wishing I had
> Behdad's beloved U+202F, the Narrow No-Break Space for this operation!

You can leave them as they are, and handle them in