Project to solve Farsi contents problems on the Web

AmirBehzad Eslami Mon, 17 Nov 2003 14:46:27 -0800

In reply to:
http://lists.sharif.edu/pipermail/farsiweb/2003-November/000845.html (posted by Mr. Connie)
http://lists.sharif.edu/pipermail/farsiweb/2003-November/000844.html (posted by Roozbeh Pournader)
---------------

Dear Roozbeh and Mr. Connie (strange name!),

I'm glad to have found two professionals on this mailing list who understand how important the Farsi "Yeh" and "Keheh" problem is for the future of the Web and its Farsi content.
-----------------------------------------
I'm a programmer and a webmaster, so why do I prefer HTML Numeric Character References? Because they are the common way Web designers write these characters on their pages. Some Persian web designers don't know a thing about Unicode, UTF-8, or U+06CC!
Do you know why? Because they have never found a good article about such topics in magazines, in books, or even on FarsiWeb.info.

But I can refer to that Unicode character using Decimal Character References or Hexadecimal Character References. What about literal UTF-8 characters (if possible here)?

Anyway, as you wish, I'll use the U+xxxx notation :-)
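
To make the three notations concrete, here is a small Python sketch (Python is my choice for illustration; it is not part of the original discussion) that prints the decimal reference, the hexadecimal reference, and the literal UTF-8 bytes for Farsi Yeh (U+06CC), and checks that both references decode to the same character.

```python
import html

FARSI_YEH = "\u06cc"  # ARABIC LETTER FARSI YEH

# Build the two kinds of HTML numeric character references.
decimal_ref = f"&#{ord(FARSI_YEH)};"
hex_ref = f"&#x{ord(FARSI_YEH):X};"

# The same character written literally in a UTF-8 encoded page.
utf8_bytes = FARSI_YEH.encode("utf-8")

print(decimal_ref)  # &#1740;
print(hex_ref)      # &#x6CC;
print(utf8_bytes)   # b'\xdb\x8c'

# Both references decode back to the same single character.
assert html.unescape(decimal_ref) == html.unescape(hex_ref) == FARSI_YEH
```

All three forms name the same code point; they differ only in how many bytes they take in the file and in how readable the page source is.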
-----------------------------------------
How do we tell the 'User Agent' (including IE, Google.com's robot, etc.) that the document is in the Farsi language?

There are four well-known methods. You may use all of them in your web pages:

1) Using the 'Content-Language' header of the HTTP protocol.
[SAMPLE CODE:]
<?php
header("Content-Type: text/html; charset=UTF-8");
header("Content-Language: fa");
?>

2) Using [X]HTML's Meta Tags:

[SAMPLE CODE:]
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Language" content="fa" />

3) The 'lang' attribute on the <html> tag.
[SAMPLE CODE:]
<html lang="fa-IR">

For now, create a simple HTML page with <html lang="fa"> and open it in Mozilla.
Then right-click on an empty area of the page and choose the "Properties" item from the context menu. Enjoy how Mozilla detects the language!

4) The xml:lang attribute in XML and XHTML:
[SAMPLE CODE:]
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fa-IR">

Please note that the 'lang' attribute was removed in XHTML 1.1 in favor of 'xml:lang'. More info: http://www.w3.org/TR/xhtml11/changes.html
-----------------------------------------

OK, now we know how to declare the language of the whole document. User agents can detect our page's language very easily with the methods above.
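
As a rough illustration of the other side, here is a Python sketch of how a user agent might read these declarations back. The class name `LanguageSniffer`, the function `detect_language`, and the precedence order (attribute on <html>, then the meta tag, then the HTTP header) are my own assumptions for this example, not a description of how any real browser or robot works.

```python
from html.parser import HTMLParser


class LanguageSniffer(HTMLParser):
    """Hypothetical helper: records language declarations found in a page."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.meta_lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            # Methods 3 and 4: 'xml:lang' or 'lang' on the <html> tag.
            self.html_lang = attrs.get("xml:lang") or attrs.get("lang")
        elif tag == "meta":
            # Method 2: <meta http-equiv="Content-Language" ...>.
            if (attrs.get("http-equiv") or "").lower() == "content-language":
                self.meta_lang = attrs.get("content")


def detect_language(page, http_header=None):
    """Return the declared language, or None if nothing is declared.

    Assumed precedence: <html> attribute, then meta tag, then HTTP header.
    """
    sniffer = LanguageSniffer()
    sniffer.feed(page)
    sniffer.close()
    return sniffer.html_lang or sniffer.meta_lang or http_header


page = (
    '<html lang="fa-IR"><head>'
    '<meta http-equiv="Content-Language" content="fa" />'
    "</head><body></body></html>"
)
print(detect_language(page))  # fa-IR
```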

-----------------------------------------
Dear Roozbeh, before I sent this message to your mailing list, I had already thought about the problem of a Win9x user who tries to search for a keyword in a downloaded web page.
-----------------------------------------
As you know, Farsi Win9x systems are based on Arabic Windows. Before WinXP, Micro$oft never released an official version of Windows that really supported the Farsi language. All earlier versions were Arabic.

So many software companies in Iran extended Micro$oft's Arabic Windows to support the "Gach-Pazh" characters, but they never thought about the dotted "Yeh", so some users still type Arabic "Yeh" on their Farsi Win9x operating system. I mean, in many cases a Win9x user is not able to type Farsi "Yeh" (U+06CC). But there are some Farsi versions of Win9x that do support the dotless "Yeh".

Oops! What should we do now? How can we detect the client system's capabilities? As you know, we can only get information about the "Default User Language", "User's OS", "User Agent's Name", and "User Agent's Version".

We must decide now! What should we serve to Win9x users?
Serve them Arabic? Serve them clean Farsi? Serve them Far-Arabic, as used on BBCPersian.com? What should we serve to search engines? How should we save the data in our databases?
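
For the database question, one possible answer is to store a single clean form and convert on the way in. The Python sketch below assumes we choose Farsi "Yeh" (U+06CC) and "Keheh" (U+06A9) as the stored form; the function name `normalize_farsi` is hypothetical. Note that it would also rewrite genuine Arabic text, which this sketch ignores.

```python
# Map the Arabic letters that Win9x users tend to type to their Farsi forms.
ARABIC_TO_FARSI = str.maketrans({
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize_farsi(text: str) -> str:
    """Return text with Arabic Yeh/Kaf replaced by Farsi Yeh/Keheh."""
    return text.translate(ARABIC_TO_FARSI)


# A word typed with Arabic Kaf and Arabic Yeh on an Arabic-based keyboard.
typed = "\u0643\u062a\u0627\u0628\u064a"
stored = normalize_farsi(typed)
print(stored == "\u06a9\u062a\u0627\u0628\u06cc")  # True
```

If search keywords go through the same function before they are compared with the stored data, a keyword typed with Arabic letters would still match.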

Of course, there is no ultimate way to solve all of the above problems with Farsi content. We can't find the answer to every question, but we'll try. Anyway, we should specify a standard for publishing Farsi on the Web; otherwise "Gave-moon Pas-Fardaa Mizaa-E" (literally "our cow gives birth the day after tomorrow", i.e. we will be in trouble soon).

As a webmaster, I believe it's the Win9x users' problem! They must upgrade to a standard system that completely supports Farsi. However, as a Web usability principle says, we should provide them with an alternative: since they are still using Arabic-BASED Windows, we serve them Arabic.
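
A minimal Python sketch of this "serve them Arabic" idea follows. It assumes the only thing we look at is the User-Agent string, and that Win9x browsers identify themselves with tokens such as "Windows 98", "Win95", or "Win 9x"; the pattern and the function names are my own guesses for illustration and are surely not a complete detector.

```python
import re

# Assumed User-Agent tokens for Windows 95/98/ME; not an exhaustive list.
WIN9X_PATTERN = re.compile(r"Windows 9[58]|Win9[58]|Win 9x", re.IGNORECASE)

# Reverse mapping: Farsi letters back to the Arabic ones Win9x can handle.
FARSI_TO_ARABIC = str.maketrans({
    "\u06cc": "\u064a",  # ARABIC LETTER FARSI YEH -> ARABIC LETTER YEH
    "\u06a9": "\u0643",  # ARABIC LETTER KEHEH -> ARABIC LETTER KAF
})


def is_win9x(user_agent: str) -> bool:
    """Guess from the User-Agent string whether the client runs Win9x."""
    return bool(WIN9X_PATTERN.search(user_agent))


def text_for_client(farsi_text: str, user_agent: str) -> str:
    """Serve Arabic letters to Win9x clients and clean Farsi to everyone else."""
    if is_win9x(user_agent):
        return farsi_text.translate(FARSI_TO_ARABIC)
    return farsi_text


ua_98 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
ua_xp = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(text_for_client("\u06a9\u06cc", ua_98) == "\u0643\u064a")  # True
print(text_for_client("\u06a9\u06cc", ua_xp) == "\u06a9\u06cc")  # True
```

The content itself stays stored in one form; only the copy sent to a Win9x client is converted.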

What do you think?

P.S.
Guys, I know there are a lot of problems with Farsi content.
I think it would be better to solve them step by step. So I'm not talking here about the "Arabic text within a Farsi web page" problem, about what search engines should do while indexing Farsi pages, etc.

_______________________________________________
FarsiWeb mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/farsiweb
