> There's a difference between the C++ standard and web standards:
> writing non-standard C++ code only produces compile-time problems,
> but if you do manage to compile the code, it works correctly
> (or is supposed to).

Well, that's not exactly so.  Some non-conformant code tends to produce
(possibly subtle) differences in runtime behavior.  But I see what your
point is.

> But the web is quite a different case.
> 30-40 percent is low enough to be ignored, considering that the other
> way you are sacrificing the other 60-70% by not being able to find
> the document by searching in Google.  And note that even with Win9x
> and a recent IE, and updated fonts, there's no problem.

I'd definitely do so if the Google search problem couldn't be solved.  But
I've been using the method I mentioned in my other post to solve that
problem as well.  It was the best way I could think of to get the best of
both worlds, but I'm wide open to suggestions for improving the idea.

> About using HTML entities, no matter what the encoding of the page is,
> HTML entities generate Unicode characters.

They do on most browsers, but browsers are not required to do so.  Consider
a browser which can't handle UTF-8 well (or at all).
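
To make the entity point concrete, here is a minimal sketch in Python (my
choice of language for illustration; it says nothing about how any particular
browser behaves).  It only shows that a numeric character reference names a
Unicode code point, so decoding it does not depend on the byte encoding of
the page.  Whether a given browser can then display that character is a
separate question.

```python
import html

# Numeric character references name Unicode code points directly, so the
# decoded result is the same whatever encoding the page bytes were stored in.
persian_yeh = html.unescape("&#1740;")  # decimal form of U+06CC (Persian Yeh)
persian_keh = html.unescape("&#x6A9;")  # hexadecimal form of U+06A9 (Persian Keh)

print(hex(ord(persian_yeh)), hex(ord(persian_keh)))
```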

> It's quite common to see
> people exporting Persian documents from MS Word, and getting an HTML
> page encoded in the MS Arabic encoding, with Persian Yeh and Keh
> encoded as HTML entities.

Yes, and that will make their documents even more difficult for search
engines to index.  And of course, I'd argue that CP1256/ISO-8859-6 is not
suitable for Persian documents at all, but that's perhaps another story.
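
A rough sketch of what happens in such an export, again in Python.  The
function name export_like_word is made up for illustration, and I'm only
imitating the effect with Python's "xmlcharrefreplace" error handler, not
claiming this is what MS Word does internally.  Characters that the legacy
encoding cannot represent come out as numeric entities, so the word no longer
appears as a contiguous run of letters in the page source.

```python
def export_like_word(text: str, encoding: str) -> bytes:
    """Encode text in a legacy code page, writing characters the code page
    lacks as numeric character references (hypothetical helper)."""
    return text.encode(encoding, errors="xmlcharrefreplace")


word = "\u0627\u06cc\u0631\u0627\u0646"  # "Iran", spelled with Persian Yeh (U+06CC)

page_bytes = export_like_word(word, "cp1256")

# An indexer that decodes the bytes but does not expand entities sees the
# word broken up by "&#1740;", so a plain substring match fails.
page_text = page_bytes.decode("cp1256")
print(page_text)
print(word in page_text)
```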

> PS.  BTW, I just found that using Harakat (kasre, fathe, ...) also
> prevents a hit in Google search :(.  That's quite expected, but perhaps
> I should reconsider my habit of putting those tiny marks everywhere.

That's another sad fact.  I really think that Google should seriously
consider handling such details in their indexing process.  That's also one
of the things that AriaSearch.com handles.
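
For what it's worth, the kind of normalization I have in mind can be sketched
in a few lines of Python.  This is only an illustration under my own
assumptions: I don't know how Google or AriaSearch.com actually implement
indexing, the name strip_harakat is made up, and I'm taking "Harakat" to mean
the marks U+064B through U+0652.  The idea is to apply the same stripping to
both the indexed text and the query so that the two match.

```python
# Assumption: Harakat = the Arabic marks U+064B..U+0652
# (fathatan, dammatan, kasratan, fatha, damma, kasra, shadda, sukun).
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_harakat(text: str) -> str:
    """Return text with the Harakat marks removed (hypothetical helper)."""
    return "".join(ch for ch in text if ch not in HARAKAT)


with_marks = "\u06a9\u0650\u062a\u0627\u0628"  # "ketab" written with a kasra
plain = "\u06a9\u062a\u0627\u0628"             # the same word without marks

# Normalizing both sides lets a mark-free query find the marked text.
print(strip_harakat(with_marks) == strip_harakat(plain))
```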

---

Hmmm, now that we're here, how about gathering some volunteers who can work
with Google to fix some of these problems?  In the past, I've contacted
Google on a number of occasions about small problems in their services, and
they seemed quite willing to fix them.  Hopefully we could have a more
Persian-friendly Google in the future this way.

If you feel that this is a good idea, I'd be pleased to take part in such a
team.  Comments?

-------------
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]



_______________________________________________
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
