*** JUUICHIKETAJIN ***







----- Original Message -----
From: <[EMAIL PROTECTED]>
To: "James Kass" <[EMAIL PROTECTED]>
Sent: Friday, May 04, 2001 4:46 PM
Subject: Re: Searchable web page ?!!


> I don't know about other search engines, but the way
> Google handles some charsets makes me think it is a
>
> U+3070 U+304B

ばか is Japanese for fool or idiot.  "Vaca" is pronounced
about the same and is the Spanish word for "cow".  Guess
cows aren't very smart.  A lot of times on Google, the
description for the page found says something like
"this page contains characters that can't be displayed
in the current character set...", which is kind of dumb,
because all they would have to do at Google is serve the
results page in Unicode!

>
> and if it case folds, why not kana fold?
> Are search engines sensitive to characters, or only
> byte sequences? I mean, can it tell that -- OK, let's
> pick a good one -- U+304D and SJIS-82AB are the same
> thing?
>
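
As a rough illustration of the question above (not a description of
how Google or any other search engine works): if an indexer decodes
each page to Unicode before comparing, the Shift_JIS bytes 82 AB and
U+304D become the same character, and a kana fold can then sit on top
of that. The function name kana_fold and the choice to fold katakana
down to hiragana are assumptions made for this sketch.

```python
import unicodedata

def kana_fold(text: str) -> str:
    """Hypothetical kana fold: map katakana (full- or halfwidth) to hiragana."""
    # NFKC turns halfwidth katakana into the ordinary fullwidth forms.
    text = unicodedata.normalize("NFKC", text)
    folded = []
    for ch in text:
        cp = ord(ch)
        # Katakana U+30A1..U+30F6 sit exactly 0x60 above their hiragana twins.
        if 0x30A1 <= cp <= 0x30F6:
            cp -= 0x60
        folded.append(chr(cp))
    return "".join(folded)

# Same character, two byte sequences: compare after decoding, not before.
sjis_ki = bytes([0x82, 0xAB]).decode("shift_jis")
utf8_ki = b"\xe3\x81\x8d".decode("utf-8")
same_character = sjis_ki == utf8_ki == "\u304d"
```

Here same_character comes out True, and kana_fold gives the same
result for hiragana, katakana, and halfwidth katakana spellings of a
word, so a search for one would match the others.
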
> A big problem might be languages like Greek which use
> the second half of the possible byte list.
>
> Are the search engines smart enough to tell an alpha
> is an alpha is an alpha?
>
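
The Greek case can be sketched the same way. This assumes the page
declares its charset correctly and that matching is done on decoded,
case-folded text; fold_key is a made-up helper name, and nothing here
is claimed about what any real search engine does.

```python
def fold_key(raw: bytes, charset: str) -> str:
    """Hypothetical index key: decode with the declared charset, then case fold."""
    return raw.decode(charset).casefold()

# Small alpha is one byte in the upper half of ISO-8859-7 ...
alpha_greek = fold_key(b"\xe1", "iso8859_7")
# ... and two bytes in UTF-8.
alpha_utf8 = fold_key(b"\xce\xb1", "utf-8")
# Capital alpha in ISO-8859-7 folds to the same key.
cap_alpha_greek = fold_key(b"\xc1", "iso8859_7")

alpha_is_alpha = alpha_greek == alpha_utf8 == cap_alpha_greek == "\u03b1"
```

Compared as raw bytes, E1 and CE B1 have nothing in common; compared
as characters they are both U+03B1, so an alpha is an alpha only if
the engine works on characters.
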

I don't know very much about the search engines, and I wonder
if you meant to send this letter to the Unicode list?

With best regards,

James Kass.





