*** JUUICHIKETAJIN ***
----- Original Message -----
From: <[EMAIL PROTECTED]>
To: "James Kass" <[EMAIL PROTECTED]>
Sent: Friday, May 04, 2001 4:46 PM
Subject: Re: Searchable web page ?!!

> I don't know about other search engines, but the way
> Google seems to handle some charsets seems to make
> me think it is a
>
> U+3070 U+304B

ばか is Japanese for fool or idiot. "Vaca" is pronounced
about the same and is the Spanish word for "cow". Guess
cows aren't very smart.

A lot of times on Google, the description for the page found
says something like "this page contains characters that can't
be displayed in the current character set...", which is kind
of dumb because all they would have to do at Google is make
the character set Unicode!

>
> and if it case folds, why not kana fold?
> Are search engines sensitive to characters, or only
> byte sequences? I mean, can it tell that -- OK, let's
> pick a good one -- U+304D and SJIS-82AB are the same
> thing?
>
> A big problem might be languages like Greek which use
> the second half of the possible byte list.
>
> Are the search engines smart enough to tell an alpha
> is an alpha is an alpha?
>

I don't know very much about the search engines, and I wonder
if you meant to send this letter to the Unicode list?

With best regards,
James Kass.
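
For what it's worth, the question quoted above (can a search engine tell that U+304D and SJIS-82AB are the same thing, and why not kana fold if it case folds?) can be sketched in a few lines of Python. This is only an illustration of comparing characters rather than byte sequences, not a description of how Google or any other engine works. The function names `to_text`, `kana_fold`, and `search_key` are made up for this example, and the hiragana-to-katakana rule is an assumed folding policy.

```python
import unicodedata


def to_text(raw: bytes, charset: str) -> str:
    """Decode bytes using the page's declared charset, then apply NFKC
    so compatibility forms (e.g. half-width katakana) are unified."""
    return unicodedata.normalize("NFKC", raw.decode(charset))


def kana_fold(text: str) -> str:
    """Assumed folding rule: shift hiragana U+3041..U+3096 onto the
    matching katakana, which sit exactly 0x60 code points higher."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def search_key(raw: bytes, charset: str) -> str:
    """Hypothetical index key: decode, kana fold, then case fold."""
    return kana_fold(to_text(raw, charset)).casefold()


# Shift_JIS bytes 82 AB and the code point U+304D are the same character.
same_ki = to_text(b"\x82\xab", "shift_jis") == "\u304d"

# A Greek capital alpha in ISO-8859-7 (byte C1, in the upper half of the
# byte range) and a small alpha in UTF-8 end up with the same key.
same_alpha = search_key(b"\xc1", "iso8859_7") == search_key(b"\xce\xb1", "utf-8")

print(same_ki, same_alpha)
```

The point of the sketch is that once every page is decoded to Unicode before indexing, the original byte values stop mattering; whether kana folding is actually desirable for search is a separate policy question.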

