Hi,

From: [EMAIL PROTECTED] (Denis Barbier)
Subject: Re: enable searching East Asian words at search.debian.org
Date: Mon, 12 May 2003 13:45:08 +0200

> > For example, I can search a Russian word "Novosti" (of course in
> > Cyrillic)
>
> The point is: how are Cyrillic words passed by the web browser to the
> search engine?
> Are they encoded in ISO-8859-5, KOI8-R or UTF-8 charsets?

UTF-8, i.e., the same encoding as the search page. The previous
example produces this URL:

http://search.debian.org/?q=%D0%9D%D0%BE%D0%B2%D0%BE%D1%81%D1%82%D0%B8&ps=10&o=0&m=all&g=

The first 6 bytes read:

%D0%9D -> U+041D (CYRILLIC CAPITAL LETTER EN)
%D0%BE -> U+043E (CYRILLIC SMALL LETTER O)
%D0%B2 -> U+0432 (CYRILLIC SMALL LETTER VE)

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
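
P.S. For anyone who wants to check the byte-to-code-point mapping themselves, here is a minimal Python sketch. It only decodes the query string of the URL quoted in this mail, assuming the browser sent UTF-8. It says nothing about how search.debian.org handles the query internally, and the helper name decode_query_word is made up for illustration.

```python
import unicodedata
from urllib.parse import parse_qs, urlsplit

# The search URL quoted in this mail.
URL = ("http://search.debian.org/"
       "?q=%D0%9D%D0%BE%D0%B2%D0%BE%D1%81%D1%82%D0%B8&ps=10&o=0&m=all&g=")


def decode_query_word(url, param="q", encoding="utf-8"):
    """Return the percent-decoded value of `param` from the URL's query.

    Hypothetical helper: errors="strict" makes decoding fail loudly if the
    bytes are not valid in the assumed encoding.
    """
    query = urlsplit(url).query
    values = parse_qs(query, encoding=encoding, errors="strict")
    return values[param][0]


word = decode_query_word(URL)

# Print each character's UTF-8 bytes, code point, and Unicode name.
for ch in word:
    utf8_hex = ch.encode("utf-8").hex().upper()
    print(f"{utf8_hex} -> U+{ord(ch):04X} ({unicodedata.name(ch)})")
```

Passing encoding="koi8-r" or encoding="iso8859-5" instead decodes the same bytes under those charsets, which is a quick way to see which reading gives the intended word.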

