Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode
AmirBehzad Eslami <[EMAIL PROTECTED]> wrote on 24/11/2005 17:48:29:

> Dear list,
>
> I'm considering programming a simple "Search Engine" for a website,
> to find Arabic/Persian data within a MySQL database.
> This database contains a huge amount of data, encoded with Unicode (UTF-8).
>
> The big deal is to ** reduce the response time ** to end-users.
>
> My first solution is to create an Index and use the "FULL-TEXT
> Searching" method.
>
> Luckily, MySQL provides FULL-TEXT Indexing support in MyISAM tables.
> But unfortunately, it doesn't support multi-byte charsets (e.g.
> Unicode). [1]
> Technically, MySQL creates Indexes over words.
> A "word" is any sequence of characters consisting of letters and
> numbers [2].
>
> Assuming this, I tried to save the records as Unicode Character
> References (), but the search failed again :-(
>
> Any suggestion?
> I appreciate any solution to solve this problem.
>
> Thanks in Advance,
> Behzad
>
> [1] MySQL Manual -> 6.8.3 Full-text Search TODO
> [2] MySQL Manual -> 6.8 MySQL Full-text Search
>
> P.S. *** I use MySQL 4.0 ***

I think this is your problem: MySQL does not properly support Unicode until version 4.1. I am successfully using FullText with MySQL 4.1 to sort UTF-8 encoded Japanese text. I see no reason why it should not work for Arabic - if you upgrade.

Alec

___
PersianComputing mailing list
PersianComputing@lists.sharif.edu
http://lists.sharif.edu/mailman/listinfo/persiancomputing
Thank goodness.
Thanks a lot for everything.
Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode
[EMAIL PROTECTED] wrote:

> AmirBehzad Eslami <[EMAIL PROTECTED]> wrote on 24/11/2005 17:48:29:
>
> > Dear list,
> >
> > I'm considering programming a simple "Search Engine" for a website,
> > to find Arabic/Persian data within a MySQL database.
> > This database contains a huge amount of data, encoded with Unicode (UTF-8).
> >
> > The big deal is to ** reduce the response time ** to end-users.
> >
> > My first solution is to create an Index and use the "FULL-TEXT
> > Searching" method.
> >
> > Luckily, MySQL provides FULL-TEXT Indexing support in MyISAM tables.
> > But unfortunately, it doesn't support multi-byte charsets (e.g.
> > Unicode). [1]
> > Technically, MySQL creates Indexes over words.
> > A "word" is any sequence of characters consisting of letters and
> > numbers [2].
> >
> > Assuming this, I tried to save the records as Unicode Character
> > References (), but the search failed again :-(
> >
> > Any suggestion?
> > I appreciate any solution to solve this problem.
> >
> > Thanks in Advance,
> > Behzad
> >
> > [1] MySQL Manual -> 6.8.3 Full-text Search TODO
> > [2] MySQL Manual -> 6.8 MySQL Full-text Search
> >
> > P.S. *** I use MySQL 4.0 ***
>
> I think this is your problem: MySQL does not properly support Unicode
> until version 4.1. I am successfully using FullText with MySQL 4.1 to
> sort UTF-8 encoded Japanese text. I see no reason why it should not
> work for Arabic - if you upgrade.
>
> Alec

But he solved the problem himself, with:

    mysql_query("SET NAMES utf8");

Even on 4.0.x.
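For readers wondering why the connection character set matters at all, here is a minimal Python sketch (not code from the thread). It assumes the server interprets incoming bytes as latin1, one byte per character, when it has not been told otherwise; the sample word is arbitrary.

```python
# Illustration only: an arbitrary Persian sample word, "salaam".
word = "\u0633\u0644\u0627\u0645"  # سلام

# In UTF-8 each of these four letters is stored as two bytes.
utf8_bytes = word.encode("utf-8")
print(len(word), len(utf8_bytes))  # 4 8

# Assumption: the receiving side reads the same bytes as latin1,
# so it sees eight unrelated single-byte characters, not four letters.
misread = utf8_bytes.decode("latin-1")
print(len(misread))  # 8

# The bytes themselves are intact; only the interpretation was wrong.
recovered = misread.encode("latin-1").decode("utf-8")
print(recovered == word)  # True
```

This only shows the byte-level mismatch. Whether a given MySQL version can then index such text for full-text search is a separate question, which is what the thread is arguing about.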
Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode
Mohsen wrote:
> But himself solved his problem.
> with : mysql_query("SET NAMES utf8");
> Even 4.0.x

Wrong. I decided to prepare two different versions of my software:

- A MySQL 4.0-friendly version using the Romanizing method (hats off to you, Ehsan)
- A MySQL 4.1-compatible version.

The code you mentioned belongs to the second version.

"SET NAMES indicates what is in the SQL statements that the client sends. Thus, SET NAMES 'cp1251' tells the server future incoming messages from this client are in character set cp1251. It also specifies the character set for results that the server sends back to the client. (For example, it indicates what character set column values are if you use a SELECT statement.)"

MySQL Manual 4.1 -> 10.3.6. Connection Character Sets and Collations.

Kind Regards,
Behzad
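The thread names the Romanizing method but does not describe it. What follows is only a guess at the general idea, sketched in Python: store an ASCII transliteration next to the original text so that a single-byte full-text index sees ordinary letters and digits. The `ROMAN` table and the `romanize` function are hypothetical, and the table is deliberately incomplete.

```python
# Hypothetical, incomplete letter-to-ASCII table (an assumption, not
# the mapping actually used in the thread).
ROMAN = {
    "\u0627": "a",  # ا
    "\u0628": "b",  # ب
    "\u062a": "t",  # ت
    "\u0633": "s",  # س
    "\u0644": "l",  # ل
    "\u0645": "m",  # م
    "\u06a9": "k",  # ک
}


def romanize(text: str) -> str:
    """Return an ASCII-only form of text for single-byte indexing."""
    out = []
    for ch in text:
        if ch in ROMAN:
            out.append(ROMAN[ch])
        elif ch.isascii():
            # ASCII letters, digits, spaces, punctuation pass through.
            out.append(ch)
        else:
            # Unmapped letters become an alphanumeric code, so they
            # still count as part of a "word" (letters and numbers).
            out.append("u%04x" % ord(ch))
    return "".join(out)


print(romanize("\u0633\u0644\u0627\u0645"))  # slam
print(romanize("\u06a9\u062a\u0627\u0628 mysql"))  # ktab mysql
```

In such a scheme the search terms would have to pass through the same function before being matched, and the original UTF-8 text would be kept in a separate column for display.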