Dear list,
I'm considering programming a simple "Search Engine" for a website,
to find Arabic/Persian data within a MySQL database.
This database contains a huge amount of data, encoded with Unicode (UTF-8).
The main goal is to ** reduce the response time ** for end users.
My first solution is to create an Index and use the "FULL-TEXT Searching" method.
Luckily, MySQL provides FULL-TEXT Indexing support in MyISAM tables.
But unfortunately, it doesn't support multi-byte charsets (e.g. Unicode). [1]
Technically, MySQL creates Indexes over words.
A "word" is any sequence of characters consisting of letters and numbers [2].
Assuming this, I tried to save the records as Unicode Character References (&#xxxx;), but the search failed again :-(
Any suggestions?
I would appreciate any solution to this problem.
Thanks in advance,
Behzad
[1] MySQL Manual -> 6.8.3 Full-text Search TODO
[2] MySQL Manual -> 6.8 MySQL Full-text Search
P.S.
I use MySQL 4.0
1) Table Structure
CREATE TABLE `articles` (
`article_id` int(10) unsigned NOT NULL auto_increment,
`article_title` NATIONAL varchar(255) NOT NULL default '',
`article_text` text NOT NULL,
PRIMARY KEY (`article_id`),
FULLTEXT (`article_title`,`article_text`)
) TYPE=MyISAM ;
ALTER TABLE `articles` CHARACTER SET utf8;
2) SQL-Query to Perform a Full-text search
SELECT * FROM articles WHERE MATCH(article_title, article_text) AGAINST('سوال')
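3) Sketch of the character-reference attempt (illustration only)
The idea of storing the text as Unicode Character References can be sketched in Python as below. This is a minimal sketch, not the code used on the site: the function names to_char_refs and split_words are made up for this example, and split_words is only a rough stand-in for the "letters and numbers" rule quoted from the manual [2], not the actual MyISAM parser.

```python
import re


def to_char_refs(text):
    """Replace every non-ASCII character with a decimal reference (&#NNNN;)."""
    # 'xmlcharrefreplace' is a standard Python codec error handler.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def split_words(text):
    """Assumed word rule: a word is a run of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


encoded = to_char_refs("سوال")
print(encoded)               # the stored form of the search term
print(split_words(encoded))  # the pieces an indexer using this rule would see
```

Under this assumed rule, the characters '&', '#' and ';' act as separators, so a single Persian word is broken into one number per letter instead of being kept as one word. Whether MySQL splits the text in exactly this way is not confirmed here.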
_______________________________________________ PersianComputing mailing list PersianComputing@lists.sharif.edu http://lists.sharif.edu/mailman/listinfo/persiancomputing