Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-12-09 Thread AmirBehzad Eslami
Behdad Esfahbod wrote:

 That's the tricky part, or where the runtime-hell comes in.  What
 I did was to write a small Java program based on the samples in
 Lucene to connect to my database and feed the data into Lucene.
 At search time, I have another little Java program that takes the
 query string from the command line and prints out search results to
 standard output.  My PHP script then just fires up a shell script
 that in turn runs the Java program, piping the output into PHP...

Knowledge is Power. (Alvin Toffler)

That's a wonderful architecture. It seems that I was blind before
reading your e-mail. I had never thought about the power of the shell before,
or about using it as an interface to talk to Java. I like your point of
view. Very interesting!

Thank you very much for sharing the source code!

Behzad
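A minimal sketch of the search-time half of the architecture quoted above: a small Java program that takes the query string from the command line and prints results to standard output, one per line, so that a shell script can pipe them into PHP. This is not Behdad's code and it does not use Lucene. The class name SearchCli, the in-memory ARTICLES map and the substring match are hypothetical stand-ins for a real index lookup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SearchCli {
    // Hypothetical stand-in for the search index: article id -> article text.
    // Persian strings are written as \u escapes so the source stays ASCII.
    static final Map<Integer, String> ARTICLES = new LinkedHashMap<>();
    static {
        ARTICLES.put(1, "\u0633\u0644\u0627\u0645 \u062F\u0646\u06CC\u0627");
        ARTICLES.put(2, "full-text search in MySQL");
        ARTICLES.put(3, "\u062F\u0646\u06CC\u0627 \u06A9\u0648\u0686\u06A9 \u0627\u0633\u062A");
    }

    // Returns the ids of all articles whose text contains the query string.
    // A real implementation would query the index here instead.
    static List<Integer> search(String query) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : ARTICLES.entrySet()) {
            if (entry.getValue().contains(query)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java SearchCli <query>");
            System.exit(1);
        }
        // One article id per line, so the caller can split the output on newlines.
        for (int id : search(args[0])) {
            System.out.println(id);
        }
    }
}
```

Printing only ids keeps the output format trivial to parse on the PHP side, which can then fetch the matching rows from the database itself.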
PersianComputing mailing list

Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-30 Thread AmirBehzad Eslami
Dear Behdad,

On 25 Nov 2005, you wrote:

 Another option is to get yourself a real search engine, like Apache Lucene.
 I've written up my experience using that here:

You always offer the most brilliant solutions!!

Unfortunately, I have no experience with this method, but I'm still eager.
I read your weblog and visited the Apache Lucene homepage. I'm impressed.
Would you tell us how you have integrated this Java-driven package with PHP at ?!!
It works really fast.

Thanks in advance,
Behzad

Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-27 Thread AmirBehzad Eslami
Mohsen wrote:

 But he solved his problem himself
 with: mysql_query("SET NAMES utf8");
 Even on 4.0.x

Wrong. I decided to prepare two different versions of my software:
  - A MySQL 4.0-friendly version using the romanizing method (hats off to you, Ehsan)
  - A MySQL 4.1-compatible version.

The code you mentioned belongs to the second version.

"SET NAMES indicates what is in the SQL statements that the client
sends. Thus, SET NAMES 'cp1251' tells the server “future incoming
messages from this client are in character set cp1251.” It also
specifies the character set for results that the server sends back to
the client. (For example, it indicates what character set column values
are if you use a SELECT statement.)"
MySQL Manual 4.1 - 10.3.6. Connection Character Sets and Collations.

Kind Regards,
Behzad
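The thread does not spell out the romanizing method, so the following is only one possible reading of the idea, not Ehsan's actual scheme: rewrite every non-ASCII character as ASCII letters and digits before storing it, so that a MySQL 4.0 full-text index sees each Persian word as an ordinary alphanumeric "word". The class name Romanizer and the "x" plus four hex digits encoding are assumptions made for illustration.

```java
public class Romanizer {
    // Hypothetical encoding: every non-ASCII code point becomes 'x' followed by
    // its hexadecimal value, zero-padded to at least four digits. ASCII
    // characters, including the spaces that separate words, pass through
    // unchanged, so word boundaries survive the conversion.
    static String romanize(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append('x').append(String.format("%04x", cp));
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Romanize each command-line argument and print it on its own line.
        for (String arg : args) {
            System.out.println(romanize(arg));
        }
    }
}
```

Under this scheme the same function would have to be applied both to the text before it is inserted and to the user's query before it is placed in AGAINST(...), so that the stored and searched forms agree. The encoding is one-way as written: ASCII text that already contains sequences such as "x0633" cannot be told apart from encoded Persian.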

MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-24 Thread AmirBehzad Eslami
Dear list,

I'm considering programming a simple "Search Engine" for a website,
to find Arabic/Persian data within a MySQL database.
This database contains a huge amount of data, encoded in Unicode (UTF-8).
The main goal is to ** reduce the response time ** for end users.

My first solution is to create an index and use the "FULL-TEXT Searching" method.
Luckily, MySQL provides FULL-TEXT indexing support in MyISAM tables.
But unfortunately, it doesn't support multi-byte charsets (e.g. Unicode). [1]
Technically, MySQL creates indexes over words.
A "word" is any sequence of characters consisting of letters and numbers. [2]

Assuming this, I tried to save the records as Unicode character references (&#NNNN;), but the search failed again :-(

Any suggestions? I would appreciate any solution to this problem.

Thanks in advance,
Behzad

[1] MySQL Manual - 6.8.3 Full-text Search TODO
[2] MySQL Manual - 6.8 MySQL Full-text Search

P.S. I use MySQL 4.0

1) Table structure

  CREATE TABLE `articles` (
    `article_id` int(10) unsigned NOT NULL auto_increment,
    `article_title` NATIONAL varchar(255) NOT NULL default '',
    `article_text` text NOT NULL,
    PRIMARY KEY (`article_id`),
    FULLTEXT (`article_title`,`article_text`)
  ) TYPE=MyISAM;

  ALTER TABLE `articles` CHARACTER SET utf8;

2) SQL query to perform a full-text search

  SELECT * FROM articles WHERE MATCH(article_title, article_text) AGAINST('')

[persiancomputing] Thesis: Web Content Mining - Problems of Persian Websites

2005-06-04 Thread AmirBehzad Eslami

I'm still around ;-)
How's everybody?

I found something on the net that I guess many people on the list might be
interested in.

Thesis: Web Content Mining - Problems of Persian 


PDF (Full 
1,896 Kbytes


Re: WEFT webpage font embedding--Call for feedback

2004-05-07 Thread AmirBehzad Eslami
Dear Connie,

Like you, I use WinXP and IE 6.0. I'm sorry to say that I can't help you on
the other platforms.

But take a look at , which provides good services
for web page testing on multiple platforms and different browsers.
Hope this helps :-)

Please inform us about the result.


- Original Message -
Sent: Friday, May 07, 2004 11:01 AM
Subject: WEFT webpage font embedding--Call for feedback

 We've had a few discussions about WEFT in the past but never really
 explored it completely.  Therefore, I made this demo page in both
 English and Persian and embedded Tahoma, Koodak(by FarsiWeb) and Arabic

 Can you please check if Weft has worked? Do you see my fonts correctly?
 Is the Yeh (medial form) showing up correctly in all fonts, especially on
 Win98? Is the load time any longer than usual? If you have the old, buggy
 Tahoma font, is my corrected font showing up instead?  If you have the old
 Sinasoft or Borna Koodak, is my FarsiWeb Koodak showing up?

 Please report your findings! Be sure to mention which version of Windows
 and IE. By the way, you have to uninstall these fonts if you have them,
 otherwise, the test is not too helpful :)

 As you may know, Weft only works on Windows and IE so don't bother to
 check on anything else.  Also please don't look at the source code! I was
 in a great hurry and yes, it's a mess.  Anyone who is qualified is welcome
 to redo it if it is too unbearable.  I would appreciate that!


Using of U+066C as a number-separator

2004-01-08 Thread AmirBehzad Eslami

Thanks to Connie who convinced me.
It seems that using U+066C is the best 

But don't you think the shape of U+066C is very 
similar to the signs for 'foot' and 'minute'? (

