Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-12-09 Thread AmirBehzad Eslami
Behdad Esfahbod wrote:

  That's the tricky part, or where the runtime-hell comes in.  What
  I did was to write a small java program based on the samples in
  Lucene to connect to my database and feed the data into Lucene.
  At search time, I have another little Java program that takes the
  query string from command line and prints out search results to
  standard output.  My PHP script then just fires up a shell script
  that in turn runs the Java program, piping the output into PHP...

Knowledge is Power. (Alvin Toffler)

That's a very wonderful architecture. It seems that I was blind before
reading your e-mail. I have never thought about "shell" power before,
and using it as an interface to talk with Java. I like your point of
view. Very Interesting!

Thank you very much for sharing the source code!

Behzad
	
___
PersianComputing mailing list
PersianComputing@lists.sharif.edu
http://lists.sharif.edu/mailman/listinfo/persiancomputing


Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-12-04 Thread Behdad Esfahbod
On Wed, 30 Nov 2005, AmirBehzad Eslami wrote:

 Dear Behdad,

   On 25 Nov 2005, you wrote:

  Another option is to get yourself a real search engine, like
  Apache Lucene. I've written my experience using that here:
   
  http://mces.blogspot.com/2005/04/on-lucene-and-its-decency.html

 You always offer the most brilliant solutions!!
 Unfortunately, I have no experience with this method. But I'm still eager.
 I read your weblog and visited the Apache Lucene homepage.

   I'm impressed. Would you tell us how you have integrated this
 Java-driven package with PHP at http://rira.ir/ ?!!  It works
 really fast.

That's the tricky part, or where the runtime-hell comes in.  What
I did was to write a small java program based on the samples in
Lucene to connect to my database and feed the data into Lucene.
At search time, I have another little Java program that takes the
query string from command line and prints out search results to
standard output.  My PHP script then just fires up a shell script
that in turn runs the Java program, piping the output into PHP...
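
The calling side of the arrangement described above can be sketched
roughly as follows. This is not the rira.ir code: Python stands in for
the PHP script, and `run_search` and the example command in the
docstring are hypothetical names. The sketch assumes the search program
prints one result per line, encoded as UTF-8.

```python
import subprocess


def run_search(command, query, timeout=10.0):
    """Run an external search program and return its non-empty output lines.

    `command` is the program plus its fixed arguments as a list, for
    example ["java", "-cp", "search.jar", "SearchMain"] (hypothetical).
    The query is appended as one extra argument, so it never passes
    through a shell and needs no quoting.
    """
    completed = subprocess.run(
        command + [query],
        capture_output=True,   # collect stdout and stderr instead of inheriting them
        encoding="utf-8",      # assumption: the search program writes UTF-8
        timeout=timeout,       # do not let a hung search block the web request
        check=True,            # raise CalledProcessError on a non-zero exit code
    )
    return [line for line in completed.stdout.splitlines() if line.strip()]
```

Passing the query as its own list element, rather than pasting it into
a shell command string, avoids quoting and injection problems when the
query comes from a web form. The cost of this design is one process
start per search, which is the "runtime-hell" trade-off mentioned above.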

I don't have access to the Java code at this time, but the PHP
code involved is available here:

  
http://cvs.sourceforge.net/viewcvs.py/rira/rira/php/page/search.php?rev=1.1.1.1&view=log


If you are developing in .NET, there is a functional port of
Lucene to .NET too.  There is even a port of an older version of
it to Python.

BTW, you need to make sure you compile it with Unicode turned on.
I don't quite remember the details, but there were some.  I also
have a Persian class written for it, but it didn't do much
anyway.  In a few weeks I will get access to rira.ir server and
hopefully move the site to the above sf.net project, so you can
see what's inside.

 Thanks in advance,
   Behzad

Cheers,

--behdad
http://behdad.org/

Commandment Three says Do Not Kill, Amendment Two says Blood Will Spill
-- Dan Bern, New American Language


Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-30 Thread AmirBehzad Eslami
Dear Behdad,

On 25 Nov 2005, you wrote:

  Another option is to get yourself a real search engine, like
  Apache Lucene. I've written my experience using that here:

  http://mces.blogspot.com/2005/04/on-lucene-and-its-decency.html

You always offer the most brilliant solutions!!
Unfortunately, I have no experience with this method. But I'm still eager.
I read your weblog and visited the "Apache Lucene" homepage.

I'm impressed. Would you tell us how you have integrated this
Java-driven package with PHP at http://rira.ir/ ?!! It works
really fast.

Thanks in advance,
  Behzad


Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-28 Thread Ehsan Akhgari
  Dear Ehsan,

  You suggested a creative solution. Thank you.

  My application consists of a database and two user interfaces.

  The first UI is used for data entry, where I parse a given XML file,
  extract and "Romanize" its data - based on a "Persian-Roman
  Conversion Map" - and then insert it into the DB. Luckily, PHP
  provides a very fast function for such conversions, named strtr().
  Now I have a "Roman DB".

  The second UI is used for data retrieval (searching), where I
  "Romanize" the given search argument and look for it through the DB
  records. The results will be decoded and converted to Persian before
  being sent to stdout.

I've actually implemented this approach in a project. I have not yet
published the code, but if you want, I can make it available under the GPL.
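
The "Romanize" round trip described above can be sketched as follows.
This is an illustration only, not the code from either project: Python's
`str.translate` stands in for PHP's strtr(), and the four-letter
`PERSIAN_TO_ROMAN` map, `romanize` and `deromanize` are hypothetical
names made up for the example.

```python
# Hypothetical, deliberately tiny conversion map. A real map would cover
# the whole alphabet and decide how to treat digits and punctuation.
PERSIAN_TO_ROMAN = {
    "\u06a9": "k",  # KEHEH
    "\u062a": "t",  # TEH
    "\u0627": "a",  # ALEF
    "\u0628": "b",  # BEH
}

# The reverse map is only valid because every Roman code above is unique.
ROMAN_TO_PERSIAN = {roman: persian for persian, roman in PERSIAN_TO_ROMAN.items()}

# Build the translation tables once; characters not in a map pass through.
_TO_ROMAN = str.maketrans(PERSIAN_TO_ROMAN)
_TO_PERSIAN = str.maketrans(ROMAN_TO_PERSIAN)


def romanize(text):
    """Replace each mapped Persian letter with its Roman code."""
    return text.translate(_TO_ROMAN)


def deromanize(text):
    """Undo romanize() for text that went through the same map."""
    return text.translate(_TO_PERSIAN)
```

With this map, `romanize` turns the four letters KEHEH, TEH, ALEF, BEH
into `ktab`, and `deromanize("ktab")` gives the original word back. The
round trip is only lossless when the map is one-to-one and the stored
text contains no Latin letters of its own. This sketch also uses
single-character codes only, so a map with multi-letter codes would need
a different decoder.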

Ehsan


Re: MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

2005-11-27 Thread Alec . Cawley
AmirBehzad Eslami [EMAIL PROTECTED] wrote on 24/11/2005 17:48:29:

 Dear list,
 
   I'm considering programming a simple Search Engine for a website,
   to find Arabic/Persian data within a MySQL database.
   This database contains a huge amount of data, encoded with Unicode (UTF-8).
 
 
   The big deal is to ** reduce the response time ** to end-users.
 
   My first solution is to create an Index and use the FULL-TEXT
   Searching method.
 
   Luckily, MySQL provides FULL-TEXT Indexing support in MyISAM tables.
   But unfortunately, it doesn't support multi-byte charsets (e.g.
   Unicode). [1]
   Technically, MySQL creates Indexes over words.
   A "word" is any sequence of characters consisting of letters and
   numbers [2].
 
   Assuming this, I tried to save the records as Unicode Character
   References (&#NNNN;), but the search failed again :-(
 
   Any suggestion?
   I appreciate any solution to solve this problem.
 
   Thanks in Advance,
   Behzad
 
 
   [1] MySQL Manual - 6.8.3 Full-text Search TODO
   [2] MySQL Manual - 6.8 MySQL Full-text Search
 
 
   P.S.

*** 
   I use MySQL 4.0
***

I think this is your problem: MySQL does not properly support Unicode 
until version 4.1. I am successfully using FullText with MySQL 4.1 to sort 
UTF-8 encoded Japanese text. I see no reason why it should not work for 
Arabic - if you upgrade.
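
On the character-reference attempt quoted above: the thread does not say
why that search failed, but one plausible reason can be sketched under
the quoted definition of a word as a run of letters and numbers. The
code below is an illustration only. `to_char_refs` and `split_words` are
hypothetical helpers, and `split_words` is a simplified stand-in, not
MySQL's actual parser, which also applies rules such as a minimum word
length and a stopword list.

```python
import re


def to_char_refs(text):
    """Encode each non-ASCII character as a decimal reference like &#1705;."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


def split_words(text):
    """Simplified word splitter: every run of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)
```

For the four-letter word KEHEH, TEH, ALEF, BEH, `to_char_refs` yields
`&#1705;&#1578;&#1575;&#1576;`, and `split_words` breaks that into four
separate "words": `1705`, `1578`, `1575` and `1576`. The `&`, `#` and
`;` characters act as separators, so under this assumption the index
would hold code points of single letters rather than whole words.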

Alec

