Re: [fw-general] Two questions about zend_search_lucene

Alexander Veremyev Thu, 19 Jul 2007 13:42:12 -0700

Hi,

time2k wrote:

Q1: Does the Lucene "Query Language" support the UTF-8 character set? I tried to input
some Chinese words, but it returned a Zend_Search_Lucene_Search_Query_Empty
object. For now I use MultiTerm instead, but I think using the "Query Language"
would be better :)

Zend_Search_Lucene works with UTF-8 internally. The engine itself is completely UTF-8 compatible. But it has some limitations in text analysis (analyzers are used to tokenize text during indexing and query parsing).

More detailed information can be found in the "Zend_Search_Lucene. Character set." documentation chapter (http://framework.zend.com/manual/en/zend.search.lucene.charset.html) and in the "Zend_Search_Lucene. Best practice. Encoding" chapter of the current ZF SVN version (take the latest nightly snapshot to get it).

It is also planned to make some improvements to the Utf8 analyzer in the future (behavior and performance). It will be PCRE-based and will depend on PCRE Unicode support.

But I am not sure whether PCRE recognizes Chinese characters as "letters" (the \pL pattern). Would you like to be the first tester? :)
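
A quick way to try this is sketched below. It assumes a PHP build whose PCRE library was compiled with UTF-8 and Unicode property support; the helper name isUnicodeLetter() and the sample string are made up for illustration and are not part of Zend_Search_Lucene. The second pattern only approximates what a PCRE-based tokenizer might do; it is not the analyzer's actual implementation.

```php
<?php
// Hypothetical helper (not a Zend Framework function): reports whether
// a single UTF-8 character is classified by PCRE as a Unicode letter.
function isUnicodeLetter($char)
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8;
    // \p{L} matches any character in the Unicode "Letter" category.
    return preg_match('/^\p{L}$/u', $char) === 1;
}

// Illustrative sample mixing Chinese text, Latin text and digits.
$text = '中文 search 123';

// Collect runs of letters and digits, roughly the way a PCRE-based
// tokenizer could split text into terms (an assumption, see above).
preg_match_all('/[\p{L}\p{N}]+/u', $text, $matches);

var_dump(isUnicodeLetter('中'));
print_r($matches[0]);
```

If the first call reports false, or preg_match() emits a warning about UTF-8 support, the PCRE build lacks the Unicode support that such an analyzer would depend on.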

Q2: zend_search_lucene supports range queries, but I can't use them because of
Q1... Is there an API class or function like Java Lucene's RangeQuery()?


Yes.

See the "Zend_Search_Lucene. Query Construction API. Range Query." documentation section (SVN version).

An example:
------------------------
$fromTerm = new Zend_Search_Lucene_Index_Term('20020101', 'mod_date');
$toTerm   = new Zend_Search_Lucene_Index_Term('20030101', 'mod_date');

$query = new Zend_Search_Lucene_Search_Query_Range($fromTerm, $toTerm, true /* inclusive */);

$hits  = $index->find($query);
------------------


With best regards,
   Alexander Veremyev.
