Hi,

time2k wrote:
Q1:Is Lucene "Query Language" supports UTF-8 character set? I try to input
some chinese words but It returned Zend_Search_Lucene_Search_Query_Empty
Object. For instead now I use MultiTerm,but I think using "Query Language"
is better than it:)

Zend_Search_Lucene works with UTF-8 internally, and the engine itself is completely UTF-8 compatible. But it has some limitations in text analysis (analyzers are used to tokenize text during indexing and query parsing).

More detailed information can be found in the "Zend_Search_Lucene. Character set." documentation chapter (http://framework.zend.com/manual/en/zend.search.lucene.charset.html) and in the "Zend_Search_Lucene. Best practice. Encoding" chapter of the current ZF SVN version (take the latest nightly snapshot to get it).


Some improvements to the Utf8 analyzer (behavior and performance) are also planned for a future release. It will be PCRE-based and will depend on PCRE Unicode support.

But I am not sure whether PCRE recognizes Chinese characters as "letters" (the \pL pattern). Would you like to be the first tester? :)
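
A quick way to check this is a small standalone script. The sketch below is only an illustration and not the analyzer code: splitLetters() is a hypothetical helper, and it assumes that PHP's PCRE was built with Unicode property support (the /u modifier and \p{L}).

```php
<?php
// Sketch: check whether PCRE's \p{L} ("any letter") class matches Chinese text.
// The /u modifier makes PCRE treat both the pattern and the subject as UTF-8.

// Hypothetical helper, not part of Zend_Search_Lucene.
function splitLetters($text)
{
    // Collect maximal runs of Unicode letters; everything else is a separator.
    if (preg_match_all('/\p{L}+/u', $text, $matches) === false) {
        return array(); // PCRE error, e.g. invalid UTF-8 input
    }
    return $matches[0];
}

// The input mixes Chinese text, Latin text, punctuation and a digit.
print_r(splitLetters("搜索 引擎, Lucene 2"));
```

With Unicode-enabled PCRE this should print the three tokens "搜索", "引擎" and "Lucene": Chinese characters fall under the "other letter" category, which \p{L} includes, while the comma, the spaces and the digit are skipped. Note that a run of Chinese characters without spaces comes out as a single token, so this tests letter recognition only, not word segmentation.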


Q2:zend_search_lucene supports range query, but i can't use it because
Q1...Is some API class or functions such as Java Lucene 's RangeQuery()?

Yes. See the "Zend_Search_Lucene. Query Construction API. Range Query." documentation section (SVN version).
An example:
------------------------
$fromTerm = new Zend_Search_Lucene_Index_Term('20020101', 'mod_date');
$toTerm   = new Zend_Search_Lucene_Index_Term('20030101', 'mod_date');
$query = new Zend_Search_Lucene_Search_Query_Range($fromTerm, $toTerm, true /* inclusive */);
$hits  = $index->find($query);
------------------


With best regards,
   Alexander Veremyev.
