Hi Kris,
The problem is the large result set. Score calculation takes a
significant amount of time.
Two hints:
1. Don't retrieve any stored field for results you don't need.
Retrieving any stored field loads the complete document, which is a
time-expensive operation.
2. Result set limiting is now included in Zend_Search_Lucene.
"Zend_Search_Lucene::setResultSetLimit($limit)" limits result set size.
It doesn't return the "best N" results, but simply the "first N".
Nevertheless, it may be useful for limiting search time.
It's available via SVN and nightly snapshots -
http://framework.zend.com/download/snapshot/ and is going to be included
in ZF v1.1.0.
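Both hints can be combined roughly as follows. This is only a sketch: the helper name findHitIds() and the index path in the usage comment are made up for illustration, and it assumes a Zend Framework build that already has setResultSetLimit().

```php
<?php
// Sketch only. Assumes Zend Framework 1.x is already loaded, e.g. via
// require_once 'Zend/Search/Lucene.php';

/**
 * Hypothetical helper: run a query with a capped result set and collect
 * only hit ids and scores, so that no stored document gets loaded.
 */
function findHitIds($index, $query, $limit)
{
    // Caps the result set at the first $limit matches (not the best $limit).
    Zend_Search_Lucene::setResultSetLimit($limit);

    $scoresById = array();
    foreach ($index->find($query) as $hit) {
        // id and score are plain properties of the hit object. Reading any
        // stored field (for example $hit->title) would load the document.
        $scoresById[$hit->id] = $hit->score;
    }
    return $scoresById;
}

// Usage, with a hypothetical index path:
// $index = Zend_Search_Lucene::open('/path/to/index');
// $ids   = findHitIds($index, 'contents:hey contents:good contents:looking', 1000);
```

Stored fields can then be fetched later, and only for the few hits that are actually displayed.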
PS: The default search field may be set with
Zend_Search_Lucene::setDefaultSearchField($fieldName);
null means that the search is performed across all fields by default.
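For example, to search only one field for a single query and then go back to the default behaviour, something like the following should work. The helper name searchInField() is made up for illustration, and the field name "contents" in the usage comment is taken from your test.

```php
<?php
// Sketch only. Assumes Zend Framework 1.x is already loaded, e.g. via
// require_once 'Zend/Search/Lucene.php';

/**
 * Hypothetical helper: search unqualified terms in one field only,
 * then restore the "all fields" default.
 */
function searchInField($index, $fieldName, $query)
{
    // Terms without an explicit "field:" prefix now go to $fieldName.
    Zend_Search_Lucene::setDefaultSearchField($fieldName);

    $hits = $index->find($query);

    // null switches back to searching through all fields.
    Zend_Search_Lucene::setDefaultSearchField(null);

    return $hits;
}

// Usage, with a hypothetical index path:
// $index = Zend_Search_Lucene::open('/path/to/index');
// $hits  = searchInField($index, 'contents', 'hey good looking');
```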
With best regards,
Alexander Veremyev.
Kris Jurka wrote:
[Different results/performance with PHP vs Java Lucene]
Some more testing has revealed that PHP was doing a search over all
index parts while Java was just using the contents. Additionally, the Java
version considered "there" a stop word, so now comparing apples to apples:
$ ./searchtest.php 'contents:hey contents:good contents:looking'
Search in: 4.2395761013
Hits: 38494
$ java -classpath lucene-core-2.2.0.jar:. SearchTest 'contents:hey contents:good contents:looking'
Search in: 0.29
Hits: 38494
Kris Jurka