Hi Garth,
Garth Gillespie wrote:
> Hi all,
>
> I'm trying Zend Search with about 40K records, and retrieval time for
> large result sets is very slow. Sample result set query times:
>
>   542 results  - 14 seconds
>   2174 results - 90 seconds
Search and result retrieval time depend on many factors.
> I have tried both optimized and unoptimized Lucene segments and the
> search time remains the same. The index path is on /dev/shm, so load
> time should be quick. PHP is allowed to use plenty of memory (48M).
> The largest segment is about 5MB.
Zend_Search_Lucene keeps only the term dictionary index in memory
(usually every 128th term of the dictionary), which doesn't take much
memory. It also uses a document-matching map: 40K bits / 8 = ~5KB in
your case if the 'bitset' extension is enabled, or an array of 2174
integers if it is not. So that doesn't take much memory either.
> The records I am adding to the index are all simple text, no HTML markup.
>
> What is the overhead of always having to load the index, as opposed to
> having an in-memory process? Is this the slowdown I am experiencing?
Zend_Search_Lucene object creation time doesn't depend on the query; it
only loads some index structure information.
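For example, opening an index per request only reads segment metadata;
no postings are loaded until a query actually runs (the path and field
name below are illustrative, not taken from your setup):

```php
<?php
require_once 'Zend/Search/Lucene.php';

// Opening an existing index only loads segment and dictionary metadata.
$index = Zend_Search_Lucene::open('/dev/shm/my_index');

// Postings are read lazily, when the query is executed.
$hits = $index->find('contents:php');
echo count($hits) . " hits\n";
```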
> Should I be experimenting with segment size? Indexing takes over an
> hour for the 40K records.
An optimized index (one index segment) always gives the best search performance.
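If you haven't already, you can merge all segments into one with
optimize(). A sketch (the path is illustrative):

```php
<?php
require_once 'Zend/Search/Lucene.php';

$index = Zend_Search_Lucene::open('/dev/shm/my_index');

// Merge all segments into a single segment. This is expensive, so run
// it once after bulk indexing, not on every request.
$index->optimize();
```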
Please describe which fields you have in the index (or give an example
of your indexing script). Which query do you use? Which script do you
use to process the search results?
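For reference, a minimal indexing-plus-search script looks roughly like
the sketch below; the path, field names, and inline data are made up,
so adjust them to your schema:

```php
<?php
require_once 'Zend/Search/Lucene.php';

// Create a new index (use Zend_Search_Lucene::open() to add to an
// existing one).
$index = Zend_Search_Lucene::create('/dev/shm/my_index');

// $records would normally come from your database; inlined here only
// for illustration.
$records = array(
    array('id' => 1, 'body' => 'plain text record one'),
    array('id' => 2, 'body' => 'plain text record two'),
);

foreach ($records as $record) {
    $doc = new Zend_Search_Lucene_Document();
    // Keyword: stored and indexed as-is, handy for linking back to a DB row.
    $doc->addField(Zend_Search_Lucene_Field::Keyword('id', $record['id']));
    // Text: tokenized full-text field.
    $doc->addField(Zend_Search_Lucene_Field::Text('body', $record['body']));
    $index->addDocument($doc);
}
$index->commit();

// Iterating over hits lazily loads each matching document.
$hits = $index->find('body:record');
foreach ($hits as $hit) {
    echo $hit->id . ' (score ' . $hit->score . ")\n";
}
```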
With best regards,
Alexander Veremyev.