Hello,
I've been having similar problems with ZSL. I haven't found a quick fix, but going back to the index itself and understanding what is in it has proved useful. Depending on what kinds of documents you're indexing and what kinds of searches you're running, you may find that the index contents are part of the problem.

Without knowing the fields in your index, including the types of the fields, and the types of words in your documents (or even what kinds of documents you have), it's very hard to give any specific advice.

The memory usage seems to come from the size of the word list itself, and bear in mind that the word list will easily run to thousands of unique words. The default analyser splits things like "I'm" and "There's" at the apostrophe, so the index ends up holding 's' as a word in its own right. Things like this easily expand the size of the index and massively increase the memory overhead. Additionally, if you can keep the input query small, with a few distinctive search terms rather than many common ones, this will help keep memory usage low.
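To see the apostrophe effect for yourself, here is a minimal sketch in plain PHP. It is not ZSL code: tokenise() is a hypothetical stand-in that I'm assuming behaves roughly like a letters-only, case-insensitive analyser, so treat it as an illustration rather than a description of ZSL's internals.

```php
<?php
// Hypothetical stand-in for a letters-only, case-insensitive tokeniser.
// It only mimics splitting on anything that is not a letter.
function tokenise(string $text): array
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    return $matches[0];
}

$tokens = tokenise("I'm sure there's more");

// "I'm" becomes "i" and "m"; "there's" becomes "there" and "s".
print_r($tokens);

// Each distinct token is one more entry in the word list.
print_r(array_count_values($tokens));
```

Running something like this over a sample of your own documents should give a feel for how many of these fragment terms end up in the word list.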

To really reduce memory, and if you have the resources and time to do so, it may be worth writing a custom analyser geared towards the kinds of words you have in your database. Mine, for example, has to cope with multiple variant spellings of words on input: my index holds written works where the original author uses unusual constructions (e.g. turning s into sh to show that the character is drunk). By doing some analysis on that, I've been able to trim the index down and keep its memory usage down.
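As a rough idea of what I mean by handling variant spellings, here is a sketch in plain PHP. The VARIANTS table and normaliseTerm() are made up for illustration; a real table would come from analysing your own documents, and in ZSL the logic would have to be wired into whatever analyser you use rather than run standalone like this.

```php
<?php
// Hypothetical variant map: "drunk" spellings mapped to their normal form.
const VARIANTS = [
    'shir'    => 'sir',
    'yesh'    => 'yes',
    'pleashe' => 'please',
];

// Lower-case the term and fold known variants into one canonical spelling.
function normaliseTerm(string $term): string
{
    $term = strtolower($term);
    return VARIANTS[$term] ?? $term;
}

$raw   = ['Yesh', 'shir', 'yes', 'sir', 'pleashe'];
$terms = array_values(array_unique(array_map('normaliseTerm', $raw)));

// Five input words collapse to three index terms.
print_r($terms);
```

The same normalisation has to be applied to the search query as well, otherwise the variants will no longer match anything.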

Another way is to avoid Keyword fields where possible and switch them to some tokenised form, assuming the data can be suitably indexed that way and doesn't need to be kept as a Keyword.
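My reasoning here, which you should treat as an assumption until you have measured it on your own index, is that an untokenised field stores each whole value as a single term, so every distinct value adds to the word list, whereas tokenised values that share words share terms. A small plain-PHP illustration with made-up data (none of this is ZSL code):

```php
<?php
// Made-up field values built from a small shared vocabulary.
$colours = ['red', 'green', 'yellow'];
$fruits  = ['apple', 'cherry', 'plum'];
$dishes  = ['pie', 'tart', 'jam'];

$values = [];
foreach ($colours as $colour) {
    foreach ($fruits as $fruit) {
        foreach ($dishes as $dish) {
            $values[] = "$colour $fruit $dish";
        }
    }
}

// Untokenised: every distinct whole value is its own term.
$wholeValueTerms = array_unique($values);

// Tokenised on spaces: only the distinct words become terms.
$tokenisedTerms = array_unique(explode(' ', implode(' ', $values)));

printf(
    "untokenised: %d terms, tokenised: %d terms\n",
    count($wholeValueTerms),
    count($tokenisedTerms)
);
```

The saving only appears when values really do share words; for fields holding unique identifiers, tokenising is unlikely to help.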

Stripping out really common words might also help, but that really only works well if you're not relying on the indexed text to display the results.
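For what it's worth, the idea of stripping common words looks like this in plain PHP. STOP_WORDS and dropStopWords() are hypothetical names and the word list is only an example; you would want to build yours from the most frequent terms in your own index.

```php
<?php
// Example stop-word list; a real one should be based on your own data.
const STOP_WORDS = ['the', 'a', 'of', 'and', 'is'];

// Remove stop words from a list of already lower-cased tokens.
function dropStopWords(array $tokens): array
{
    return array_values(array_diff($tokens, STOP_WORDS));
}

$tokens = ['the', 'history', 'of', 'the', 'index'];

// Only the words worth searching for are left.
print_r(dropStopWords($tokens));
```

As with any change to the analysis, the same filtering needs to happen on the query side, and the index has to be rebuilt for it to take effect.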

If you are able to provide a few more details about your index, I might be able to give you a few better pointers.

Regards
Pete



Alex wrote:
Hi,

I'm having serious memory problems with ZSL. My current index holds around 400 thousand documents.

If I run a search for a term with about 500 results, ZSL returns the best results first but uses a tremendous amount of memory.

If I use $index->setResultSetLimit(), I decrease memory usage significantly, but I get some very poor first results.

Thanks for your help!


- Alex
