Re: OUTOFMEMORY ERROR

Erik Hatcher Thu, 07 Jul 2005 12:50:34 -0700


On Jul 7, 2005, at 1:12 PM, MariLuz Elola wrote:

Hi Erik, excuse me for all my questions. Thank you very much for your speedy answers, and sorry for my bad English.
I am Spanish and I don't speak English very well.
I have one more question.
In the end I am using IndexReader to return all the documents:
    Directory directory = FSDirectory.getDirectory(path, false);
    IndexReader reader = IndexReader.open(directory);
    for (int start = base; start < end; start++) {
        Document doc = reader.document(start);
        String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
        ides.add(id);
    }
It works fine and it is fast. The only problem is that it is impossible to sort the results by some metadata (for example, to get all the documents ordered by title).

If you truly need to have a Query that can find all documents, then add a special field to each document with a fixed value such as doc:yes and then do a TermQuery for doc:yes. You could then leverage Lucene's sorting capability.

My question is about the maxClauseCount parameter. I think the same as you: it is not a good idea to bump up the limit... If I use the default value (1024) and I search, I get this error: [SearchCollection,executeQuery] caught a class org.apache.lucene.search.BooleanQuery$TooManyClauses
with message: null
Is there any way to search all the documents (210,000 documents) so that it internally works with only 1024 clauses, returns documents up to that limit, and does not throw the TooManyClauses error? I need to work efficiently with collections of more than 250,000 records, and the users normally run complex queries (e.g. DATE:[20050601 to 20050701] AND TITLE:Lucene* ... etc.)

The issue is that PrefixQuery, WildcardQuery, RangeQuery, and FuzzyQuery all expand to the terms that match in a BooleanQuery OR fashion. You need to identify what terms those are and address them individually. I can't offer specific advice since I don't know what fields you're using and what values they may contain. But one example is with dates. If you index dates and do it at the millisecond granularity but you really only need to query by YEAR then there is a great chance one of those query types will expand to TooManyClauses. If, instead, you indexed dates by YYYY when all you need is year granularity then you have far fewer terms. I hope this makes sense and helps.
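
To make the granularity point concrete, here is a rough sketch in plain Java (no Lucene calls) that counts how many distinct terms the same set of timestamps would produce at different date patterns. The class name DateGranularity, the helper names toTerm and distinctTerms, and the sample data (5000 made-up documents, one per hour from 2005-01-01 UTC) are all assumptions for illustration, not anything from your application. The number of distinct terms is roughly what a range query over the whole field would have to expand into.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Set;
import java.util.TimeZone;
import java.util.TreeSet;

public class DateGranularity {

    // Formats a timestamp as a sortable string term using the given
    // pattern, e.g. "yyyy" for year or "yyyyMMdd" for day granularity.
    static String toTerm(long millis, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }

    // Counts the distinct terms the timestamps produce at that pattern.
    static int distinctTerms(long[] timestamps, String pattern) {
        Set<String> terms = new TreeSet<String>();
        for (long t : timestamps) {
            terms.add(toTerm(t, pattern));
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // Hypothetical data: 5000 documents, one per hour,
        // starting at 2005-01-01T00:00:00Z.
        long start = 1104537600000L;
        long[] stamps = new long[5000];
        for (int i = 0; i < stamps.length; i++) {
            stamps[i] = start + i * 3600000L;
        }

        String[] patterns = { "yyyyMMddHHmmssSSS", "yyyyMMdd", "yyyy" };
        for (String p : patterns) {
            System.out.println(p + " -> " + distinctTerms(stamps, p)
                    + " distinct terms");
        }
    }
}
```

With this sample data the millisecond pattern yields one term per document, which is already well past the default limit of 1024, while the day and year patterns stay far below it. The same reasoning applies to any field hit by a prefix, wildcard, range, or fuzzy query: the fewer distinct terms the field holds, the fewer clauses the query expands into.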


    Erik
