Sorry. I wasn't verbose enough. I use the default memory settings. But my issue was the core structure of Lucene taking up (it seems to me) more memory than it would have to, if it had a different approach. Correct me if I'm wrong, but it seems to me that BooleanQuery stores all hits (as Bucket objects) from all terms in the query even if it is a simple war* AND wash* AND sad*. Instead of looking for wash* just in the war* hits (and then looking for sad* in the remaining hits) it makes three separate searches, which would be a waste of memory.
----- test output begins ----- Index size = 55000 Query: a* Total memory before: 2031616 Searching for: a* (org.apache.lucene.search.PrefixQuery) Total memory after: 55128064 53527 total matching documents (1984ms) Query: e* Total memory before: 55128064 Searching for: e* (org.apache.lucene.search.PrefixQuery) Total memory after: 55128064 52456 total matching documents (984ms) Query: a* AND e* Total memory before: 55128064 Searching for: +a* +e* (org.apache.lucene.search.BooleanQuery) Total memory after: 124882944 51267 total matching documents (2468ms) ----- test output ends ----- In my perfect world the memory allocation, when searching for a* AND e*, should not increase at all after the both separate searches a* and e*, cause it would just allocate space for a*-hits, and ignoring e*-hits that has no previous hit. My biggest index lies at 2,34 million documents during testing, but should grow with approximately 10000 docs/day in production. With that figure I wish for the best possible memory handling. At the moment we use a search engine that, given the right question (or wrong), consumes memory like a starving wolf and crashes the whole thing. The search engine should be able to play with about 1GB RAM on the machine. I just don't want the same possibilities of a crash with Lucene too. I want to know if the Lucene developers feel that there are things to optimize or if they have done everything like it should be from the start ? thanks /RW > -----Ursprungligt meddelande----- > Från: Otis Gospodnetic [mailto:[EMAIL PROTECTED] > Skickat: den 19 mars 2003 16:19 > Till: [EMAIL PROTECTED] > Ämne: Re: OutOfMemoryError with boolean queries > > > Robert, > > I'm moving this to lucene-user, which is a more appropriate list for > this type of a problem. > You are not saying whether you are using some of those handy -X (-Xms > -Xmx) command line switches when you invoke your application that dies > with OutOfMemoryError. > If you are not, try that, it may help. I recall a few other people > reporting the same problem and using -Xms and -Xmx solved their > problem. > > If your machine doesn't have the RAM it needs this won't help, of > course :) > > Otis > > > --- Robert_Wennström <[EMAIL PROTECTED]> wrote: > > Hi, > > > > I'm experiencing OutOfMemoryErrors when searching using many logical > > ANDs > > combined with prefix queries. > > The reason is clearly too many returned hits waiting for combined > > evaluation. > > > > I was wondering if there are any thoughts of changing the search > > approach to > > something less memory consuming. > > This is quite a big problem even with small sets of documents. > > > > Example: > > my 55000 document index runs out of memory when searching > for a* AND > > e* > > > > Could you estimate the difficulty to change the behaviour to search > > for e* in > > just the hits matching a* ? > > I'm about to put a BitSet at the innermost hit collection > to sort out > > AND-clauses that hasn't been matched by previous AND-clauses. > > Is there a better approach ? > > > > > > Thanks for a great java project guys > > > > ____________________________________________ > > Robert Wennström [developer, netadmin] > > robert -at- agent25.com > > www.agent25.com > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop! > http://platinum.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
