By passing Integer.MAX_VALUE you are asking Lucene to allocate a priority queue of that size for collecting results, and that allocation is what causes the OOM. With TopDocs, the idea is to fetch only a limited number of top-ranking documents to display as search results. The user is not interested in the two-millionth result page, so pass a small number of top hits.
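As a rough illustration of the difference (this is not Lucene's code; the class and method names below are made up for the example), a top-N collector has to keep a heap of up to N entries, while a pure hit counter only needs a single int. Note that java.util.PriorityQueue grows lazily, whereas Lucene sizes its queue up front as described above, so this sketch shows the idea rather than reproducing the OOM.

```java
import java.util.PriorityQueue;

// Hypothetical sketch, not Lucene's implementation.
public class HitCountSketch {

    // Top-N collection: keep at most n scores in a min-heap,
    // evicting the current worst whenever the heap overflows.
    static float[] topN(float[] scores, int n) {
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float score : scores) {
            heap.offer(score);
            if (heap.size() > n) {
                heap.poll(); // drop the lowest score seen so far
            }
        }
        // Drain the min-heap from the back to get descending order.
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }

    // Counting: one increment per hit, no per-hit storage at all.
    static int countHits(float[] scores) {
        int total = 0;
        for (float ignored : scores) {
            total++;
        }
        return total;
    }

    public static void main(String[] args) {
        float[] scores = {0.3f, 0.9f, 0.1f, 0.7f, 0.5f};
        System.out.println(topN(scores, 2).length); // prints 2
        System.out.println(countHits(scores));      // prints 5
    }
}
```

Memory for the first method scales with n, so a huge n buys nothing when only the total is wanted; the second method stays constant no matter how many documents match.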
To simply count all hits, as you seem to be doing, there is a separate collector available: http://goo.gl/XsPVR

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Benson Margulies [mailto:[email protected]]
> Sent: Sunday, February 19, 2012 3:22 PM
> To: [email protected]
> Subject: Counting all the hits with parallel searching
>
> If I have a lot of segments, and an executor service in my searcher, the
> following runs out of memory instantly, building giant heaps. Is there another
> way to express this? Should I file a JIRA that the parallel code should have
> some graceful behavior?
>
> int longestMentionFreq = searcher.search(longestMentionQuery, filter,
>     Integer.MAX_VALUE).totalHits + 1;
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
