Hi,

> ...I get around 3 million hits. Each of the hits is processed and
> information from a certain field is used.

That's of course fine, but:

> After certain number of hits, somewhere around 1 million (not always the
> same number) I get OutOfMemory exception that looks like this:

You did not tell us *how* you get the hits. If you do something like Searcher.search(query, 1000000), then it can easily run out of memory (sooner or later, maybe while decompressing results, maybe somewhere else).

Lucene always collects the "top-ranking" results, and to do that it uses a priority queue. With the above call (passing 1 million or more as the number of top-ranking results), this will use insane amounts of memory. Like most full-text search engines, Lucene is optimized for quickly getting the best results. Fetching *all* possible hits is not really the correct use case for a full-text search engine (especially as hits that far down the list are in most cases no longer relevant to your query).

To really collect all hits (but in arbitrary order, not sorted by relevance), write your own Collector implementation that collects the results and pass it to the searcher. There are several code samples on this mailing list.

Another approach is to use the new "searchAfter" method, available in the next Lucene version (not yet released).

Uwe
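
To illustrate the difference between the two collection styles, here is a minimal sketch in plain Java. It does not use the Lucene API: `Hit`, `CollectDemo`, `topN` and `forEachMatch` are hypothetical stand-ins invented for this example. The first method mimics top-n collection with a priority queue, whose memory grows with n; the second mimics a custom collector that handles each matching document as it arrives and retains nothing. In real code you would subclass Lucene's Collector instead, and since its method signatures differ between Lucene versions, check the Javadoc for your release.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntConsumer;

/** Hypothetical stand-in for a search hit: a document id and its score. */
final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

final class CollectDemo {

    /** Top-n collection: a min-heap holds up to n hits, so memory grows with n. */
    static List<Hit> topN(Iterable<Hit> matches, int n) {
        PriorityQueue<Hit> heap =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (Hit h : matches) {
            heap.offer(h);
            if (heap.size() > n) {
                heap.poll(); // evict the lowest-scoring hit kept so far
            }
        }
        List<Hit> best = new ArrayList<>(heap);
        best.sort((a, b) -> Float.compare(b.score, a.score)); // best first
        return best;
    }

    /** Streaming collection: pass each doc id to a callback, keep nothing. */
    static int forEachMatch(Iterable<Hit> matches, IntConsumer perDoc) {
        int count = 0;
        for (Hit h : matches) {
            perDoc.accept(h.doc); // process the hit right away, arbitrary order
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Ten fake matches with distinct scores.
        List<Hit> matches = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            matches.add(new Hit(i, (i * 7 % 10) / 10f));
        }

        for (Hit h : topN(matches, 3)) {
            System.out.println("top hit: doc=" + h.doc + " score=" + h.score);
        }

        int seen = forEachMatch(matches, doc -> { /* read the field for doc here */ });
        System.out.println("streamed " + seen + " hits without buffering them");
    }
}
```

With n set to 1 million or more, the first style has to keep that many entries alive at once, while the second only ever touches one hit at a time, at the price of losing the relevance ordering.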