Date: 2005-01-02T05:36:37
Editor: DanielNaber
Wiki: Jakarta Lucene Wiki
Page: LuceneFAQ
URL: http://wiki.apache.org/jakarta-lucene/LuceneFAQ
no comment

Change Log:

------------------------------------------------------------------------------
@@ -55,7 +55,19 @@
 
 No, check out the [http://java-source.net/open-source/crawlers list of Open Source Crawlers in Java].
 
-
+
+==== Why am I getting an IOException that says "Too many open files"? ====
+
+The number of files that can be open simultaneously is limited by your operating system. Lucene can trigger this problem because it may open quite a few files depending on how you use it, but the cause might also lie elsewhere.
+
+ * Always make sure that you ''explicitly'' close all file handles you open, especially in case of errors. Use a try/catch/finally block: open the files in the try block and close them in the finally block. Remember that Java doesn't have destructors, so don't close file handles in a finalize method -- this method is not guaranteed to be executed.
+ * Use the compound file format (it's activated by default starting with Lucene 1.4) by calling [http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#setUseCompoundFile(boolean) IndexWriter's setUseCompoundFile(true)].
+ * Don't set [http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#mergeFactor IndexWriter's mergeFactor] to large values. Large values speed up indexing but increase the number of files that need to be open simultaneously.
+ * If the exception occurs during searching, optimize your index by calling [http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#optimize() IndexWriter's optimize()] method after indexing is finished.
+ * Try to increase the number of files that can be open simultaneously. On Linux using bash this can be done by calling `ulimit -n <number>`.
+
+
 
 === Searching ===
 
@@ -95,7 +107,7 @@
 
 ==== How do I restrict searches to only return results from a limited subset of documents in the index (e.g. for privacy reasons)? What is the best way to approach this? ====
 
-The QueryFilter http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/QueryFilter.html class is designed precisely for such cases.
+The [http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/QueryFilter.html QueryFilter class] is designed precisely for such cases.
 
 Another way of doing it is the following:
 
@@ -129,7 +141,7 @@
 
 ==== Does MultiSearcher do anything particularly efficient to search multiple indices or does it simply search one after the other? ====
 
-`MultiSearcher` searches indices sequentially. Use ParallelMultiSearcher as a searcher that performs multiple searches in parallel.
+`MultiSearcher` searches indices sequentially. Use [http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/ParallelMultiSearcher.html ParallelMultiSearcher] as a searcher that performs multiple searches in parallel.
 
 ==== Is there a way to use a proximity operator (like near or within) with Lucene? ====
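
The "Too many open files" entry in the change above advises opening file handles in a try block and closing them in a finally block. A minimal sketch of that pattern follows, using plain `java.io` streams rather than Lucene classes; the class name `CloseInFinally` and the helper `countBytes` are hypothetical and exist only for illustration. The same shape should apply to any object that holds file handles and exposes a `close()` method.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CloseInFinally {

    // Hypothetical helper: counts the bytes in a file and always
    // releases the file handle, even if reading throws.
    public static long countBytes(File file) throws IOException {
        InputStream in = null;
        try {
            in = new FileInputStream(file); // handle is opened inside try
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } finally {
            // Runs on normal return and on exceptions alike.
            if (in != null) {
                in.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("handles", ".bin");
        tmp.deleteOnExit();

        OutputStream out = null;
        try {
            out = new FileOutputStream(tmp);
            out.write(new byte[] {1, 2, 3});
        } finally {
            if (out != null) {
                out.close();
            }
        }

        System.out.println("bytes read: " + countBytes(tmp));
    }
}
```

Closing in `finally` rather than in `finalize()` matters here because the garbage collector gives no guarantee about when, or whether, a finalizer runs, so handles could stay open long enough to hit the limit.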