Search speed
I'm a relatively new user of Lucene and am looking for tips on speeding up searches. I've created a single index with 4.5 million documents. The index has about 22 fields, and one of those fields is the contents of the body tag, which can range from 5K to 35K. When I create the field (named contents) that houses the contents of the body tag, the field is stored, indexed, and tokenized. The term position vectors are not stored.

Single-word searches return pretty fast, but phrase searches are considerably slower. I construct the query with QueryParser, where analyzer is the StandardAnalyzer:

    Query objQuery = QueryParser.parse(sSearchString, contents, analyzer);

For example, the query contents:Zanesville returns over 163,000 hits in 78 milliseconds. However, the query contents:all parts including picture tube guaranteed returns hits in 2890 milliseconds. Other phrases take longer as well.

My question is: are there any indexing tips (storing term vectors?) or query tips that I can use to speed up the searching of phrases? Thanks in advance for any tips.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
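To make the gap between the two timings above easier to reason about, here is a toy positional index in plain Java. It is only a sketch, not Lucene's implementation: the class ToyPositionalIndex and its methods are hypothetical names made up for illustration. It shows that a single-term lookup only needs the list of matching documents, while a phrase lookup also has to check the positions of every term in every candidate document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only (hypothetical class, not part of Lucene). */
final class ToyPositionalIndex {
    // term -> (docId -> positions of the term in that document, ascending)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    /** Tokenizes on whitespace and records the position of every token. */
    void add(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    /** Single-term search: one map lookup, positions are never read. */
    Set<Integer> termQuery(String term) {
        return new TreeSet<>(
                postings.getOrDefault(term.toLowerCase(), Collections.emptyMap()).keySet());
    }

    /** Phrase search: every occurrence of the first term is a candidate start. */
    Set<Integer> phraseQuery(String phrase) {
        String[] terms = phrase.toLowerCase().split("\\s+");
        Set<Integer> hits = new TreeSet<>();
        Map<Integer, List<Integer>> first =
                postings.getOrDefault(terms[0], Collections.emptyMap());
        for (Map.Entry<Integer, List<Integer>> entry : first.entrySet()) {
            int docId = entry.getKey();
            for (int start : entry.getValue()) {
                if (matchesAt(docId, start, terms)) {
                    hits.add(docId);
                    break; // one match per document is enough
                }
            }
        }
        return hits;
    }

    /** True if terms[1..] occur at consecutive positions after start in docId. */
    private boolean matchesAt(int docId, int start, String[] terms) {
        for (int i = 1; i < terms.length; i++) {
            List<Integer> positions =
                    postings.getOrDefault(terms[i], Collections.emptyMap()).get(docId);
            if (positions == null || !positions.contains(start + i)) {
                return false;
            }
        }
        return true;
    }
}
```

In this model a phrase that starts with a very common word such as "all" produces a large number of candidate start positions to verify, which is one plausible reason a phrase can cost much more than a single rare term.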
MultiSearcher object question
I've just indexed over 600,000 documents (index size = 12GB) and have a simple servlet to search the index. I am using the MultiSearcher object in the servlet to test searching, since I will add more indexes in the future. I have noticed that instantiating my MultiSearcher object takes about 5 seconds. As a solution, I create the MultiSearcher object once and store it in the Application context, then reuse it on subsequent requests. My question is: is this a recommended practice? If I have 1000 users searching concurrently, will this approach cause problems? What do others do in web applications using the MultiSearcher object?
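The create-once, reuse-afterwards approach described above can be sketched in plain Java as follows. SharedHolder is a hypothetical helper, not a Lucene or servlet API class, and the sketch assumes the object being shared (here it would be the searcher) is itself safe to use from many threads at once.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Lucene or the servlet API): builds an
 * expensive object on first use and hands the same instance to every caller.
 */
final class SharedHolder<T> {
    private final Supplier<T> factory;
    private volatile T instance; // volatile so all threads see the built object

    SharedHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    // Only the first caller pays the construction cost.
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

A servlet would keep one such holder in the application context and call get() on each request, so the 5 second construction happens once instead of once per search. The holder only guards construction; it does not add any locking around the searches themselves.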
Indexing Strategy for 20 million documents
I am a new user of Lucene. I am looking to index over 20 million documents (and a lot more someday) and am looking for ideas on the best indexing/search strategy. Which will optimize the Lucene search, one index or multiple indexes? Do I create multiple indexes and merge them all together? Or do I create multiple indexes and search on the multiple indexes? Any helpful ideas would be appreciated!