Re: Improving Index Search Performance

2008-03-26 Thread Shailendra Mudgal
Hi All, Thanks for your reply. I would like to mention here that the companyId is a multivalued field. I also tried Paul's suggestions, but there doesn't seem to be much gain. The searcher.doc() method is still taking almost the same amount of time. You can use the FieldCache to look up the companyId for

Re: Improving Index Search Performance

2008-03-26 Thread Ian Lea
Hi The bottom line is that reading fields from docs is expensive. FieldCache will, I believe, load fields for all documents but only once - so the second and subsequent times it will be fast. Even without using a cache it is likely that things will speed up because of caching by the OS. If

Re: Improving Index Search Performance

2008-03-26 Thread Toke Eskildsen
On Wed, 2008-03-26 at 10:45, Ian Lea wrote: If you've got plenty of memory vs index size you could look at RAMDirectory or MMapDirectory. Or how about some solid state disks? Someone recently posted some very impressive performance stats. That was probably me. A (very) quick test for

Re: Improving Index Search Performance

2008-03-26 Thread Shailendra Mudgal
The bottom line is that reading fields from docs is expensive. FieldCache will, I believe, load fields for all documents but only once - so the second and subsequent times it will be fast. Even without using a cache it is likely that things will speed up because of caching by the OS. As I

Re: Improving Index Search Performance

2008-03-26 Thread Ian Lea
Well, caching is designed to use memory. If you are saying that you haven't got enough memory to cache all your values then caching them all isn't going to work, at any level. If you implemented your own cache you could control memory usage with an LRU algorithm or whatever made sense for your
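A size-bounded LRU cache of the kind suggested here could be sketched in Java roughly as follows. This is an illustration only, not code from the thread: the class name LruCache and the maxEntries parameter are hypothetical, and the sketch relies on java.util.LinkedHashMap in access order rather than on anything in Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache sketch. Once more than maxEntries entries are
 * stored, the least recently accessed entry is evicted.
 */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // The third argument (accessOrder = true) makes iteration order run
        // from least recently accessed to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // removes the least recently accessed entry.
        return size() > maxEntries;
    }
}
```

Such a cache could, for example, map a document id to its companyId values, with maxEntries chosen to fit the memory that is actually available. LinkedHashMap is not thread-safe, so concurrent searches would need external synchronization.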

Re: Improving Index Search Performance

2008-03-26 Thread Paul Elschot
Since you're using all the results for a query, and ignoring the score value, you might try and do the same thing with a relational database. But I would not expect that to be much faster, especially when using a field cache. Other than that, you could also go the other way, and try and add more

Improving Index Search Performance

2008-03-25 Thread Shailendra Mudgal
Hi Everyone, We are using Lucene to search an index of around 20 GB with around 3 million documents. We are facing performance issues loading large results from the index. Based on the various posts on the forum and documentation, we have made the following code changes to improve the

Re: Improving Index Search Performance

2008-03-25 Thread Toke Eskildsen
On Tue, 2008-03-25 at 18:13 +0530, Shailendra Mudgal wrote: We are using Lucene to search an index of around 20 GB with around 3 million documents. We are facing performance issues loading large results from the index. [...] After all these changes, it seems to be taking around 90 secs

Re: Improving Index Search Performance

2008-03-25 Thread Paul Elschot
Shailendra, Have a look at the javadocs of HitCollector: http://lucene.apache.org/java/2_3_0/api/core/org/apache/lucene/search/HitCollector.html The problem is with the use of the disk head: when retrieving the documents during collecting, the disk head has to move between the inverted index and
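One way to read this advice is to split the work into two phases: record only the matching document ids while the search runs, then load the documents afterwards in ascending id order. The sketch below shows the bookkeeping with plain java.util classes. The class TwoPhaseCollector and its method names are hypothetical. The collect(int doc, float score) signature is modelled on HitCollector in Lucene 2.3, but the class does not extend it, so that it stays self-contained.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/**
 * Hypothetical two-phase collector sketch: phase one only records ids,
 * phase two hands them back sorted so documents can be loaded afterwards.
 */
class TwoPhaseCollector {
    private final BitSet matches = new BitSet();

    /** Phase one: remember the doc id; the score is ignored, as in the thread. */
    void collect(int doc, float score) {
        matches.set(doc);
    }

    /** Phase two: return the collected doc ids in ascending order. */
    List<Integer> docIdsInOrder() {
        List<Integer> ids = new ArrayList<Integer>();
        for (int d = matches.nextSetBit(0); d >= 0; d = matches.nextSetBit(d + 1)) {
            ids.add(d);
        }
        return ids;
    }
}
```

In a real application the body of collect would sit inside a HitCollector subclass passed to the searcher, and the loop over docIdsInOrder() would call searcher.doc(id) once the search has finished. Whether this helps for a given index would have to be measured.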

Re: Improving Index Search Performance

2008-03-25 Thread Chris Hostetter
: *We also read in one of the posts that we should use bitSet.set(doc) : instead of calling searcher.doc(id). But we are unable to understand how : this might help in our case since we will anyway have to load the document : to get the other required field(company_id). Also we observed that