That still seems excessive. Are you measuring your first sort? Lucene builds up caches to help sort with the first few *sorts* that happen, so that's a possibility.
But if that isn't the case, I think you need to slap a profiler on the problem and see where you're spending your time. I'd also be careful about what you measure when you measure your query. For instance, I've been fooled by measuring the total time to get an assembled response and it turned out that the time was spent fetching the documents, NOT searching/sorting. Try measuring various operations. In particular comment out anything having to do with assembling the response. Perhaps just substitute in making a list of the doc IDs and time *that*. Slowly build back up to your current app, and I suspect that one of the steps will cause your time to increase dramatically. How many documents are you assembling to respond? If you're assembling 40,000 hits, then 10-20 seconds may not be unreasonable. Best Erick On Tue, Sep 23, 2008 at 12:51 AM, Ganesh - yahoo <[EMAIL PROTECTED]>wrote: > System Specification: > Processor speed: 2Ghz > Ram: 3 GB > > IndexDB size 5 GB. > Total documents indexed: 5.8 million. > > To collect hits, i have replaced Hits object with TopFieldDocs. This has > improved the search performance better. Sorting is faster on date / long > field, but it is very slow on string field. In a standalone application it > took 10 - 20 secs to dispaly the results sorted on string field. [I am not > opening indexsearcher every time]. > > Regards > Ganesh > > > > ----- Original Message ----- From: "Erick Erickson" < > [EMAIL PROTECTED]> > To: <java-user@lucene.apache.org> > Sent: Monday, September 22, 2008 6:29 PM > > Subject: Re: Exception while doing sorting > > > Sure, your tomcat instance is assigning some amount of memory >> to the JVM that your searcher is running in. Of course, now you're >> going to ask me now to increase that number... I have no idea but >> I've seen this question multiple times in the mail archive, >> so a search there or in the tomcat docs should let you know. >> >> But 12 seconds is still a long time to wait for a search to complete. >> Can you tell us more about your search? >> >> For instance, are you opening a searcher for each request? That's bad. >> Are you sorting? that can take a long time, but again the first one >> will have a performance penalty as things are cached. >> >> There are a number of tips here: >> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed >> >> Best >> Erick >> >> On Mon, Sep 22, 2008 at 7:45 AM, Ganesh - yahoo <[EMAIL PROTECTED] >> >wrote: >> >> My index crossed 5 GB and 5 million documents are indexed. >>> My query includes searching and sorting returns 40000 hits. >>> >>> If i do search from a standalone application, the results are returned in >>> 12 seconds. If i perform the same from web application running inside >>> Tomcat, out of memory exception is occured. >>> >>> Could any one clarify it? >>> >>> Regards >>> Ganesh >>> >>> ----- Original Message ----- From: "Ganesh - yahoo" < >>> [EMAIL PROTECTED] >>> > >>> To: <java-user@lucene.apache.org> >>> Sent: Friday, September 19, 2008 10:56 AM >>> >>> Subject: Re: Exception while doing sorting >>> >>> >>> Ok. If i distribure the indexes, whether sorting would be faster? >>> >>>> >>>> In Lucene user group mailing list, most emails suggests to use single >>>> indicies. Searching across the indexes may not be slower? >>>> >>>> Lucene uses FieldCache for sorting on non-tokenized field and tries to >>>> >>>>> maintain fields from all your 4 millions documents, even if you need >>>>>> to sort only 4000 docs. >>>>>> >>>>>> Don't know why Lucene keeps all terms in FieldCache for sorting. It >>>>> >>>> supposed to sort only the hits. Please clarify? >>>> >>>> Regards >>>> Ganesh >>>> >>>> ----- Original Message ----- From: "Otis Gospodnetic" < >>>> [EMAIL PROTECTED]> >>>> To: <java-user@lucene.apache.org> >>>> Sent: Thursday, September 18, 2008 12:17 PM >>>> Subject: Re: Exception while doing sorting >>>> >>>> >>>> If your index is increasing in size so fast, you should start thinking >>>> >>>>> about sharding your index (breaking it into multiple smaller indices >>>>> that >>>>> each fits on its server) and searching across them (aka distributed >>>>> search). >>>>> >>>>> Yes, Lucene can handle millions of records if run on adequate hardware >>>>> and if used correctly. >>>>> >>>>> Otis >>>>> -- >>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch >>>>> >>>>> >>>>> >>>>> ----- Original Message ---- >>>>> >>>>> From: Ganesh - yahoo <[EMAIL PROTECTED]> >>>>>> To: java-user@lucene.apache.org >>>>>> Sent: Thursday, September 18, 2008 12:53:19 AM >>>>>> Subject: Re: Exception while doing sorting >>>>>> >>>>>> My index is growing by 1 million records per day. How much memory do i >>>>>> need >>>>>> to increase. >>>>>> >>>>>> What kind of sorting algorithm being used in Lucene. Is this efficient >>>>>> enough to handle millions of records. >>>>>> >>>>>> Whether we could do sorting using our own algorithm? >>>>>> >>>>>> Regards >>>>>> Ganesh >>>>>> >>>>>> ----- Original Message ----- From: "Fuad Efendi" >>>>>> To: >>>>>> Sent: Wednesday, September 17, 2008 7:28 PM >>>>>> Subject: Re: Exception while doing sorting >>>>>> >>>>>> >>>>>> Increase memory. >>>>>> >>>>>> Lucene uses FieldCache for sorting on non-tokenized field and tries to >>>>>> maintain fields from all your 4 millions documents, even if you need >>>>>> to sort only 4000 docs. >>>>>> ============== >>>>>> http://www.tokenizer.org/bot.html >>>>>> >>>>>> >>>>>> Quoting Ganesh - yahoo : >>>>>> >>>>>> > Hello all, >>>>>> > >>>>>> > I am have indexed more than 4 million documents. My query fetches >>>>>> > 300,000 hits. If i perform sorting on any field, then tomcat reports >>>>>> > out of memory exception. >>>>>> > Sometimes the query results may be around 1000, but sorting on any >>>>>> > field might take more than 30 - 50 secs. >>>>>> > >>>>>> > I don't know what's going wrong. >>>>>> > >>>>>> > My index searcher is static object and it is getting refreshed every >>>>>> > minute. JSP pages directly calls the index searcher object and > >>>>>> performs >>>>>> > search. >>>>>> > >>>>>> > Regards >>>>>> > Ganesh >>>>>> > >>>>>> > >>>>>> > Send instant messages to your online friends >>>>>> > http://in.messenger.yahoo.com >>>>>> > >>>>>> --------------------------------------------------------------------- >>>>>> > To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>>> > For additional commands, e-mail: [EMAIL PROTECTED] >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>>>> >>>>>> Send instant messages to your online friends >>>>>> http://in.messenger.yahoo.com >>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>>>> >>>>>> >>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>>> >>>>> >>>>> Send instant messages to your online friends >>>> http://in.messenger.yahoo.com >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>> >>>> >>>> Send instant messages to your online friends >>> http://in.messenger.yahoo.com >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >>> >> > Send instant messages to your online friends http://in.messenger.yahoo.com > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >