Re: Improving search performance

Jason Rutherglen Sat, 24 May 2008 08:22:16 -0700

There needs to be a solution to that problem.  I noticed it several years
ago, which is why I have designed systems around MultiSearcher concepts ever
since.  Now that there is reopen, there should only be one instance of
deleted docs per IndexReader.  Editing the live deleted docs does not seem
like something most people do.  One should be able to delete docs, flush, and
then have to reopen to see the changes.  Or this could be an option made
available by extending SegmentReader.  I also think it is important to keep
the ability to delete docs using an open IndexReader rather than deprecate
it, because realtime search systems cannot switch between IndexReaders and
IndexWriters.
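
To make the "delete, flush, reopen" idea concrete, here is a rough sketch in
plain Java.  It is not Lucene code: SnapshotReader and its methods are
hypothetical names made up for illustration, and a BitSet stands in for the
deleted docs.  The point is only that if the deleted set visible to searches
is frozen when the reader is opened, the isDeleted check on the search path
needs no lock, and only the delete/reopen path is synchronized.

```java
import java.util.BitSet;

// Hypothetical stand-in for a reader; NOT Lucene's IndexReader or SegmentReader.
final class SnapshotReader {
    // Deleted-docs set frozen at open time. It is never mutated afterwards,
    // so concurrent searches can read it without synchronization.
    private final BitSet deleted;
    // Deletes buffered since this reader was opened; guarded by "this".
    private final BitSet pending;

    SnapshotReader(BitSet deletedAtOpen) {
        this.deleted = (BitSet) deletedAtOpen.clone();
        this.pending = (BitSet) deletedAtOpen.clone();
    }

    // Hot path, called for candidate hits during search: lock-free read.
    boolean isDeleted(int docId) {
        return deleted.get(docId);
    }

    // Buffers a delete. Searches running on this reader do not see it.
    synchronized void deleteDocument(int docId) {
        pending.set(docId);
    }

    // Plays the role of flush-then-reopen: the returned reader sees every
    // delete buffered so far, while this reader stays unchanged.
    synchronized SnapshotReader reopen() {
        return new SnapshotReader(pending);
    }
}
```

With this shape, a caller would delete through the open reader, call reopen(),
and hand the new reader to subsequent searches, while in-flight searches
finish against the old snapshot.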


On Fri, May 23, 2008 at 11:24 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> Hi Emmanuel,
>
> Because there are some synchronized methods, like the one that checks
> whether a doc is deleted, that get called during search.  If you have a pile
> of threads (the original poster mentioned 100), there could be contention
> around those methods.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
> ----- Original Message ----
> > From: Emmanuel Bernard <[EMAIL PROTECTED]>
> > To: [email protected]
> > Sent: Friday, May 23, 2008 6:41:36 PM
> > Subject: Re: Improving search performance
> >
> > Hi
> > Hibernate Search does not pool the Searcher but pools the underlying
> > IndexReader(s). From what I've seen, a Searcher is stateless and all
> > the state is kept in the Readers, so this is essentially equivalent to
> > reusing the searcher.
> >
> > Out of curiosity, why is a pool of Searchers more efficient?
> >
> > Emmanuel
> >
> > On  May 22, 2008, at 13:22, Otis Gospodnetic wrote:
> >
> > > Some quick feedback.  Those are all very expensive queries
> > > (wildcards and ranges).  The first thing I'd do is try without
> > > Hibernate Search (to make sure HS is not the bottleneck).  100
> > > threads is a lot.  I'm guessing you are reusing your searcher, which
> > > is good, but you will actually improve performance a bit if you work
> > > with a small pool of searchers instead of a single searcher.
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > >
> > > ----- Original Message ----
> > >> From: Rakesh Shete
> > >> To: [EMAIL PROTECTED]; [email protected]
> > >> Sent: Thursday, May 22, 2008 1:16:13 PM
> > >> Subject: Improving search performance
> > >>
> > >>
> > >> Hi all,
> > >>
> > >> I have an index of size 85MB. My query looks as follows:
> > >>
> > >> +(t:boss* d:boss* dd:boss* tg:boss*) +st:act +ntid:0 +cid:1 +dr:[20080410 TO 20081010] +rT:[002 TO 005]
> > >>
> > >> All the fields used in the query are indexed and stored in the index
> > >> (Indexed & Stored).
> > >>
> > >> The query response time is around 30 seconds when running multiple
> > >> simultaneous threads (~100). The number of matches is ~30k, but I
> > >> retrieve only the top 100 results. I am using Hibernate Search, which
> > >> is a wrapper around Lucene. I retrieve the "id" field from the index,
> > >> which is also indexed and stored.
> > >>
> > >> What is the approach that I should take for improving the
> > >> performance?
> > >>
> > >> Will just indexing the values without storing them work (Index &
> > >> UnStored)?
> > >>
> > >> My machine configuration is:
> > >> P4 2.66GHz 1.99 GB RAM
> > >>
> > >> The code for searching runs in a JBoss application server which has a
> > >> maximum heap size of 1024MB. When these 100 threads are running in the
> > >> application server, the CPU utilization is 100% and JBoss consumes all
> > >> of the heap.
> > >>
> > >> Any pointers on index optimization would be really appreciated.
> > >>
> > >> --Regards,
> > >> Rakesh Shete
> > >>
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
