Re: How to improve document retrieval speed.

eks dev Sat, 04 Nov 2006 11:15:05 -0800

I would strongly suggest not storing these fields in lucene, just keep them as 
files and store some kind of url to get them latter. that will boost your speed 
heavily. If you really, really need to store documents in lucene, try some 
compression
Also, so many fields hurt performance, any chance you could pack some of them 
together?

Did you have  a chance to read Lucene in Action, could help you greatly. 

good luck

----- Original Message ----
From: Sunil Kumar PK <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Saturday, 4 November, 2006 4:12:57 PM
Subject: Re: How to improve document retrieval speed.

Hi,

I am using Lucene's Remote Parallel Multisearcher with 10 nodes in my search
cluster having 200+ distributed index fragments (1 Index fragment = 4GB). I
have 30+ fields in my index, and I am storing a master XML file (contains 5
to 30 pages of information) in one field. I also  have two web servers
behind a load-balancer.

My system works as follows.

I) When the user search for a document, the search result page will display
3 stored fields from the index (Document Number,Title and Date).

II) On clicking a particular result I am searching again(Since I am using a
load-balancer I can't use the same RemoteSearchable object) for that
particular 'document number' and retrieve the stored master XML from the
index.

The first step is ok with me, but I want to improve the speed in the second
part. Is it necessary to do the scoring and weight calculation in the second
step?

Thanks,
Sunil

On 11/4/06, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> You probably can skip the QueryParser part and just construct a
> TermQuery with your term and field.  That will save you a few ticks.
> I'm betting you have just included the code below for example, so
> this may not apply, however, you want to make sure you aren't
> creating the IndexSearcher every time you want to run the query.  See
> the list archives for info on warming/caching the searcher.
>
> What kind of runtimes are you experiencing?
>
> On Nov 4, 2006, at 6:46 AM, Sunil Kumar PK wrote:
>
> > Hi,
> >
> > In my index there is a unique field, "MY_DOCNO".
> >
> > If I want get a document from the index with MY_DOCNO=1000,  I am
> > using
> > following code,
> >
> > IndexSearcher isearcher = new IndexSearcher("myindex1");
> > QueryParser qp = new QueryParser("MY_DOCNO", new StandardAnalyzer());
> >
> > Query query = qp.parse("MY_DOCNO:1000");
> > Hits hits = isearcher.search(query);
> >
> > Since I have a unique field in my index, is there any other method,
> > that can
> > retrieve the document faster than this?
> >
> > Thanks,
> > Sunil
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

___________________________________________________________ 
All New Yahoo! Mail – Tired of [EMAIL PROTECTED]@! come-ons? Let our SpamGuard 
protect you. http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: How to improve document retrieval speed.

Reply via email to