Cool. I'll give it a try. Looks like extending FilterIndexReader is the way to go. Or possibly I could cache the compressed form at a lower level, getting the best of both worlds. I'll look into both approaches, profile the app, and post my results.
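For what it's worth, here is a rough sketch of the caching side of that idea: an LRU map from term text to parsed posting lists (parallel docNo/freq arrays), which a FilterIndexReader subclass could consult before falling back to the on-disk TermDocs. The class and field names here are hypothetical, and this uses only the JDK rather than real Lucene APIs; eviction rides on LinkedHashMap's access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU cache of parsed postings keyed by term text.
// Not a real Lucene class; a FilterIndexReader subclass could check a cache
// like this before reading TermDocs from disk.
class TermDocsCache {
    // One cached posting list: parallel arrays of doc numbers and frequencies.
    static final class Postings {
        final int[] docs;
        final int[] freqs;
        Postings(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }
    }

    private final Map<String, Postings> cache;

    TermDocsCache(final int maxEntries) {
        // accessOrder=true makes LinkedHashMap iterate least-recently-used
        // first; removeEldestEntry evicts once the cap is exceeded.
        this.cache = new LinkedHashMap<String, Postings>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Postings> eldest) {
                return size() > maxEntries;
            }
        };
    }

    Postings get(String term) { return cache.get(term); }
    void put(String term, Postings p) { cache.put(term, p); }
    int size() { return cache.size(); }

    public static void main(String[] args) {
        TermDocsCache c = new TermDocsCache(2);
        c.put("lucene", new Postings(new int[]{1, 5}, new int[]{2, 1}));
        c.put("cache", new Postings(new int[]{3}, new int[]{1}));
        c.get("lucene");  // touch "lucene" so "cache" becomes the eldest entry
        c.put("index", new Postings(new int[]{7}, new int[]{4}));
        System.out.println(c.get("cache") == null);   // evicted -> true
        System.out.println(c.get("lucene").docs[1]);  // 5
    }
}
```

The popular-term skew Doug describes below is exactly what an LRU policy exploits: frequently searched terms stay resident while rare ones fall out.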
----- Original Message -----
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, July 27, 2004 8:33 PM
Subject: Re: Caching of TermDocs

> John Patterson wrote:
> > I would like to hold a significant amount of the index in memory but
> > use the disk index as a spill over. Obviously the best situation is to
> > hold in memory only the information that is likely to be used again
> > soon. It seems that caching TermDocs would allow popular search terms
> > to be searched more efficiently while the less common terms would need
> > to be read from disk.
>
> The operating system already caches recent disk i/o. So what you'd save
> primarily would be the overhead of parsing the data. However, the parsed
> form, a sequence of docNo and freq ints, is nearly eight times as large
> as its compressed size in the index. So your cache would consume a lot
> of memory.
>
> Whether this provides much overall speedup depends on the distribution
> of common terms in your query traffic. If you have a few terms that are
> searched very frequently then it might pay off. In my experience with
> general-purpose search engines this is not usually the case: folks seem
> to use rarer words in queries than they do in ordinary text. But in
> some search applications perhaps the traffic is more skewed. Only some
> experiments would tell for sure.
>
> Doug
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
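P.S. A back-of-envelope check of the "nearly eight times" figure Doug quotes: in memory each posting is two 4-byte ints (docNo + freq), while on disk Lucene stores delta-encoded VInts, which for dense posting lists often come to roughly one byte per posting. The posting count and one-byte-compressed figure below are illustrative assumptions, not measurements of a real index.

```java
// Illustrative arithmetic only: compares the in-memory size of parsed
// postings (two 4-byte ints each) against an assumed ~1 byte/posting
// for the VInt-compressed form on disk.
class PostingMemoryEstimate {
    public static void main(String[] args) {
        long postings = 1000000L;                // hypothetical posting count
        long parsedBytes = postings * (4 + 4);   // docNo int + freq int
        long compressedBytes = postings * 1;     // ~1 byte/posting as VInts
        System.out.println("parsed:     " + parsedBytes + " bytes");
        System.out.println("compressed: " + compressedBytes + " bytes");
        System.out.println("ratio:      " + (parsedBytes / compressedBytes) + "x");
    }
}
```

So a cache holding even a modest slice of the index in parsed form costs several times what the same postings cost on disk, which is the memory trade-off Doug is pointing at.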
