I'm not sure how big a win this would be, since the OS will cache those in RAM, and the CPU cost there (pulling from the OS's cache and reprocessing) is probably not high.
Optimizing search is interesting, because it's the wicked slow queries that you need to make faster, even when it's at the expense of wicked fast queries. If you make a wicked fast query 3X slower (e.g., 1 ms -> 3 ms), it's almost harmless in nearly all apps. So this makes things like PFOR (and LUCENE-1458, to enable pluggable codecs for postings) important, since it addresses the very large queries. In fact, for very large postings we should do PFOR minus the exceptions, i.e., a simple N-bit encode, even if it wastes some bits.

Mike

On Thu, Apr 2, 2009 at 1:52 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> 4) An additional (possibly contrib) module is caching the results of
> TermQueries. In looking at the TermQuery code, would we need to cache the
> entire docs and freqs as arrays, which would be a memory hog?
>
> On Wed, Apr 1, 2009 at 4:05 PM, Jason Rutherglen
> <jason.rutherg...@gmail.com> wrote:
>>
>> Now that LUCENE-1516 is close to being committed, perhaps we can
>> figure out the priority of other issues:
>>
>> 1. Searchable IndexWriter RAM buffer
>>
>> 2. Finish up benchmarking and perhaps implement passing
>> filters to the SegmentReader level
>>
>> 3. Deleting by doc id using IndexWriter
>>
>> With 1) I'm interested in how we will lock a section of the
>> bytes for use by a given reader. We would not actually lock
>> them, but we need to set aside the bytes such that, for example,
>> if the postings grow, TermDocs iteration does not progress
>> beyond its limits. Are there any modifications needed to
>> the RAM buffer format? How would the term table be stored? We
>> would not be using the current hash method?
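
[Editor's note: for readers unfamiliar with the trade-off Mike describes above, here is a rough standalone sketch of the "simple N-bit encode" idea (PFOR with the exception machinery dropped): choose one bit width per block, wide enough for the largest delta, and pack every value at that width. The class and method names are hypothetical; this is not Lucene's actual codec code, just an illustration of the technique.]

```java
// A standalone sketch (not Lucene's codec API) of "PFOR minus the exceptions":
// pick one bit width for the whole block -- wide enough for the largest value --
// and pack every value at that width. Small values waste a few bits, but decode
// is a tight loop with no exception patching.
public class SimpleNBitBlock {

  /** Packs values[0..count) into longs using 'bits' bits per value. */
  static long[] pack(int[] values, int count, int bits) {
    long[] packed = new long[(count * bits + 63) / 64];
    for (int i = 0; i < count; i++) {
      long v = values[i] & 0xFFFFFFFFL;
      int bitPos = i * bits;
      int word = bitPos >>> 6;
      int shift = bitPos & 63;
      packed[word] |= v << shift;
      if (shift + bits > 64) {               // value straddles two words
        packed[word + 1] |= v >>> (64 - shift);
      }
    }
    return packed;
  }

  /** Unpacks count values of 'bits' bits each. */
  static int[] unpack(long[] packed, int count, int bits) {
    int[] out = new int[count];
    long mask = bits == 64 ? -1L : (1L << bits) - 1;
    for (int i = 0; i < count; i++) {
      int bitPos = i * bits;
      int word = bitPos >>> 6;
      int shift = bitPos & 63;
      long v = packed[word] >>> shift;
      if (shift + bits > 64) {               // read the straddled high bits
        v |= packed[word + 1] << (64 - shift);
      }
      out[i] = (int) (v & mask);
    }
    return out;
  }

  /** Smallest width that can hold every value in the block. */
  static int bitsRequired(int[] values, int count) {
    int max = 0;
    for (int i = 0; i < count; i++) {
      max |= values[i];
    }
    return Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
  }

  public static void main(String[] args) {
    // Doc-id deltas for an imaginary posting block: mostly small, one outlier.
    int[] deltas = {3, 1, 7, 2, 200, 5, 1, 9};
    int bits = bitsRequired(deltas, deltas.length);   // 8 bits because of 200
    long[] packed = pack(deltas, deltas.length, bits);
    int[] restored = unpack(packed, deltas.length, bits);
    System.out.println("bits/value=" + bits + " ok="
        + java.util.Arrays.equals(deltas, restored));
  }
}
```

The waste Mike is willing to accept is visible in the example: a single outlier (200) forces the whole block to 8 bits per value, whereas PFOR would encode the block at a narrower width and patch the outlier as an exception. Dropping the exceptions trades a few bits for a simpler, faster decode loop on very large postings.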