I'm not sure how big a win this would be, since the OS will cache those in RAM, and the CPU cost there (pulling from the OS's cache and reprocessing) is probably not high.
Optimizing search is interesting, because it's the wicked slow queries that you need to make faster, even when it's at the expense of wicked fast queries. If you make a wicked fast query 3X slower (e.g., 1 ms -> 3 ms), it's almost harmless in nearly all apps. So this makes things like PFOR (and LUCENE-1458, to enable pluggable codecs for postings) important, since it addresses the very large queries. In fact, for very large postings we should do PFOR minus the exceptions, i.e., a simple N-bit encode, even if it wastes some bits.

Mike

On Thu, Apr 2, 2009 at 1:52 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> 4) An additional (possibly contrib) module is caching the results of
> TermQueries. In looking at the TermQuery code, would we need to cache the
> entire docs and freqs as arrays, which would be a memory hog?
>
> On Wed, Apr 1, 2009 at 4:05 PM, Jason Rutherglen
> <jason.rutherg...@gmail.com> wrote:
>>
>> Now that LUCENE-1516 is close to being committed, perhaps we can
>> figure out the priority of other issues:
>>
>> 1. Searchable IndexWriter RAM buffer
>>
>> 2. Finish up benchmarking and perhaps implement passing
>> filters to the SegmentReader level
>>
>> 3. Deleting by doc id using IndexWriter
>>
>> With 1) I'm interested in how we will lock a section of the
>> bytes for use by a given reader. We would not actually lock
>> them, but we need to set aside the bytes such that, for example,
>> if the postings grow, TermDocs iteration does not progress
>> beyond its limits. Are there any modifications needed to
>> the RAM buffer format? How would the term table be stored? We
>> would not be using the current hash method?
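
[Editor's note: for readers unfamiliar with the trade-off Mike describes above, here is a rough standalone sketch of the "simple N-bit encode" idea (PFOR with the exception machinery dropped): choose one bit width per block, wide enough for the largest delta, and pack every value at that width. The class and method names are hypothetical; this is not Lucene's actual codec code, just an illustration of the technique.]

```java
// A standalone sketch (not Lucene's codec API) of "PFOR minus the exceptions":
// pick one bit width for the whole block -- wide enough for the largest value --
// and pack every value at that width. Small values waste a few bits, but decode
// is a tight loop with no exception patching.
public class SimpleNBitBlock {

  /** Packs values[0..count) into longs using 'bits' bits per value. */
  static long[] pack(int[] values, int count, int bits) {
    long[] packed = new long[(count * bits + 63) / 64];
    for (int i = 0; i < count; i++) {
      long v = values[i] & 0xFFFFFFFFL;
      int bitPos = i * bits;
      int word = bitPos >>> 6;
      int shift = bitPos & 63;
      packed[word] |= v << shift;
      if (shift + bits > 64) {               // value straddles two words
        packed[word + 1] |= v >>> (64 - shift);
      }
    }
    return packed;
  }

  /** Unpacks count values of 'bits' bits each. */
  static int[] unpack(long[] packed, int count, int bits) {
    int[] out = new int[count];
    long mask = bits == 64 ? -1L : (1L << bits) - 1;
    for (int i = 0; i < count; i++) {
      int bitPos = i * bits;
      int word = bitPos >>> 6;
      int shift = bitPos & 63;
      long v = packed[word] >>> shift;
      if (shift + bits > 64) {               // read the straddled high bits
        v |= packed[word + 1] << (64 - shift);
      }
      out[i] = (int) (v & mask);
    }
    return out;
  }

  /** Smallest width that can hold every value in the block. */
  static int bitsRequired(int[] values, int count) {
    int max = 0;
    for (int i = 0; i < count; i++) {
      max |= values[i];
    }
    return Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
  }

  public static void main(String[] args) {
    // Doc-id deltas for an imaginary posting block: mostly small, one outlier.
    int[] deltas = {3, 1, 7, 2, 200, 5, 1, 9};
    int bits = bitsRequired(deltas, deltas.length);   // 8 bits because of 200
    long[] packed = pack(deltas, deltas.length, bits);
    int[] restored = unpack(packed, deltas.length, bits);
    System.out.println("bits/value=" + bits + " ok="
        + java.util.Arrays.equals(deltas, restored));
  }
}
```

The waste Mike is willing to accept is visible in the example: a single outlier (200) forces the whole block to 8 bits per value, whereas PFOR would encode the block at a narrower width and patch the outlier as an exception. Dropping the exceptions trades a few bits for a simpler, faster decode loop on very large postings.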