This (HashDocSet, and any other impls that handle the sparse case well) could be useful to have in Lucene's core.
For example, for certain MultiTermQuerys we have CONSTANT_SCORE_AUTO_REWRITE, which uses iffy-smelling heuristics to pick the best cutover point from ConstantScoreQuery(BooleanQuery(<OR of Terms>)) to FILTER_REWRITE, because FILTER_REWRITE is costly in the sparse case.

Mike

http://blog.mikemccandless.com

On Tue, Apr 5, 2011 at 10:53 AM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> I think Solr has a HashDocSet implementation?
>
> On Tue, Apr 5, 2011 at 3:19 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>> Can we simply factor out (poach!) those useful-sounding classes from
>> Nutch into Lucene?
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman <a...@thorntothehorn.org>
>> wrote:
>>> I'm converting a Lucene 2.3.2 deployment to 2.4.1 (with a view to going
>>> to 2.9.4).
>>>
>>> Many of our indexes are 5M+ documents; however, only a small subset of
>>> these are relevant to any user. As a DocIdSet backed by a BitSet or
>>> OpenBitSet is rather inefficient in terms of memory use, what is the
>>> recommended DocIdSet implementation to use in this scenario?
>>>
>>> Seems like SortedVIntList can be used to store the info, but it has no
>>> methods to build the list in the first place, requiring an array or bitset
>>> in the constructor.
>>>
>>> I had used Nutch's DocSet and HashDocSet implementations in my 2.3.2
>>> deployment, but want to move away from that Nutch dependency, so wondered if
>>> Lucene had a way to do this?
>>>
>>> Thanks

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
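
To make the sparse case discussed in this thread concrete, here is a minimal sketch of a doc id set whose memory grows with the number of matching documents rather than with the size of the index, and which can be built incrementally. It is only an illustration: SparseDocSet and all of its methods are hypothetical names, not Lucene, Solr, or Nutch classes, and it assumes doc ids are added in strictly increasing order (as they would be when collecting hits in index order).

```java
import java.util.Arrays;

/**
 * Hypothetical sparse doc id set (not a Lucene, Solr, or Nutch class).
 * Doc ids are kept in a sorted int[] so that memory use depends on how
 * many documents match, not on how many documents the index holds.
 */
final class SparseDocSet {
    private int[] docs = new int[8];
    private int size = 0;

    /** Appends a doc id. Assumption: ids arrive in strictly increasing order. */
    void add(int docId) {
        if (docId < 0 || (size > 0 && docId <= docs[size - 1])) {
            throw new IllegalArgumentException(
                "doc ids must be non-negative and strictly increasing: " + docId);
        }
        if (size == docs.length) {
            docs = Arrays.copyOf(docs, size * 2); // grow geometrically
        }
        docs[size++] = docId;
    }

    /** Number of doc ids stored. */
    int size() {
        return size;
    }

    /** Membership test by binary search over the filled part of the array. */
    boolean contains(int docId) {
        return Arrays.binarySearch(docs, 0, size, docId) >= 0;
    }

    /** Returns the first stored doc id >= target, or -1 if there is none. */
    int advance(int target) {
        int idx = Arrays.binarySearch(docs, 0, size, target);
        if (idx < 0) {
            idx = -idx - 1; // binarySearch encodes the insertion point
        }
        return idx < size ? docs[idx] : -1;
    }

    /** Rough payload size in bytes: 4 bytes per allocated slot. */
    long approxBytes() {
        return 4L * docs.length;
    }
}
```

As a rough comparison under these assumptions: a bit set over a 5M-document index needs about 5,000,000 / 8 = 625,000 bytes no matter how few documents match, while 1,000 matching doc ids stored this way need about 4 KB. The trade-off is that membership tests cost a binary search instead of a single bit lookup, so a set like this only pays off while the matches stay sparse.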