This (HashDocSet, and any other impls that handle the sparse case
well) could be useful to have in Lucene's core.
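
To make the memory argument concrete, here is a rough sketch of a sparse
doc-id set. It is not Solr's or Nutch's HashDocSet (those are hash-based);
it swaps in a sorted int[] with binary search, and the class and method
names (SparseDocSet, approxBytes, bitSetBytes) are made up for illustration.

```java
import java.util.Arrays;

/**
 * Hypothetical sparse doc-id set: a sorted, de-duplicated int[] probed by
 * binary search. Illustration only; not a Lucene, Solr, or Nutch class.
 */
final class SparseDocSet {
    private final int[] docs; // sorted, unique doc ids

    SparseDocSet(int[] unsortedDocs) {
        int[] copy = unsortedDocs.clone();
        Arrays.sort(copy);
        // Compact duplicates in place; n counts the unique ids kept so far.
        int n = 0;
        for (int i = 0; i < copy.length; i++) {
            if (i == 0 || copy[i] != copy[i - 1]) {
                copy[n++] = copy[i];
            }
        }
        this.docs = Arrays.copyOf(copy, n);
    }

    /** O(log n) membership test. */
    boolean contains(int doc) {
        return Arrays.binarySearch(docs, doc) >= 0;
    }

    int size() {
        return docs.length;
    }

    /** Payload size only (4 bytes per doc id), ignoring object overhead. */
    long approxBytes() {
        return 4L * docs.length;
    }

    /** Payload size of a plain bitset that covers maxDoc documents. */
    static long bitSetBytes(int maxDoc) {
        return ((long) maxDoc + 7) / 8;
    }
}
```

With a 5M-doc index the bitset payload is the same size no matter how few
docs match, while the sorted array grows only with the number of hits, so
the array wins whenever a small fraction of the index matches.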

For example, for certain MultiTermQuerys we have
CONSTANT_SCORE_AUTO_REWRITE, which uses iffy-smelling heuristics to try
to determine the best cutover point from
ConstantScoreQuery(BooleanQuery(<OR of Terms>)) to FILTER_REWRITE,
because FILTER_REWRITE is costly in the sparse case.
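
The kind of cutover decision I mean looks roughly like the sketch below.
This is not Lucene's actual code: the class name, the Strategy enum, and
both thresholds (termCutoff, docFraction) are assumptions for illustration.

```java
/**
 * Hypothetical sketch of an auto-rewrite cutover decision.
 * Illustration only; names and thresholds are not taken from Lucene.
 */
final class RewriteCutover {
    enum Strategy { BOOLEAN_QUERY, FILTER }

    /**
     * Walks the matching terms in order and switches to the filter strategy
     * as soon as either the number of terms or the number of docs they
     * would visit crosses its threshold.
     *
     * @param docFreqs    doc frequency of each matching term
     * @param maxDoc      number of docs in the index
     * @param termCutoff  term count at which to stop building a BooleanQuery
     * @param docFraction fraction of maxDoc (0..1) at which to cut over
     */
    static Strategy choose(int[] docFreqs, int maxDoc,
                           int termCutoff, double docFraction) {
        long docBudget = (long) (maxDoc * docFraction);
        long docsSeen = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            docsSeen += docFreqs[i];
            if (i + 1 >= termCutoff || docsSeen >= docBudget) {
                return Strategy.FILTER;
            }
        }
        // Few terms touching few docs: the OR-of-terms query stays cheap.
        return Strategy.BOOLEAN_QUERY;
    }
}
```

If the filter side were cheap for sparse results, the thresholds would
matter much less, which is the appeal of a sparse-friendly DocIdSet in core.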

Mike

http://blog.mikemccandless.com

On Tue, Apr 5, 2011 at 10:53 AM, Jason Rutherglen
<jason.rutherg...@gmail.com> wrote:
> I think Solr has a HashDocSet implementation?
>
> On Tue, Apr 5, 2011 at 3:19 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>> Can we simply factor out (poach!) those useful-sounding classes from
>> Nutch into Lucene?
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman <a...@thorntothehorn.org> wrote:
>>> I'm converting a Lucene 2.3.2 deployment to 2.4.1 (with a view to going
>>> to 2.9.4).
>>>
>>> Many of our indexes are 5M+ documents; however, only a small subset of these
>>> are relevant to any user.  As a DocIdSet backed by a BitSet or OpenBitSet
>>> is rather inefficient in terms of memory use, what is the recommended
>>> DocIdSet implementation to use in this scenario?
>>>
>>> Seems like SortedVIntList can be used to store the info, but it has no
>>> methods to build the list in the first place, requiring an array or bitset
>>> in the constructor.
>>>
>>> I had used Nutch's DocSet and HashDocSet implementations in my 2.3.2
>>> deployment, but want to move away from that Nutch dependency, so wondered if
>>> Lucene had a way to do this?
>>>
>>> Thanks
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>>
>>
>
