Hi,
Consider a Query with e.g. 4 terms (t1,t2,t3,t4). I want to retrieve all
documents which contain at least e.g. 3 of the queryterms. How can I
implement this?
The first idea is to use BooleanQueries such as
(t1 and t2 and t3 and t4) or (t1 and t2 and t3) or(t1 and t2 and t4) or
(t1 and t3 and t4).....
But the perfomance is not very good when I have 20 queryterms and I want
to retrieve all docs which contain at least 15 of the terms.
Can I modify the skipto-algorithm in ConjunctionScorer in order to
achieve this?
Thanks
Barbara
PS: Has anybody written a Statistics-class which says how many term and
different terms are in the index. And perhaps computes the mean
length of the documents in the index with the standard deviation?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]