Re: How to get all matched terms in a PrefixQuery

Rajnish kamboj Tue, 13 Sep 2016 20:11:28 -0700

Thanks Mike

I would rather go with the first approach, using the Scorer.getChildren API
(I will try it).
I had thought of the second approach as well, but you are right, it is costly.


Regards
Raj

On Wednesday 14 September 2016, Michael McCandless <
[email protected]> wrote:

> You can't do this very easily, unfortunately.
>
> The way PrefixQuery runs is to find (globally, across the index) all
> terms that have that prefix.  If there are enough of them, it goes
> term by term marking the documents in a bitset, and then iterates that
> bitset in the end.  So the information of which term matched which
> document is long gone.
>
> If there are few enough terms, it makes a BooleanQuery with N SHOULD
> clauses, and in that limited case, since the child clauses are all
> visiting the same document when it's collected, you might be able to
> use the Scorer.getChildren API in a custom Collector to see (per doc
> collected) which terms are "on" that one document.
>
> You could alternatively store term vectors (but these are slow and
> costly) and load them for each document and iterate the matched prefix
> terms by creating a PrefixTermsEnum.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Sep 13, 2016 at 11:25 AM, Rajnish kamboj
> <[email protected]> wrote:
> > Hi
> >
> > How can I get all matched terms of a document in PrefixQuery?
> >
> > Term t2 = new Term("contents", "br");
> > PrefixQuery query = new PrefixQuery(t2);
> >
> > Suppose I have a few documents with 1000 different terms.
> > The search shows me the documents in which it finds the br words.
> >
> > Now, how can I get all the br words in the document?
> >
> >
> >
> > Thanks
> > Raj
>
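
The two execution paths Mike describes can be illustrated with a toy model in
plain Java. This is only a sketch, not Lucene code: the in-memory index, the
class name PrefixRewriteSketch and both method names are hypothetical, and the
prefix range lookup assumes no term contains the character U+FFFF. It shows
why the bitset path cannot say which term matched which document, while
keeping one clause per term (as the BooleanQuery path does) still can.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class PrefixRewriteSketch {
    // Hypothetical toy inverted index: term -> ids of documents containing it.
    static final TreeMap<String, int[]> INDEX = new TreeMap<>();
    static {
        INDEX.put("bread", new int[] {0, 2});
        INDEX.put("brick", new int[] {2});
        INDEX.put("brown", new int[] {1, 2});
        INDEX.put("cat", new int[] {0, 1});
    }

    // All index terms that start with the prefix, in sorted order.
    static SortedMap<String, int[]> termsWithPrefix(String prefix) {
        return INDEX.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    // Many-terms path: mark documents term by term in one bitset.
    // Only "doc matched" survives; the matching term is not recorded.
    static BitSet matchAsBitSet(String prefix) {
        BitSet docs = new BitSet();
        for (int[] postings : termsWithPrefix(prefix).values()) {
            for (int doc : postings) {
                docs.set(doc);
            }
        }
        return docs;
    }

    // Few-terms path: each term stays a separate clause, so the terms
    // that are "on" a given document can still be listed.
    static Map<Integer, List<String>> matchKeepingTerms(String prefix) {
        Map<Integer, List<String>> byDoc = new TreeMap<>();
        for (Map.Entry<String, int[]> e : termsWithPrefix(prefix).entrySet()) {
            for (int doc : e.getValue()) {
                byDoc.computeIfAbsent(doc, d -> new ArrayList<>()).add(e.getKey());
            }
        }
        return byDoc;
    }

    public static void main(String[] args) {
        System.out.println(matchAsBitSet("br"));     // {0, 1, 2}
        System.out.println(matchKeepingTerms("br")); // {0=[bread], 1=[brown], 2=[bread, brick, brown]}
    }
}
```

In real Lucene the per-document view would come from the child scorers inside
a custom Collector, which this model does not attempt to reproduce.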

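The term-vector alternative amounts to walking one document's own sorted term
list and keeping the terms that share the prefix. A minimal sketch of that
idea, again in plain Java rather than Lucene: the sorted set stands in for a
stored term vector, and the names TermVectorPrefixSketch and prefixTerms are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class TermVectorPrefixSketch {
    // docTerms stands in for one document's term vector: its distinct
    // terms in sorted order. Returns the terms starting with the prefix.
    static List<String> prefixTerms(SortedSet<String> docTerms, String prefix) {
        List<String> matched = new ArrayList<>();
        // Seek to the first term >= prefix, then read until the prefix
        // stops matching; prefixed terms are contiguous in sorted order.
        for (String term : docTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matched.add(term);
        }
        return matched;
    }

    public static void main(String[] args) {
        SortedSet<String> doc = new TreeSet<>(
                Arrays.asList("apple", "bread", "bridge", "brown", "cat"));
        System.out.println(prefixTerms(doc, "br")); // [bread, bridge, brown]
    }
}
```

The cost Mike mentions comes from having to store and load such a term list
for every matching document, which the sketch does not model.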