Re: Highlighters, accurate highlighting, and the PostingsHighlighter

[email protected] Fri, 10 Oct 2014 07:16:33 -0700

On Fri, Oct 10, 2014 at 6:39 AM, Michael McCandless <
[email protected]> wrote:


> +1 for a "completely accurate" (each snippet shown matches the query)
> and fast highlighter, but it's a real challenge because you need a
> clean way to recursively iterate all positions for any (even
> non-positional) queries (what LUCENE-2878 will give us).  To properly
> handle your (+A +B) (+C +D) example, you'd need BooleanQuery to
> participate in enumerating the positions...
>

My plan for that is to convert each TermQuery to something similar that
gets a DocsAndPositionsEnum (with offsets) instead of a plain DocsEnum.  The
code that navigates the graph can then cast it to get what it needs.
Alternatively, I thought I might wrap the IndexReader on down with
pass-throughs that ensure you always get positions (with offsets) even when
you don’t ask for them, and keep track of each instance for retrieval later.
But I’d somehow need to map each Query to its tracked positions enumerators,
which sounds like more work, so I probably won’t go this route.
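
To make the graph-navigation idea concrete, here is a toy sketch in plain
Java.  It is not Lucene code: Node, TermNode, AllOf, AnyOf, and the
term-to-offsets map (standing in for a DocsAndPositionsEnum with offsets) are
all hypothetical names invented for illustration.  It only shows how a
boolean node could take part in enumerating offsets so that the
(+A +B) (+C +D) case highlights nothing from a group that did not match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these are Lucene classes.
interface Node {
    // Appends this node's {start, end} offsets to out and returns true if the
    // node matched the document; a non-matching node must add nothing.
    boolean collect(Map<String, List<int[]>> docOffsets, List<int[]> out);
}

final class TermNode implements Node {
    private final String term;
    TermNode(String term) { this.term = term; }

    @Override
    public boolean collect(Map<String, List<int[]>> docOffsets, List<int[]> out) {
        List<int[]> hits = docOffsets.get(term);
        if (hits == null || hits.isEmpty()) return false;
        out.addAll(hits);
        return true;
    }
}

// Every clause is required, like (+A +B).
final class AllOf implements Node {
    private final List<Node> clauses;
    AllOf(Node... clauses) { this.clauses = Arrays.asList(clauses); }

    @Override
    public boolean collect(Map<String, List<int[]>> docOffsets, List<int[]> out) {
        List<int[]> buffer = new ArrayList<>();
        for (Node clause : clauses) {
            // One missing clause means the whole group contributes nothing.
            if (!clause.collect(docOffsets, buffer)) return false;
        }
        out.addAll(buffer);
        return true;
    }
}

// Any clause suffices, like the top level of (+A +B) (+C +D).
final class AnyOf implements Node {
    private final List<Node> clauses;
    AnyOf(Node... clauses) { this.clauses = Arrays.asList(clauses); }

    @Override
    public boolean collect(Map<String, List<int[]>> docOffsets, List<int[]> out) {
        boolean matched = false;
        for (Node clause : clauses) {
            // Non-short-circuit |= so every matching clause is collected.
            matched |= clause.collect(docOffsets, out);
        }
        return matched;
    }
}

final class AccurateOffsetsDemo {
    public static void main(String[] args) {
        // The document contains A, B and C, but not D.
        Map<String, List<int[]>> doc = new HashMap<>();
        doc.put("A", Collections.singletonList(new int[] {0, 1}));
        doc.put("B", Collections.singletonList(new int[] {2, 3}));
        doc.put("C", Collections.singletonList(new int[] {4, 5}));

        Node query = new AnyOf(
            new AllOf(new TermNode("A"), new TermNode("B")),
            new AllOf(new TermNode("C"), new TermNode("D")));

        List<int[]> offsets = new ArrayList<>();
        query.collect(doc, offsets);
        for (int[] o : offsets) {
            System.out.println(o[0] + "-" + o[1]); // prints 0-1 and 2-3; C is skipped
        }
    }
}
```

The buffering in AllOf is the point of the sketch: C is present in the
document, but since D is not, the (+C +D) group adds no offsets.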

I plan to convert the Query tree to an equivalent one (for highlighting
purposes) composed of BooleanQuery, TermQuery (actually a similar custom
class), MultiTermQuery (again, a custom variant), and SpanQuery nodes;
phrase queries get converted to SpanQueries.
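
A minimal sketch of that kind of tree conversion, again in plain Java rather
than against the Lucene API.  Every type here (Q, Term, Phrase, Bool,
OffsetTerm, OrderedNear, HighlightRewriter) is a hypothetical stand-in:
OffsetTerm plays the role of the custom TermQuery-like class, and OrderedNear
plays the role of an in-order, zero-slop span query that a phrase would be
converted to.  Multi-term queries are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical query model for illustration; these are not Lucene classes.
interface Q {}

// Input node types.
final class Term implements Q {
    final String text;
    Term(String text) { this.text = text; }
}

final class Phrase implements Q {
    final List<String> terms;
    Phrase(String... terms) { this.terms = Arrays.asList(terms); }
}

final class Bool implements Q {
    final List<Q> clauses;
    Bool(List<Q> clauses) { this.clauses = clauses; }
    @Override public String toString() { return "bool" + clauses; }
}

// Output node types used only by the highlighter.
final class OffsetTerm implements Q {
    final String text;
    OffsetTerm(String text) { this.text = text; }
    @Override public String toString() { return "term(" + text + ")"; }
}

final class OrderedNear implements Q {
    final List<Q> parts;
    OrderedNear(List<Q> parts) { this.parts = parts; }
    @Override public String toString() { return "near" + parts; }
}

final class HighlightRewriter {
    // Recursively rebuilds the tree out of highlighter-friendly node types.
    static Q rewrite(Q q) {
        if (q instanceof Term) {
            return new OffsetTerm(((Term) q).text);
        }
        if (q instanceof Phrase) {
            // A phrase becomes an ordered "near" over its terms.
            List<Q> parts = new ArrayList<>();
            for (String t : ((Phrase) q).terms) {
                parts.add(new OffsetTerm(t));
            }
            return new OrderedNear(parts);
        }
        if (q instanceof Bool) {
            List<Q> converted = new ArrayList<>();
            for (Q clause : ((Bool) q).clauses) {
                converted.add(rewrite(clause));
            }
            return new Bool(converted);
        }
        throw new IllegalArgumentException(
            "unsupported query type: " + q.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        Q original = new Bool(Arrays.asList(new Term("a"), new Phrase("b", "c")));
        // prints bool[term(a), near[term(b), term(c)]]
        System.out.println(rewrite(original));
    }
}
```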

~ David
