Hi Chris,

While this is theoretically possible, it would require rewriting every
query that you might want to run, so it would be a huge investment.

In general, doing something like that is a bad idea, since it requires
computing highlights for many documents that may never make it into the
top-k hits.
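
To make the cost argument concrete, here is a toy sketch in plain Java. It
is not Lucene code: the class, the method names, and the counting
"highlighter" are all made up for illustration, and it assumes that
highlighting a document is the expensive step. It only counts how often that
step would run in each model.

```java
import java.util.PriorityQueue;

// Toy model only, NOT Lucene code: every name here is hypothetical.
final class HighlightCostSketch {

    /** Stand-in for an expensive highlighting step; it only counts calls. */
    static final class Highlighter {
        int calls = 0;

        String highlight(int docId) {
            calls++;
            return "snippet for doc " + docId;
        }
    }

    /** Returns the ids of the k best-scoring docs, best first (doc id = array index). */
    static int[] topK(double[] scores, int k) {
        // Min-heap ordered by score, so the weakest retained hit is evicted first.
        PriorityQueue<Integer> heap =
                new PriorityQueue<>((a, b) -> Double.compare(scores[a], scores[b]));
        for (int doc = 0; doc < scores.length; doc++) {
            heap.add(doc);
            if (heap.size() > k) {
                heap.poll();
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    /** Highlight while collecting: every matching doc pays the highlighting cost. */
    static int highlightDuringSearch(double[] scores, int k) {
        Highlighter highlighter = new Highlighter();
        for (int doc = 0; doc < scores.length; doc++) {
            // This work happens before we know whether the doc is competitive.
            highlighter.highlight(doc);
        }
        topK(scores, k);
        return highlighter.calls;
    }

    /** Search first, then highlight: only the k retained hits pay the cost. */
    static int searchThenHighlight(double[] scores, int k) {
        Highlighter highlighter = new Highlighter();
        for (int doc : topK(scores, k)) {
            highlighter.highlight(doc);
        }
        return highlighter.calls;
    }

    public static void main(String[] args) {
        // 10,000 matching docs with arbitrary made-up scores, top 10 requested.
        double[] scores = new double[10_000];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = (i * 31 % 97) / 97.0;
        }
        System.out.println("during search: " + highlightDuringSearch(scores, 10));
        System.out.println("after search:  " + searchThenHighlight(scores, 10));
    }
}
```

With 10,000 matching documents and k = 10, the first model runs the
highlight step 10,000 times and the second runs it 10 times. How much that
matters in practice depends on how expensive highlighting is relative to
matching, which this sketch does not try to measure.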

On Thu, Nov 4, 2021 at 5:44 PM Hahn, Christopher (TR Technology) <
[email protected]> wrote:

> Hello Lucene Developers,
>
> We’re working on a search service which uses Lucene indexes.  One of the
> things I’m hoping to find is different places where we can plug in our
> custom classes during the search process.
>
> This first use case is for highlighting. The legacy search engine we use
> collects all term positions for highlighting during the search process. So
> everything happens all at once instead of the
> search-first-then-highlight model.  For how we use highlighting, this is
> more efficient for us, instead of reprocessing the query.
>
> One thought I had was creating a custom scorer that would be called during
> search, and it would gather highlights in addition to scoring. I think this
> would be especially useful for proximity queries, or any other scoring
> based on positions of words in the document.  Instead of advancing the term
> vectors and finding phrases in a document at search time, and then doing it
> AGAIN at highlight time, it would help if there were a way to access the
> data used by the search process.
>
> Any suggestions, comments, or references that would enlighten me would be
> appreciated. I’ve had great difficulty finding helpful documents as I get
> to know Lucene.
>
> Thanks,
>
> Chris Hahn


-- 
Adrien
