Hello!

I worked on the UnifiedHighlighter a lot and want to help you!

On Mon, Jan 11, 2021 at 9:58 AM Shaun Campbell <campbell.sh...@gmail.com>
wrote:

> I've been using highlighting for a while, using the original highlighter,
> and just come across a problem with fields that contain a large amount of
> text, approx 250k characters. I only have about 2,000 records but each one
> contains a journal publication to search through.
>
> What I noticed is that some records didn't return a highlight even though
> they matched on the content. I noticed the hl.maxAnalyzedChars parameter
> and increased that, but  it allowed some records to be highlighted, but not
> all, and then it caused memory problems on the server.  Performance is also
> very poor.
>

I've been thinking hl.maxAnalyzedChars should maybe default to no limit --
it's a performance threshold, but it's perhaps better to opt in to such a
limit than to scratch your head for a long time wondering why a search
result isn't showing highlights.


> To try to fix this I've tried  to configure the unified highlighter in my
> solrconfig.xml instead.   It seems to be working but again I'm missing some
> highlighted records.
>

There is no configuration of that highlighter in solrconfig.xml; it's
driven entirely by request parameters at runtime.
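
As a rough sketch of what "parameter driven" can look like in practice, the
Python snippet below builds a select URL that asks for the unified
highlighter and raises hl.maxAnalyzedChars for that one request. The host,
the core name "journals", the field name "content", the query, and the
500000 limit are all assumptions made up for illustration; substitute your
own values.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, field, max_chars):
    """Return a Solr select URL that requests unified-highlighter snippets."""
    params = {
        "q": query,
        "hl": "true",                 # turn highlighting on
        "hl.method": "unified",       # select the unified highlighter
        "hl.fl": field,               # field(s) to highlight
        "hl.maxAnalyzedChars": str(max_chars),  # per-request analysis limit
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core name, field name, query, and limit.
url = build_highlight_query(
    "http://localhost:8983/solr/journals", "content:enzyme", "content", 500000
)
print(url)
```

Because these are ordinary request parameters, changing them takes effect on
the next query; no restart or reindex is involved.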


> The other thing is I've tried to adjust my unified highlighting settings in
> solrconfig.xml and they don't  seem to be having any effect even after
> restarting Solr.  I was just wondering whether there is any highlighting
> information stored at index time. It's taking over 4hours to index my
> records so it's not easy to keep reindexing my content.
>
> Any ideas on how to handle highlighting of large content  would be
> appreciated.
>
> Shaun
>

Please read the documentation here thoroughly:
https://lucene.apache.org/solr/guide/8_6/highlighting.html#the-unified-highlighter
(or earlier version as applicable)

Since you have large bodies of text to highlight, you would strongly
benefit from putting offsets into the search index (and re-indexing) --
storeOffsetsWithPositions.  That's an option on the field/fieldType in your
schema; it may not be obvious from reading the docs.  You have to opt in to
that; Solr doesn't normally store any info in the index for highlighting.
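
For illustration only, here is one way the opt-in might be expressed if the
collection uses a managed schema: a Python sketch that builds a Schema API
"replace-field" command with storeOffsetsWithPositions enabled. The field
name "content" and the type "text_general" are assumptions, not taken from
the original setup. With a hand-edited schema file, the same property would
be set as an attribute on the field or fieldType instead. Either way, a
full reindex is needed before the offsets exist in the index.

```python
import json


def replace_field_command(name, field_type):
    """Build a Schema API command that redefines a field with offsets stored."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            # Store term offsets alongside positions in the postings.
            "storeOffsetsWithPositions": True,
        }
    }


# Hypothetical field name and field type.
payload = json.dumps(replace_field_command("content", "text_general"))
print(payload)
```

The resulting JSON would be sent as the body of a POST to the collection's
schema endpoint.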

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
