On Tue, Jan 12, 2021 at 9:39 AM Shaun Campbell <campbell.sh...@gmail.com>
wrote:

> Hi David
>
> First of all I wanted to say I'm working off your book!!  Third edition,
> and I think it's a bit out of date now. I was just going to try following
> the section on the Postings highlighter, but I see that's been absorbed
> into the Unified highlighter. I find your book easier to follow than the
> official documentation though.
>

Thanks :-D.  I do maintain the Solr Reference Guide for the parts of the
code I touch, including highlighting, so I hope what's there makes sense too.


> I am going to try to configure the unified highlighter, and I will add that
> storeOffsetsWithPositions to the schema (which I saw in your book) and I
> will try indexing again from scratch.  Was getting some funny things going
> on where I thought I'd turned highlighting off and it was still giving me
> highlights.
>
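
If your schema is managed, one way to add that property without hand-editing
XML is the Schema API. Below is a minimal sketch, assuming a hypothetical field
named `content` of type `text_general` (neither name comes from your setup);
it only builds the JSON command and does not send it:

```python
import json


def offsets_field_command(name, field_type):
    """Build a Schema API 'replace-field' command that turns on offsets.

    `name` and `field_type` are placeholders for your own field definition.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            # Store offsets in the postings so the highlighter can read them.
            "storeOffsetsWithPositions": True,
        }
    }


payload = json.dumps(offsets_field_command("content", "text_general"), indent=2)
print(payload)
```

The JSON would be POSTed to the collection's /schema endpoint, and the
documents still need a full re-index afterwards for the offsets to be written.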

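Use hl=true/false to switch highlighting on or off.  A value sent with the
request overrides an hl default set on the request handler.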
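
As a rough illustration of the per-request toggle, here is a minimal Python
sketch that builds query strings with hl set explicitly. The helper name and
the field name `content` are assumptions for illustration only:

```python
from urllib.parse import urlencode


def select_query(q, highlight, hl_field="content"):
    """Return a query string for a search request with hl set explicitly.

    `hl_field` is a hypothetical field name; substitute your own.
    """
    params = {"q": q, "hl": "true" if highlight else "false"}
    if highlight:
        # hl.method picks the highlighter; hl.fl lists the fields to highlight.
        params["hl.method"] = "unified"
        params["hl.fl"] = hl_field
    return urlencode(params)


print(select_query("gene therapy", highlight=True))
print(select_query("gene therapy", highlight=False))
```

Sending hl=false explicitly is a quick way to check whether unexpected
highlights are coming from a default configured elsewhere.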


> Actually just re-reading your email again, are you saying that you can't
> configure highlighting in solrconfig.xml? That's where I always configure
> original highlighting in my dismax search handler. Am I supposed to add
> highlighting to each request?
>

You can set highlighting and other *parameters* in solrconfig.xml for
request handlers.  But the dedicated <highlighting> plugin info is only for
the original and Fast Vector Highlighters.
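
To make the distinction concrete: ordinary hl.* parameters placed in a
handler's defaults behave like any other request parameter, and a value sent
with the request takes precedence. The sketch below only models that
precedence in plain Python; it is not Solr code, and the parameter values are
assumed for illustration:

```python
# Hypothetical values standing in for a handler's <lst name="defaults"> block.
HANDLER_DEFAULTS = {
    "defType": "dismax",
    "hl": "true",
    "hl.method": "unified",
    "hl.fl": "content",
}


def effective_params(request_params, defaults=None):
    """Model how request parameters override a handler's defaults."""
    merged = dict(HANDLER_DEFAULTS if defaults is None else defaults)
    merged.update(request_params)  # request values win over defaults
    return merged


print(effective_params({"q": "gene therapy"}))
print(effective_params({"q": "gene therapy", "hl": "false"}))
```

Parameters that must never be overridden by a request would go under the
handler's invariants rather than its defaults.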

~ David


>
> Thanks
> Shaun
>
> On Mon, 11 Jan 2021 at 20:57, David Smiley <dsmi...@apache.org> wrote:
>
> > Hello!
> >
> > I worked on the UnifiedHighlighter a lot and want to help you!
> >
> > On Mon, Jan 11, 2021 at 9:58 AM Shaun Campbell <campbell.sh...@gmail.com>
> > wrote:
> >
> > > I've been using highlighting for a while, using the original
> > > highlighter, and have just come across a problem with fields that
> > > contain a large amount of text, approx. 250k characters. I only have
> > > about 2,000 records but each one contains a journal publication to
> > > search through.
> > >
> > > What I noticed is that some records didn't return a highlight even
> > > though they matched on the content. I noticed the hl.maxAnalyzedChars
> > > parameter and increased it, which allowed some records to be
> > > highlighted, but not all, and then it caused memory problems on the
> > > server.  Performance is also very poor.
> > >
> >
> > I've been thinking hl.maxAnalyzedChars should maybe default to no limit --
> > it's a performance threshold, but it's perhaps better to opt in to such a
> > limit than to scratch your head for a long time wondering why a search
> > result isn't showing highlights.
> >
> >
> > > To try to fix this I've tried to configure the unified highlighter in
> > > my solrconfig.xml instead. It seems to be working, but again I'm
> > > missing some highlighted records.
> > >
> >
> > There is no configuration of that highlighter in solrconfig.xml; it's
> > entirely parameter driven (runtime).
> >
> >
> > > The other thing is I've tried to adjust my unified highlighting
> > > settings in solrconfig.xml and they don't seem to be having any effect
> > > even after restarting Solr.  I was just wondering whether there is any
> > > highlighting information stored at index time. It's taking over 4 hours
> > > to index my records so it's not easy to keep reindexing my content.
> > >
> > > Any ideas on how to handle highlighting of large content would be
> > > appreciated.
> > >
> > > Shaun
> > >
> >
> > Please read the documentation here thoroughly:
> >
> >
> > https://lucene.apache.org/solr/guide/8_6/highlighting.html#the-unified-highlighter
> > (or earlier version as applicable)
> > Since you have large bodies of text to highlight, you would strongly
> > benefit from putting offsets into the search index (and re-index) --
> > storeOffsetsWithPositions.  That's an option on the field/fieldType in
> > your schema; it may not be obvious reading the docs.  You have to opt in
> > to that; Solr doesn't normally store any info in the index for
> > highlighting.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
>
