Here's my PR, which includes some edits to the ref guide docs where I tried to clarify these settings a little too.
https://github.com/apache/lucene-solr/pull/1651

~ David

On Sat, Jul 4, 2020 at 8:44 AM Nándor Mátravölgyi <nandor.ma...@gmail.com> wrote:

> I guess that's fair. Let's have hl.fragsizeIsMinimum=true as default.
>
> On 7/4/20, David Smiley <david.w.smi...@gmail.com> wrote:
> > I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms of
> > quality of the highlight since there are vastly more breaks to pick from.
> > I think that setting is more useful in SENTENCE mode if you can stand the
> > perf hit. If you agree, then why not just let this one default to "true"?
> >
> > We agree on better documenting the perf trade-off.
> >
> > Thanks again for working on these settings, BTW.
> >
> > ~ David
> >
> >
> > On Fri, Jul 3, 2020 at 1:25 PM Nándor Mátravölgyi <nandor.ma...@gmail.com>
> > wrote:
> >
> >> Since the issue seems to be affecting the highlighter differently
> >> based on which mode it is using, having different defaults for the
> >> modes could be explored.
> >>
> >> WORD may have the new defaults as it has little effect on performance
> >> and it creates nicer highlights.
> >> SENTENCE should have the defaults that produce reasonable performance.
> >> The docs could document this while also mentioning that the UH's
> >> performance is highly dependent on the underlying Java String/Text?
> >> Iterator.
> >>
> >> One can argue that having different defaults based on mode is
> >> confusing. In this case I think the defaults should be changed to have
> >> the SENTENCE mode perform better. Maybe the options for nice
> >> highlights with WORD mode could be put into the docs in this case as
> >> some form of an example.
> >>
> >> As long as I can use the UH with nicely aligned snippets in WORD mode
> >> I'm fine with any defaults. I explicitly set them in the config and in
> >> the queries most of the time anyways.