Yonik Seeley wrote:
Highlighter stuff:
- allow specification of markup
- allow fragsize per-field
- keep in mind recent highlighter work going on in Lucene... we
should try and specify what instead of how (not use exact class names,
etc)
- start using "hl" namespace for highlighter params... this is just a
convention to help clarify the semantics of a parameter at a glance.
  - for consistency, should "highlight" => "hl", "highlightFields" =>
"hl.fields" or "hl.fl", "maxSnippets" => "hl.snippets"?
   Normally backward compatibility is very important for the external
interfaces, *but* things will change while a feature is in
development... every commit does not constitute a release. Is
highlighting new enough that we can change these parameters? Is anyone
using these parameters in production where it would be a burden if we
changed these?


I think the introduction of a bunch of new highlighter parameters is as good a time as any to break backwards compatibility, even if we're keeping the same default behaviour - but we're not using Solr in production yet, so it's easy for me to say that.

Examples of potential highlighter param names:
hl=true
hl.fl=name,title,body
hl.snippets=4
hl.fragsize=100
hl.formatter=simple
hl.simple.pre=<em>
hl.simple.post=</em>

And per field params:
f.title.hl.fragsize=0 // overrides fragsize only for field 'title'
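
To make the per-field override concrete, here is a rough sketch of how a lookup could work: check "f.<field>.<param>" first, then fall back to the global "<param>". This is only an illustration of the proposed naming convention, written in Python for brevity; field_param is a hypothetical helper and not anything that exists in Solr or SolrPluginUtils.

```python
from typing import Mapping, Optional


def field_param(params: Mapping[str, str], field: str, name: str,
                default: Optional[str] = None) -> Optional[str]:
    """Look up `name` for `field`, preferring the per-field override.

    Hypothetical helper, not Solr code: it checks "f.<field>.<name>" first,
    then the global "<name>", then falls back to `default`.
    """
    override = params.get(f"f.{field}.{name}")
    if override is not None:
        return override
    return params.get(name, default)


# Request parameters using the names proposed in this thread.
params = {
    "hl": "true",
    "hl.fl": "name,title,body",
    "hl.snippets": "4",
    "hl.fragsize": "100",
    "f.title.hl.fragsize": "0",
}

# Print the effective fragsize for each highlighted field.
for field in params["hl.fl"].split(","):
    print(field, field_param(params, field, "hl.fragsize"))
```

With these parameters only 'title' picks up the override; 'name' and 'body' keep the global hl.fragsize.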

These all sound good.

I was going to try and rewrite my highlighting patch to use the new facilities. I can also try and update SolrPluginUtils at the same time, as I'll need to make changes to that as well.

I'm hoping that the Lucene highlighter can be updated soon to pick up the fix for http://issues.apache.org/jira/browse/LUCENE-645 (although it sounds like there might be further changes ahead). I've currently got a bunch of code that tries to handle this wandering punctuation in the highlights.

-Andrew
