On 7/21/06, Andrew May (JIRA) <[EMAIL PROTECTED]> wrote:
As discussed in the mailing list, I've been looking at adding additional
configuration options for highlighting.
I've made quite a few changes to the properties for highlighting:
Cool -- this is definately useful.
Properties that can be set on request, or in solrconfig.xml at the top level:
highlight (true/false)
highlightFields
Properties that can be set in solrconfig.xml at the top level or per-field
formatter (simple/gradient)
formatterPre (preTag for simple formatter)
formatterPost (postTag for simple formatter)
formatterMinFgCl (min foreground colour for gradient formatter)
formatterMaxFgCl (max foreground colour for gradient formatter)
formatterMinBgCl (min background colour for gradient formatter)
formatterMaxBgCl (max background colour for gradient formatter)
fragsize (if <=0 use NullFragmenter, otherwise use GapFragmenter with this
value)
I agree with Yonik that the number of request handler parameters has
gotten a bit too large for a flat namespace (the tipping point was
probably the introduction of three highlighting parameters--so you can
blame me <g>).
I've created HighlightingUtils to handle most of the parameter parsing, but the
hightlighting is still done in SolrPluginUtils and the doStandardHighlighting()
method still has the same signature, but the other highlighting methods have
had to be changed (because highlighters are now created per highlighted field).
Highlighting has definately grown to the point where it should be in
its own class, and I could see the justification for giving it its own
package, given there there are now at least three supporting classes
for Solr, and this could easily grow.
I'm not particularly happy with the code to pull parameters from CommonParams,
first checking the field then falling back, e.g.:
String pre = (params.fields.containsKey(fieldName) &&
params.fields.get(fieldName).formatterPre != null) ?
params.fields.get(fieldName).formatterPre :
params.formatterPre != null ? params.formatterPre : "<em>";
Parameter handling is a little clunky as it is, as CommonParams was
added after general parameter handling, so it wasn't as tightly
integrated as it should be.
Currently:
(static) CommonParams contains parameter defaults
instance of CommonParams contains the defaults for a given request handler
instance of SolrQueryRequest contains passed-in query parameters
(static) method of SolrPluginUtils can be used to retrieve
parameters, as follows:
SolrPluginUtils.getParam(req,"param", commomParams.<defaultValue>)
This isn't ideal, and as you've demonstrated, an absolute disaster
once you try to tack on per-field parameters. There could be some
sort of CommonParams-like class which parses the default parameters,
like it does currently (but adding logic for per-field overrides), but
having an addition method which accepts a SolrQueryRequest and adds
any per-query overrides, finally outputting an instance of some
Parameter class which can be queried to determine the global and
per-field settings in one central location.
Designing and implementing this carefully will be time-consuming but
important, I think. Without it, it is more difficult to accept useful
contributions like yours.
I've removed support for a custom formatter - just choosing between
simple/gradient. Probably that's a bad decision, but I wanted an easy way to
choose between the standard formatters without having to invent a generic way
of supplying arguments for the constructor. Perhaps there should be
formatterType=simple/gradient and formatterClass=... which overrides
formatterType if set at a lower level - with the formatterClass having to have
a zero-args constructor? Note: gradient is actually SpanGradientFormatter.
It isn't worth supporting formatterClass without making it more
general. I'd like to see an extension which allows the specification
of a class and an array of constructor args.
ex.
<class name="formatter"
classname="org.apache.lucene.search.highlighting.SimpleHTMLFormatter"
/>
<class name="formatter"
classname="org.apache.lucene.search.highlighting.SimpleHTMLFormatter">
<arr>
<str><strong></str>
<str>&tl;/string></str>
<arr>
</class>
I'm not sure I properly understand how Fragmenters work, so supplying fragsize to
GapFragmenter where >0 (instead of what was a default of 50) may not make sense.
Passing to the constructor makes sense, as it sets the fragmentSize of
the base SimpleFragmenter class (note the default is 100; 50 is the
position gap which indicates when to start a new fragment).
Hopefully I'll have time in a few days to look at the patch and give
you some feedback.
Thanks,
-Mike