Erick Erickson created SOLR-12136:
-------------------------------------
Summary: Document hl.q parameter
Key: SOLR-12136
URL: https://issues.apache.org/jira/browse/SOLR-12136
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: documentation
Reporter: Erick Erickson
Assignee: Erick Erickson
*********Original issue:
If I specify:
hl.fl=f1&hl.q=something
then "something" is analyzed against the default field (df) rather than f1.
So in this particular case, f1 did some diacritic folding
(GermanNormalizationFilterFactory specifically). But my guess is that
the df was still "text", or at least something that didn't reference
that filter.
In what follows, "worked" means getting highlighting on "Kündigung".
Because of the folding, Kündigung was indexed as Kundigung.
So far so good. Now, if I try to highlight on f1, these work:
q=f1:Kündigung&hl.fl=f1
q=f1:Kündigung&hl.fl=f1&hl.q=Kundigung <= NOTE, without umlaut
q=f1:Kündigung&hl.fl=f1&hl.q=f1:Kündigung <= NOTE, with umlaut
This does not work:
q=f1:Kündigung&hl.fl=f1&hl.q=Kündigung <= NOTE, with umlaut
Testing this locally, I'd get the highlighting if I defined df as "f1"
in all the above cases.
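
The cases above can be reproduced by building the request URLs programmatically. The following is only a sketch: the endpoint (host, port, and the core name "mycore") is hypothetical, and hl=true is added because highlighting has to be switched on, although the report does not show it. It only builds and prints URLs; it does not contact Solr.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

def highlight_url(hl_q=None, df=None):
    """Build a select URL that searches f1 and highlights on f1."""
    params = {
        "q": "f1:Kündigung",
        "hl": "true",    # enables highlighting; not shown in the report
        "hl.fl": "f1",
    }
    if hl_q is not None:
        params["hl.q"] = hl_q
    if df is not None:
        params["df"] = df
    # urlencode percent-encodes non-ASCII characters as UTF-8
    return SOLR_SELECT + "?" + urlencode(params)

urls = {
    "no hl.q": highlight_url(),
    "hl.q pre-folded": highlight_url(hl_q="Kundigung"),
    "hl.q field-qualified": highlight_url(hl_q="f1:Kündigung"),
    "hl.q bare, with umlaut": highlight_url(hl_q="Kündigung"),
    "hl.q bare, df=f1": highlight_url(hl_q="Kündigung", df="f1"),
}

for label, url in urls.items():
    print(f"{label}: {url}")
```

According to the report, only the "hl.q bare, with umlaut" request fails to highlight, and adding df=f1 makes it work.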
**********David Smiley's analysis
BTW hl.q is parsed by the hl.qparser param which defaults to the defType param
which defaults to "lucene".
In common cases, I think this is a non-issue. One common case is
defType=edismax and you specify a list of fields in 'qf' (thus your query has
parts parsed on various fields) and then you set hl.fl to some subset of those
fields. This will use the correct analysis.
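
A client could sanity-check that "subset" condition before sending a request. This is a rough sketch, not anything Solr provides: the helper names and the field names (title, body, summary) are made up for illustration, and wildcard entries in hl.fl are not handled.

```python
import re

def field_names(spec):
    """Split a qf or hl.fl value on commas/whitespace and drop any ^boost."""
    tokens = re.split(r"[,\s]+", spec.strip())
    return [tok.split("^", 1)[0] for tok in tokens if tok]

def uncovered_highlight_fields(qf, hl_fl):
    """Return the hl.fl fields that the query was not parsed against."""
    searched = set(field_names(qf))
    return [f for f in field_names(hl_fl) if f not in searched]

# Hypothetical field names, for illustration only.
print(uncovered_highlight_fields("title^2 body", "title"))
print(uncovered_highlight_fields("title^2 body", "title,summary"))
```

An empty result means every highlighted field is also in qf, which is the common case described above. Any field that is returned is one whose analysis may not match how the query was parsed.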
You make a compelling point in terms of what a user might expect -- my gut
reaction aligned with your expectation and I thought maybe we should change
this. But it's not as easy as it seems at first blush, and there are bad
performance implications. How do you *generically* tell an arbitrary query
parser which field it should parse the string with? We have no such standard.
And let's say we did; then we'd have to re-parse the query string for each field
in hl.fl (and consider hl.fl might be a wildcard!). Perhaps both are solvable or
constrainable with yet more parameters, but I'm pessimistic it'll be a better
outcome.
The documentation ought to clarify this matter, probably under hl.fl: it should
say that the fields listed are analyzed with the analysis of their own field
type, and that this analysis ought to be "compatible" with (the same as or
similar to) the analysis used when the query was parsed.
Perhaps, like spellcheck's spellcheck.collateParam.* param prefix, highlighting
could add a means to specify additional parameters for parsing hl.q (not
just the choice of query parser). This isn't particularly pressing, though,
since local params can easily be added to the front of hl.q, like
hl.q={!edismax qf=$hl.fl v=$q}
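
As a sketch of that workaround, the local-params prefix can be assembled on the client and sent as hl.q. The helper name below is hypothetical, and using hl.fl as qf assumes hl.fl is a plain field list (here a single field) rather than a wildcard.

```python
from urllib.parse import urlencode

def hl_q_with_local_params(parser="edismax", qf_ref="hl.fl", v_ref="q"):
    """Build an hl.q value that re-parses $q against the fields in $hl.fl."""
    return "{!" + parser + " qf=$" + qf_ref + " v=$" + v_ref + "}"

params = {
    "q": "f1:Kündigung",
    "hl": "true",      # enables highlighting
    "hl.fl": "f1",
    "hl.q": hl_q_with_local_params(),
}

# Percent-encode for use in a request URL.
query_string = urlencode(params)
print(params["hl.q"])
print(query_string)
```

The $hl.fl and $q references are resolved by Solr at request time, so the client never has to repeat the field list or the query text.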
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)