[ https://issues.apache.org/jira/browse/SOLR-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858556#comment-16858556 ]
Jan Høydahl commented on SOLR-13367:
------------------------------------
Moved large section from description to this comment:
***************************************************************************
First, a correctly working example of a range query using Solr v5.1.0 which
produces useful results:
{code:javascript}
{
"responseHeader": {
"status": 0,
"QTime": 366,
"params":
{ "q": "MyStringField:[A TO B}",
"hl": "true",
"indent": "true",
"hl.preserveMulti": "true",
"fl": "MyStringField,MyUniqueID",
"hl.requireFieldMatch": "true",
"hl.usePhraseHighlighter": "true",
"hl.fl": "MyStringField",
"wt": "json",
"hl.highlightMultiTerm": "true",
"_": "1553275722025"
}
},
"response": {
"numFound": 999,
"start": 0,
"docs": [
{ "MyStringField": [ "Stanley, Wendell M.", "Avery, Roy" ], "MyUniqueID":
"UniqueID1" }
,
{ "MyStringField": [ "Avery, Roy" ], "MyUniqueID": "UniqueID2" }
,
*
**
*** lots more docs correctly found
]
},
*** we get to the highlighting portion of the response
*** this indicates which values of each MyStringField
*** that actually matched the query
"highlighting": {
"UniqueID1":
{ "MyStringField": [ "<em>Avery, Roy</em>" ] }
,
"UniqueID2":
{ "MyStringField": [ "<em>Avery, Roy</em>" ] }
,
"UniqueID3":
{ "MyStringField": [ "<em>American Institute of Biological Sciences</em>",
"<em>Albritton, Errett C.</em>" ] }
,
... etc.
*
**
*** lots more useful highlight values. Note the two matching values
*** for document UniqueID3.
}
{code}
***************************************************************************
* THE PROBLEM
* Now using newer versions of Solr
***************************************************************************
Using the exact same parameters with Solr v7.5.0 or v7.7.1, the top portion of
the response is essentially the same, including the number of documents found:
{code:javascript}
{
"responseHeader":{
"status":0,
"QTime":245,
"params":
{ "q":"MyStringField:[A TO B}",
"hl":"on",
"hl.preserveMulti":"true",
"fl":"MyUniqueID, MyStringField",
"hl.requireFieldMatch":"true",
"hl.fl":"MyStringField",
"hightlightMultiTerm":"true",
"wt":"json",
"_":"1553105129887",
"usePhraseHighLighter":"true"}},
"response":{"numFound":999,"start":0,"docs":[
*
**
*** The problem is with the highlighting portion of the results, which is
*** effectively empty.
*** There is no way to know which values in each document actually matched
*** the query:
"highlighting":{
"UniqueID1":{},
"UniqueID2":{},
"UniqueID3":{},
... etc.
{code}
*
**
*** NOTE: The source data is the same for all of the tested Solr versions and
the Solr indexes
*** were properly rebuilt for each Solr version.
***************************************************************************
After changing the request to use the "unified" highlighter ("hl.method=unified"),
the highlighting looks like:
{code:javascript}
"highlighting":{
"UniqueID1":
{ "MyStringField":[]}
,
"UniqueID2":
{ "MyStringField":[]}
,
"UniqueID3":
{ "MyStringField":[]}
,
... etc.
{code}
*
**
*** The highlighting now properly lists the matching field but still no useful
values are listed.
***************************************************************************
NOTE: If I change the query from a Range clause to a Wildcard query,
q="MyStringField:A*", the highlighting is correct in both Solr v7.5.0 and
v7.7.1. These are GOOD results!
{code:javascript}
"highlighting":{
"UniqueID1":
{ "MyStringField": ["<em>Avery, Roy</em>"]}
,
"UniqueID2":
{ "MyStringField": ["<em>Avery, Roy</em>"]}
,
"UniqueID3":
{ "MyStringField": [ "<em>American Institute of Biological Sciences</em>",
"<em>Albritton, Errett C.</em>" ] }
,
... etc.
{code}
*
**
*** This makes me think there is some problem with the way a Range query
*** feeds the search results to the Solr Highlighter code.
***************************************************************************
None of my attempts to vary the hl parameters or any other query parameters
solved the problem.
The wildcard query is my current workaround, but the problem with range
queries remains.
In summary, there is some incompatibility among:
1) A multi-valued string field AND
2) A range query against that field AND
3) The result Highlighting. It is effectively empty.
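Until range-query highlighting works again, the matching values can in principle be recovered on the client from the stored field values that the response already returns in "docs". The following is only a minimal Python sketch, not part of Solr: the function name values_in_range is hypothetical, and it assumes that comparing the UTF-8 encodings of the values gives the same order Solr uses for terms of a solr.StrField.

```python
def values_in_range(values, lower, upper, include_lower=True, include_upper=False):
    """Return the values of a multi-valued string field that fall inside a range.

    The defaults mirror the query [A TO B}: lower bound inclusive, upper bound
    exclusive. Assumption: byte-wise comparison of the UTF-8 encodings matches
    the term order used by the index.
    """
    lo, hi = lower.encode("utf-8"), upper.encode("utf-8")
    matches = []
    for value in values:
        key = value.encode("utf-8")
        above = key >= lo if include_lower else key > lo
        below = key <= hi if include_upper else key < hi
        if above and below:
            matches.append(value)
    return matches


# A document shaped like the ones in the "docs" section of the response above.
doc = {"MyStringField": ["Stanley, Wendell M.", "Avery, Roy"], "MyUniqueID": "UniqueID1"}
print(values_in_range(doc["MyStringField"], "A", "B"))
```

For UniqueID1 this keeps only "Avery, Roy", which is the value the v5.1.0 highlighting reported. It is a stopgap for the client side and does nothing about the highlighter itself.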
I don't know when this issue was first introduced. I recently upgraded from
5.1.0 to 7.5.0 in one big leap. I attempted to read through the change logs
for the intervening versions but gave up to save my sanity.
You should be able to reproduce this issue using any multi-valued, indexed and
stored string field.
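For anyone reproducing this outside the Admin UI, the three requests compared above (range query, range query with the unified highlighter, wildcard query) could be built roughly as follows. This is a sketch under assumptions: the base URL http://localhost:8983/solr/mycore is a placeholder, the helper name build_select_url is hypothetical, and the parameters are the ones echoed in the v5.1.0 response above. The code only builds the URLs; fetching them requires a running Solr with the schema from the issue description.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, hl_method=None):
    """Build a /select URL with the highlighting parameters from the report."""
    params = {
        "q": query,
        "fl": "MyStringField,MyUniqueID",
        "wt": "json",
        "hl": "true",
        "hl.fl": "MyStringField",
        "hl.preserveMulti": "true",
        "hl.requireFieldMatch": "true",
        "hl.usePhraseHighlighter": "true",
        "hl.highlightMultiTerm": "true",
    }
    if hl_method is not None:
        # Optional: select a specific highlighter, e.g. "unified".
        params["hl.method"] = hl_method
    return base_url.rstrip("/") + "/select?" + urlencode(params)


BASE = "http://localhost:8983/solr/mycore"  # placeholder host and core name

print(build_select_url(BASE, "MyStringField:[A TO B}"))
print(build_select_url(BASE, "MyStringField:[A TO B}", hl_method="unified"))
print(build_select_url(BASE, "MyStringField:A*"))
```

Comparing the "highlighting" section of the three responses should show whether the range query still returns empty entries while the wildcard query does not.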
> Highlighting fails for Range queries on Multi-valued String fields
> ------------------------------------------------------------------
>
> Key: SOLR-13367
> URL: https://issues.apache.org/jira/browse/SOLR-13367
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 7.5, 7.7.1
> Environment: RedHat Linux v7
> Java 1.8.0_201
> Reporter: Karl Wolf
> Priority: Major
> Fix For: 5.1
>
>
> Range queries against multi-valued string fields produce useless
> highlighting, even though "hl.highlightMultiTerm":"true"
> I have uncovered what I believe is a bug. At the very least it is a
> difference in behavior between Solr v5.1.0 and v7.5.0 (and v7.7.1).
> I have a multi-valued string Field defined in my schema as:
> <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
> <field name="MyStringField" type="string" indexed="true" stored="true"
> multiValued="true" />
> I am using a query containing a Range clause and I am using highlighting to
> get the list of values that actually matched the range query.
> All examples below were using the appropriate Solr Admin Server SolrCore
> Query page.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)