Zaccheo Bagnati created SOLR-8537:
-------------------------------------
Summary: phrase highlighter doesn't work when searching for phrase
containing some stopwords
Key: SOLR-8537
URL: https://issues.apache.org/jira/browse/SOLR-8537
Project: Solr
Issue Type: Bug
Components: highlighter
Affects Versions: 4.10.4
Reporter: Zaccheo Bagnati
Priority: Minor
When executing a phrase search containing 3 or more stopwords, the
highlighting section of the response is empty.
Example:
{code:xml|title=solrconfig.xml}
<?xml version="1.0" encoding="UTF-8" ?>
<config>
<luceneMatchVersion>LUCENE_4_10</luceneMatchVersion>
<requestHandler name="/admin/"
class="org.apache.solr.handler.admin.AdminHandlers" />
<requestHandler name="/select" class="solr.SearchHandler" />
<requestHandler name="/update" class="solr.UpdateRequestHandler" />
<requestHandler name="/analysis/field"
class="solr.FieldAnalysisRequestHandler" startup="lazy"/>
</config>
{code}
{code:xml|title=schema.xml}
<?xml version="1.0" ?>
<schema name="${solr.core.name}">
<types>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0"
positionIncrementGap="0"/>
<fieldtype name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" format="snowball" enablePositionIncrements="true"/>
</analyzer>
</fieldType>
</types>
<fields>
<field name="_version_" type="long" indexed="true" stored="true" />
<field name="id" type="string" indexed="true" stored="true"
multiValued="false" />
<field name="document_text" type="text" indexed="true" stored="true"
multiValued="false" />
</fields>
<uniqueKey>id</uniqueKey>
<defaultSearchField>document_text</defaultSearchField>
</schema>
{code}
{code:title=stopwords.txt}
c
e
g
{code}
Load this document:
{code:xml}
<add>
<doc>
<field name="id">1</field>
<field name="document_text">a c b d a b c d e f g h i a f g b e</field>
</doc>
</add>
{code}
Execute query:
http://myhost:8983/solr/test_hl/select?q=%22a+b+c+d+e+f+g+h%22&wt=json&indent=true&hl=true&hl.fl=document_text&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
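For anyone reproducing this from a script, here is a minimal Python sketch that rebuilds the query URL above from its individual parameters. The helper name `build_highlight_url` is hypothetical; the host and core (`myhost`, `test_hl`) are the ones used in this report and will need adjusting for another setup.

```python
from urllib.parse import urlencode

# Host and core name are taken from this report; change them for your setup.
BASE_URL = "http://myhost:8983/solr/test_hl/select"


def build_highlight_url(phrase, base_url=BASE_URL):
    """Build a /select URL for a quoted phrase query with highlighting enabled.

    Hypothetical helper for illustration only; it is not part of Solr.
    """
    params = {
        "q": '"%s"' % phrase,          # quoted so Solr parses it as a phrase query
        "wt": "json",
        "indent": "true",
        "hl": "true",
        "hl.fl": "document_text",      # field to highlight
        "hl.simple.pre": "<em>",
        "hl.simple.post": "</em>",
    }
    # urlencode percent-encodes quotes and angle brackets and turns spaces into '+'
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_url("a b c d e f g h"))
```

Passing `"a b c d e f g"` instead gives the variant of the query for which highlighting is returned.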
This is the result:
{code:javascript}
{
"responseHeader":{
"status":0,
"QTime":2},
"response":{"numFound":1,"start":0,"docs":[
{
"id":"1",
"document_text":"a c b d a b c d e f g h i a f g b e"}]
},
"highlighting":{
"1":{}}}
{code}
Highlighting for document 1 is empty!
Searching for "a b c d e f g" works correctly.
This problem does not affect Solr 5.4.
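To make the token positions easier to see, here is a rough Python sketch of what the analysis chain in this report does to the document and to both phrases. It is a simplified model and not Lucene's code: it splits on whitespace instead of using StandardTokenizer (equivalent for this input), and the function names `analyze`, `relative_offsets` and `phrase_match_starts` are made up for illustration. Removed stopwords leave a gap in the positions, as with enablePositionIncrements="true".

```python
STOPWORDS = {"c", "e", "g"}  # contents of stopwords.txt from this report

DOCUMENT = "a c b d a b c d e f g h i a f g b e"


def analyze(text, stopwords=STOPWORDS):
    """Return (token, position) pairs; removed stopwords leave position gaps."""
    kept = []
    for position, token in enumerate(text.split(), start=1):
        if token.lower() not in stopwords:  # models ignoreCase="true"
            kept.append((token, position))
    return kept


def relative_offsets(tokens):
    """Express token positions relative to the first kept token."""
    first = tokens[0][1]
    return [(tok, pos - first) for tok, pos in tokens]


def phrase_match_starts(doc_tokens, query_tokens):
    """Return document positions where every query token lines up exactly."""
    doc_at = {pos: tok for tok, pos in doc_tokens}
    pattern = relative_offsets(query_tokens)
    return [
        start
        for _tok, start in doc_tokens
        if all(doc_at.get(start + off) == tok for tok, off in pattern)
    ]


if __name__ == "__main__":
    doc_tokens = analyze(DOCUMENT)
    for phrase in ("a b c d e f g h", "a b c d e f g"):
        query_tokens = analyze(phrase)
        print(phrase, relative_offsets(query_tokens),
              phrase_match_starts(doc_tokens, query_tokens))
```

In this model both phrases line up with the document starting at position 5, which agrees with numFound being 1 for the failing query. The only difference the sketch shows is that the failing phrase keeps a token ("h") after a third position gap; it does not show why the highlighter returns nothing in that case.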