[ http://issues.apache.org/jira/browse/SOLR-57?page=all ]

Yonik Seeley resolved SOLR-57.
------------------------------

    Resolution: Duplicate

Known issue.
It probably wouldn't be too hard to fix for Whitespace*, but it could be 
pretty difficult for Standard*.

> Highlighter does not work with HTML content that's passed through 
> HTMLStrip*Tokenizer
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-57
>                 URL: http://issues.apache.org/jira/browse/SOLR-57
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>         Environment: Red Hat Linux 9, Tomcat 5.5.20
>            Reporter: Ho Yin Au
>            Priority: Minor
>
> I have a fieldtype with the following definition:
>         <fieldtype name="htmltext" class="solr.TextField" positionIncrementGap="100">
>             <analyzer>
>                 <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
>                 <filter class="solr.StandardFilterFactory" />
>                 <filter class="solr.LowerCaseFilterFactory" />
>                 <filter class="solr.StopFilterFactory" />
>                 <filter class="solr.EnglishPorterFilterFactory" />
>                 <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>                 <filter class="solr.ISOLatin1AccentFilterFactory" />
>             </analyzer>
>         </fieldtype>
> When fields with that definition are included in the list of fields to be 
> highlighted, the highlighted term is always offset, because the highlighter 
> does not take into account the HTML tags that precede it. You end up with 
> something like this for the highlighted snippet:
> Does your computer meet the <a 
> href="http:/<em>/www.example</em>.com/system_requirements.shtml">minimum 
> system requirements</a>?
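
The misplaced highlight described above can be reproduced outside Solr. The
sketch below is only an illustration under stated assumptions, not Solr's or
Lucene's actual code: the function names (strip_with_offsets, naive_highlight,
mapped_highlight) are hypothetical, and the tag regex is a deliberate
simplification of real HTML stripping. It shows that offsets computed on the
stripped text land in the wrong place in the original HTML, and that keeping
a map from each stripped character back to its original position avoids this.

```python
import re

# Simplified tag matcher (assumption: no '>' inside attribute values).
TAG = re.compile(r"<[^>]*>")

def strip_with_offsets(html):
    """Return (text, offsets); offsets[i] is the index in html of text[i]."""
    chars, offsets, pos = [], [], 0
    for m in TAG.finditer(html):
        for i in range(pos, m.start()):
            chars.append(html[i])
            offsets.append(i)
        pos = m.end()  # skip over the tag itself
    for i in range(pos, len(html)):
        chars.append(html[i])
        offsets.append(i)
    return "".join(chars), offsets

def naive_highlight(html, term):
    """Buggy variant: applies stripped-text offsets directly to the HTML."""
    text, _ = strip_with_offsets(html)
    start = text.find(term)
    if start < 0:
        return html
    end = start + len(term)
    return html[:start] + "<em>" + html[start:end] + "</em>" + html[end:]

def mapped_highlight(html, term):
    """Maps stripped-text offsets back to the original HTML before wrapping."""
    text, offsets = strip_with_offsets(html)
    start = text.find(term)
    if start < 0:
        return html
    o_start = offsets[start]
    o_end = offsets[start + len(term) - 1] + 1
    return html[:o_start] + "<em>" + html[o_start:o_end] + "</em>" + html[o_end:]

html = ('Does your computer meet the <a href="http://www.example.com/'
        'system_requirements.shtml">minimum system requirements</a>?')
print(naive_highlight(html, "minimum"))
print(mapped_highlight(html, "minimum"))
```

In this sketch a term that spans a tag boundary would have the tag wrapped
inside the <em> element, so a real fix would need more care than this.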


        
