What is the correct use of HTMLStripCharFilter in Solr 3.1?

nicksnels Thu, 12 May 2011 11:44:11 -0700

Hi,

I recently upgraded from Solr 1.3 to Solr 3.1 in order to take advantage of
the HTMLStripCharFilter. But it isn't working as I expected.


I have a text field that may contain HTML tags. However, I would like to
store it in Solr without the HTML tags, and to retrieve it for display and
for highlighting without HTML tags as well.

I added <charFilter class="solr.HTMLStripCharFilterFactory"/> at the top of
both <analyzer type="index"> and <analyzer type="query"> inside
<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
autoGeneratePhraseQueries="true"> in the schema.xml file of the Solr example.
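
To make the setup concrete, here is a minimal sketch of the field type as
described above. The tokenizer and lowercase filter are stand-ins I am assuming
for illustration; the actual "text" type in the Solr example has a longer
filter chain, which is left out here.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <!-- char filters run on the raw input before the tokenizer, so this comes first -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- assumption: placeholder tokenizer and filter, not the full example chain -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- same char filter applied to the query text -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- assumption: placeholder tokenizer and filter, as above -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```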

And the text field is simply:

<field name="text" type="text" indexed="true" stored="true"/>

Now, when I do a search, the text field still contains all the HTML tags,
and the highlighting is totally screwed up, with em tags around virtually
every word. What am I doing wrong?

Kind regards,

Nick

--
View this message in context: 
http://lucene.472066.n3.nabble.com/What-is-correct-use-of-HTMLStripCharFilter-in-Solr-3-1-tp2933021p2933021.html
Sent from the Solr - User mailing list archive at Nabble.com.
