Hi Solr Experts,

I am using HTMLStripCharFilterFactory to strip HTML tags from the body
field.

Body contains data like <html><body>Ipad</body></html>
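
To make clear what result I expect, here is a rough Python sketch of the stripping I have in mind. It uses the standard library's html.parser rather than Solr's own code, and the helper name strip_html is made up for this illustration only:

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are skipped.
        self.parts.append(data)


def strip_html(value):
    """Hypothetical helper: return `value` with HTML tags removed."""
    collector = _TextCollector()
    collector.feed(value)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts)


print(strip_html("<html><body>Ipad</body></html>"))  # prints: Ipad
```

So for the sample value above, I would expect to end up with just "Ipad".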

I made the following changes in the managed schema:

<field name="body" type="html" indexed="true" required="false" stored="true"/>


<copyField source="body" dest="_text_"/>


---


<fieldType name="html" stored="true" indexed="true" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


I restarted Solr and reindexed.


But when I query in the Solr Admin UI, the search results still contain the
HTML tags:



"body":"<body>Practically everytime I log onto Mogran, suddenly I see it
running


*Please let me know what the issue could be. Am I missing anything?*


Thanks

Fiz..
