Hello! You don't need a custom update request processor. There is a char filter dedicated to stripping HTML tags from your content, so that only the relevant parts of it are indexed: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory
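
For illustration, a field type using that char filter might look roughly like the fragment below in schema.xml. This is only a sketch: the type name `text_html`, the field name `description`, and the choice of tokenizer and lowercase filter are assumptions made for the example, not something taken from your setup.

```xml
<!-- Hypothetical field type: the name "text_html" is an assumption for this example. -->
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run before the tokenizer, so the markup is removed first. -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- Tokenizer and filter below are illustrative choices; use whatever fits your data. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field holding the product description. -->
<field name="description" type="text_html" indexed="true" stored="true"/>
```

Keep in mind that a char filter only affects the analyzed (indexed) tokens. The stored value of the field is returned exactly as it was sent, HTML tags included.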
However, you first need to send it to Solr properly for indexing.

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> I think you will have to write an UpdateProcessor to strip out HTML tags.
> http://wiki.apache.org/solr/UpdateRequestProcessor
> As of Solr 4.0 you can also use scripting languages like Python, Ruby and
> JavaScript to write scripts for use as update processors.
>
> -----Original Message-----
> From: Pratyul Kapoor
> Sent: Friday, October 26, 2012 3:56 AM
> To: solr-user@lucene.apache.org
> Subject: Filtering HTML content in Solr 4.0.0
>
> Hi,
> I am using Solr 4.0.0. I have HTML content as the description of a product.
> If I index it without any filtering, it gives errors on search.
> How can I filter HTML content?
>
> Pratyul