Hi - you need an identity mapper for Tika, if I remember correctly:

<property>
  <name>tika.htmlmapper.classname</name>
  <value>org.apache.tika.parser.html.IdentityHtmlMapper</value>
  <description>Classname of Tika HTMLMapper to use. Influences the elements
  included in the DOM and hence the behavior of the HTMLParseFilters.
  </description>
</property>

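
For placement, a minimal sketch, assuming your local overrides live in conf/nutch-site.xml and that the parse-tika plugin is still enabled in plugin.includes (both are assumptions about your setup, not something I have checked):

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml (assumed location of your site-specific overrides) -->
<configuration>
  <!-- Ask Tika's HTML parser to pass elements through unchanged
       instead of mapping them down to a reduced set. -->
  <property>
    <name>tika.htmlmapper.classname</name>
    <value>org.apache.tika.parser.html.IdentityHtmlMapper</value>
  </property>
</configuration>
```

If the file already has a <configuration> element, only the <property> block needs to go inside it.
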
Regards,
Markus

-----Original message-----
> From: Matt Rutherford <[email protected]>
> Sent: Monday 8th May 2017 19:45
> To: [email protected]
> Subject: Prevent parsers from stripping html tags
>
> I would like to maintain the html tags during the parsing stage so they
> also get indexed. How can I accomplish this?
>
> I tried removing the parser plugins (html and tika in my case) but it seems
> you need at least one and enabling either of these strips the markup from
> the docs.
>

