: My objective is to be able to store it as XHTML in the field and to be
: able to retrieve it as cached output. Since Tika is already giving XHTML
: output, I wonder why Solr saves it as plain text. (Maybe I missed
: something in the configuration?)
I'm not very familiar with Tika or Solr CELL, but I think what you are seeing is that Solr only asks Tika for the *content* of the DOM Nodes matched by the xpath and/or capture params (i.e., node.getTextContent()), so the markup is dropped and only the text is kept. I suspect it wouldn't be too hard to add an option to allow the capture of the serialized DOM Nodes.

-Hoss
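
To illustrate the difference between a node's text content and its serialized form, here is a minimal sketch using only the standard JAXP/DOM APIs that ship with the JDK. It is not Solr CELL's or Tika's actual code; the class name NodeCaptureSketch and the helper methods parseRoot, textOnly, and serialized are hypothetical names made up for this example, and the sample markup is invented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only: not part of Solr or Tika.
public class NodeCaptureSketch {

    // Parse a small well-formed XHTML fragment and return its root element.
    public static Node parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // What a text-only capture keeps: the concatenated text of all
    // descendants, with every tag discarded.
    public static String textOnly(Node node) {
        return node.getTextContent();
    }

    // What a "serialized node" capture could keep instead: the node
    // written back out as markup via an identity transform.
    public static String serialized(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Node div = parseRoot("<div><p>Hello <b>world</b></p></div>");
        System.out.println(textOnly(div));   // text only, tags are gone
        System.out.println(serialized(div)); // markup is preserved
    }
}
```

An option along the lines suggested above would presumably store something like the result of serialized(...) in the field rather than the result of textOnly(...), though how that would be wired into the extraction handler is left open here.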