Hi,

A common way to provide in-document hit highlighting is to store an HTML representation of the source document and use that as the basis for highlighting. Google's cache is a well-known implementation.
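To make the idea concrete, here is a rough sketch of the highlighting step, assuming the HTML rendition has already been stored somewhere. It uses only the Java standard library and does not call Tika at all. The class name `HitHighlighter` and the choice of `<mark>` as the wrapper element are just placeholders for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HitHighlighter {
    // Matches a single HTML tag; everything between tags is treated as text.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Wraps each case-insensitive occurrence of term in text content with <mark>. */
    public static String highlight(String html, String term) {
        if (term.isEmpty()) {
            return html; // nothing to highlight
        }
        Pattern hit = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        StringBuilder out = new StringBuilder();
        Matcher tags = TAG.matcher(html);
        int pos = 0;
        while (tags.find()) {
            // Highlight the text before this tag, then copy the tag unchanged.
            out.append(mark(html.substring(pos, tags.start()), hit));
            out.append(tags.group());
            pos = tags.end();
        }
        out.append(mark(html.substring(pos), hit)); // trailing text after the last tag
        return out.toString();
    }

    private static String mark(String text, Pattern hit) {
        // $0 re-inserts the matched text, preserving its original casing.
        return hit.matcher(text).replaceAll("<mark>$0</mark>");
    }

    public static void main(String[] args) {
        String stored = "<p class=\"tika\">Apache Tika extracts text.</p>";
        System.out.println(highlight(stored, "tika"));
        // prints: <p class="tika">Apache <mark>Tika</mark> extracts text.</p>
    }
}
```

This is deliberately naive: it leaves tag attributes alone, but it does not skip script or style content, does not decode entities, and will miss hits that span element boundaries. A real implementation would more likely walk a parsed DOM or SAX stream.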
My question is whether Tika HTML has a "WYSIWYG HTML output" mode, where the focus is on producing good-looking HTML? I realize this also depends on each parser. But has anyone tried to use Tika HTML for in-document highlighting?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
