Hi,

A common way to provide in-document hit highlighting is to store an HTML
representation of the source document and use that as the basis for hit
highlighting. Google's cache is a well-known implementation.
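
To illustrate what I mean by highlighting against a stored HTML copy, here is
a minimal sketch in Python. It is not Tika code: the function name
highlight_hits, the "hit" CSS class and the sample markup are made up for the
example. It also assumes the stored HTML is simple enough that splitting on
"<...>" separates tags from text, which will not hold for script or style
blocks or for comments containing ">".

```python
import re


def highlight_hits(stored_html, term):
    """Wrap each occurrence of `term` found in text nodes in <em class="hit">.

    Hypothetical helper for illustration only; a real implementation would
    use a proper HTML parser instead of a regular expression.
    """
    # Split into tags and text. The capturing group keeps the tags in the list.
    parts = re.split(r"(<[^>]*>)", stored_html)
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    out = []
    for part in parts:
        if part.startswith("<"):
            # Markup is passed through untouched, so attribute values that
            # happen to contain the term are not modified.
            out.append(part)
        else:
            out.append(
                pattern.sub(lambda m: '<em class="hit">' + m.group(0) + "</em>", part)
            )
    return "".join(out)


stored = '<p class="tika">Apache Tika extracts text.</p>'
print(highlight_hits(stored, "tika"))
# prints: <p class="tika">Apache <em class="hit">Tika</em> extracts text.</p>
```

The point is that highlighting only adds markup around the hits, so how good
the result looks depends entirely on the quality of the stored HTML.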

My question is whether Tika HTML has a "WYSIWYG HTML output" mode, where the
focus is on producing good-looking HTML. I realize this also depends on each
parser. But has anyone tried to use Tika HTML for in-document highlighting?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
