Improving Readability of Hit Highlighting

Terence Gannon Mon, 12 Jan 2009 08:01:01 -0800

I'm indexing text from an OCR of an old document.  Many words get read
perfectly, but they're typically embedded in a lot of junk.  I would
like the hit highlighting to show only the 'good' words, in the order
in which they appeared in the original document.  Is it possible to
use the output of the filter classes as the text used in hit highlighting?
Or do you have to do all the text cleanup outside of Solr and present it
with two fields to index, one with the original text and one with the
cleaned-up text?  The objective of the hit highlighting is to give the
user a *sense* of the original context, even if it's not provided
verbatim from the original document.  Thanks in advance.


TerryG
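
To make the two-field idea in the question concrete, here is a minimal
sketch of doing the cleanup outside of Solr.  Everything in it is an
assumption for illustration: the field names text_original and
text_clean, the helper names, and above all the "good word" heuristic
(letters only, at least one vowel), which a real setup would probably
replace with a dictionary lookup.

```python
import re

# Hypothetical heuristic for a "good" OCR word; tune or swap for a dictionary.
_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")
_VOWEL = re.compile(r"[aeiouy]", re.IGNORECASE)
_ONE_LETTER_WORDS = {"a", "i"}
_PUNCTUATION = ".,;:!?\"()"


def looks_like_word(token):
    """Return True if the token is plausibly a real word rather than OCR junk."""
    stripped = token.strip(_PUNCTUATION)
    if not stripped or not _WORD.fullmatch(stripped):
        return False
    if len(stripped) == 1:
        return stripped.lower() in _ONE_LETTER_WORDS
    return bool(_VOWEL.search(stripped))


def clean_ocr_text(raw):
    """Keep only plausible words, in their original order."""
    return " ".join(t for t in raw.split() if looks_like_word(t))


def build_document(doc_id, raw):
    """Build a dict with both fields (field names are assumptions)."""
    return {
        "id": doc_id,
        "text_original": raw,
        "text_clean": clean_ocr_text(raw),
    }


if __name__ == "__main__":
    sample = "Th3 ~~ quick |||| brown f0x jumps ,.;' over xqz the lazy dog."
    print(build_document("doc-1", sample)["text_clean"])
```

With documents shaped like this, highlighting would be requested on the
cleaned field (for example through Solr's hl.fl parameter) while the
original field stays available for display; whether that fits depends on
how the schema stores and analyzes each field.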
