Owens, Martin wrote:
Hello everyone,
We're working to replace the old Linux version of dtSearch with Lucene/Solr,
using HTTP requests on our Perl side and Java for the indexing.
The functionality causing the most problems is highlighting. We're not storing
the text in Solr (only indexing it), and we need to highlight an image file
(OCR). What we really need is to request from Solr the word indexes of the
matches; we then tie these to the OCR image and create HTML boxes to do the
highlighting.
Sorry this hasn't had a response....
I'm not totally following what you are trying to do. If I understand
it, you want to use Solr to get back the matching highlight regions
for an arbitrary bit of text that is not stored in the index?
Offhand, there is nothing out of the box to do this (mike?)
My guess is you will have to write a custom requestHandler that pulls
the stored text from wherever you store it, then passes it to a custom
Formatter that includes the offsets in the response.
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/Formatter.java
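Once a custom Formatter hands back character offsets for each match, those
offsets still have to be turned into the word indexes you tie to the OCR
image. Below is a rough sketch of that last step in plain Java, with no
Lucene or Solr calls in it. The class and method names are made up for
illustration, and it assumes words are counted by splitting on whitespace,
which may not match how your analyzer or your OCR output counts them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
// Assumption: a "word" is a maximal run of non-whitespace characters.
class OffsetToWordIndex {

    /**
     * Returns the zero-based index of the word that contains charOffset,
     * or -1 if the offset is out of range or points at whitespace.
     */
    static int wordIndexAt(String text, int charOffset) {
        if (charOffset < 0 || charOffset >= text.length()
                || Character.isWhitespace(text.charAt(charOffset))) {
            return -1;
        }
        int index = -1;
        boolean inWord = false;
        for (int i = 0; i <= charOffset; i++) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            if (!ws && !inWord) {
                index++; // a new word starts at position i
            }
            inWord = !ws;
        }
        return index;
    }

    /** Maps the start offset of each match to its word index. */
    static List<Integer> wordIndexes(String text, int[] startOffsets) {
        List<Integer> result = new ArrayList<>();
        for (int offset : startOffsets) {
            result.add(wordIndexAt(text, offset));
        }
        return result;
    }

    public static void main(String[] args) {
        String ocrText = "the quick brown fox";
        int[] matchStarts = {4, 16}; // start offsets of "quick" and "fox"
        System.out.println(wordIndexes(ocrText, matchStarts)); // prints [1, 3]
    }
}
```

If the OCR side counts words differently (for example, splitting on
punctuation), the word-boundary test in wordIndexAt would need to follow
the same rule, otherwise the boxes will drift.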
In 1.3-dev (/trunk) you can register a custom Formatter in solrconfig.xml.
ryan