On 3-Dec-07, at 10:58 AM, Owens, Martin wrote:



You can tell Lucene to store token offsets using TermVectors
(configurable via schema.xml).  Then you can customize the request
handler to return the token offsets (and/or positions) by retrieving
the TVs.

I think that is the best plan of action. How do I create a custom request handler that will use the existing indexed fields? As I see it, there will be two requests: one for the search and one to retrieve the offsets when you view one of the found items. Any advice you can give me will be much appreciated, as I've had no luck with Google so far.

First, you need to store token offsets for the field:
See http://wiki.apache.org/solr/SchemaXml , "Expert field options". You definitely want termVectors=true, termOffsets=true.

You do not necessarily need two requests; instead, you can override or modify the request handler you are using (StandardRequestHandler, DisMaxRequestHandler) to return the information. You'll have to process the Query to extract the terms (like HighlightingUtils does), then get the TermVector token offset data for each matching doc and look up the Query terms in it. I haven't worked with Term Vectors (a Lucene API), so I'm not sure exactly how to go about this.
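
For what it's worth, here is a rough sketch of just the last step (looking up the query terms in one document's term vector data). It is plain Java with no Lucene dependency: DocTermVector and findOffsets are names I made up for illustration, not Solr or Lucene classes. I am assuming the term vector hands you a sorted array of terms plus start/end character offsets for each term, which as far as I know is what IndexReader.getTermFreqVector(docId, field) gives you when the result is a TermPositionVector, but treat that as an assumption to check against the Lucene javadocs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OffsetLookupSketch {

    /**
     * Hypothetical stand-in for one document's term vector on one field.
     * terms must be sorted and unique; offsets[i] holds the {start, end}
     * character offsets of every occurrence of terms[i].
     */
    static final class DocTermVector {
        final String[] terms;
        final int[][][] offsets;

        DocTermVector(String[] terms, int[][][] offsets) {
            if (terms.length != offsets.length) {
                throw new IllegalArgumentException("terms and offsets must line up");
            }
            this.terms = terms;
            this.offsets = offsets;
        }
    }

    /**
     * For each query term that occurs in the document, return its offsets.
     * Query terms that do not occur in the document are skipped.
     * The query terms are assumed to be analyzed the same way as the field.
     */
    static Map<String, List<int[]>> findOffsets(DocTermVector tv, List<String> queryTerms) {
        Map<String, List<int[]>> hits = new LinkedHashMap<String, List<int[]>>();
        for (String term : queryTerms) {
            // The term array is sorted, so a binary search finds the term's index.
            int idx = Arrays.binarySearch(tv.terms, term);
            if (idx >= 0) {
                hits.put(term, Arrays.asList(tv.offsets[idx]));
            }
        }
        return hits;
    }
}
```

In a modified request handler you would run something like this once per matching doc and add the resulting map to the response alongside the normal results, which is what saves you the second request.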

-Mike
