Apologies, in my earlier reply (quoted below) I incorrectly stated that one would need to account for analyzer behavior, which is completely untrue.
To clarify, at indexing time the offset information can be stored with little effort; see Field.TermVector.WITH_OFFSETS and SegmentTermPositionVector. This should be faster and less involved than re-analyzing the original content at "highlight" time, with the trade-off being a larger index and a slight increase in indexing time.

-----Original Message-----
From: Franklin Simmons [mailto:[email protected]]
Sent: Wednesday, August 12, 2009 3:17 PM
To: [email protected]
Subject: RE: get text pointer from hit or possibly highlighter

Andrew,

If you have control over indexing, you might accomplish this with TermPositionVector information. However, bear in mind that analyzers often discard text (e.g. the StandardAnalyzer doesn't index the word 'the'), which you would have to account for.

-----Original Message-----
From: Andrew Schuler [mailto:[email protected]]
Sent: Wednesday, August 12, 2009 12:45 PM
To: [email protected]
Subject: get text pointer from hit or possibly highlighter

I've been doing some research trying to find out about getting a text position pointer for hits, and this list is my last hope. If I have a (rather long) text document indexed and I get a hit on said document, but the search term shows up near the end of the doc, it would be nice to be able to know the position of the hit inside the doc itself. In .NET I'm thinking of something like a TextPointer. Does anyone know of a clever way to do this with Lucene.Net?
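
As a footnote to the reply at the top of this thread, here is a rough sketch of why stored offsets need no adjustment for discarded text. It is plain Java and does not use the Lucene or Lucene.Net API; the class OffsetIndexSketch and its index() helper are hypothetical names made up for illustration. The assumption is that offsets are recorded as character positions in the original text at indexing time, so a dropped stop word such as 'the' does not shift the offsets of the terms that follow it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class OffsetIndexSketch {

    // Hypothetical helper (not a Lucene call): maps each kept term to its
    // [start, end) character offsets in the ORIGINAL text, recorded once
    // at "indexing" time.
    static Map<String, List<int[]>> index(String text, Set<String> stopWords) {
        Map<String, List<int[]>> offsets = new HashMap<>();
        int i = 0;
        while (i < text.length()) {
            // Skip separators (anything that is not a letter or digit).
            while (i < text.length() && !Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            if (start == i) {
                break; // only trailing separators were left
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            // Stop words are dropped, but the cursor has already moved past
            // them, so later offsets still point into the original text.
            if (!stopWords.contains(term)) {
                offsets.computeIfAbsent(term, k -> new ArrayList<>())
                       .add(new int[] {start, i});
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String text = "The cat saw the dog";
        Map<String, List<int[]>> idx = index(text, Set.of("the"));
        // Look up the hit position without re-analyzing the text.
        for (int[] span : idx.get("dog")) {
            System.out.println(span[0] + "-" + span[1] + ": "
                    + text.substring(span[0], span[1]));
        }
        // prints "16-19: dog"
    }
}
```

In this toy version the lookup at "highlight" time is a map access, which mirrors the trade-off described above: more data stored per document in exchange for not re-tokenizing the content for each hit.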
