Hello,

Hoping someone might clear up a question for me:
When tokenizing, we provide start and end character offsets for each token, locating it within the source text. If I tokenize the text "word" and then search for the term "word" in the same field, how can I recover this character offset information from the matching documents in order to precisely locate the word?

I have been storing this character info myself using payload data, but if Lucene already stores it, then I am doing so needlessly. If recovering this character offset info isn't possible, what is it used for?

thanks so much,

C>T>

--
TH!NKMAP
Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385  f.212-285-8999
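P.S. In case it helps to see what I mean by storing the offsets myself: below is a minimal sketch of the kind of token filter I have in mind, written against a recent Lucene analysis API. The class name OffsetPayloadFilter and the byte-packing scheme are my own choices, not anything from Lucene itself.

    import java.io.IOException;
    import java.nio.ByteBuffer;

    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
    import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
    import org.apache.lucene.util.BytesRef;

    // Copies each token's start/end character offsets into its payload
    // so they can be read back from the postings at search time.
    public final class OffsetPayloadFilter extends TokenFilter {
      private final OffsetAttribute offsetAtt = addAttribute(OffsetAttribute.class);
      private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);

      public OffsetPayloadFilter(TokenStream input) {
        super(input);
      }

      @Override
      public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
          return false;
        }
        // Pack the two int offsets into an 8-byte payload.
        byte[] packed = ByteBuffer.allocate(8)
            .putInt(offsetAtt.startOffset())
            .putInt(offsetAtt.endOffset())
            .array();
        payloadAtt.setPayload(new BytesRef(packed));
        return true;
      }
    }

At query time I pull the payload back out of the postings and decode the two ints, which is exactly the bookkeeping I would like to drop if Lucene already keeps the offsets somewhere.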