On Saturday 10 Dec 2005 00:17, Erik Hatcher wrote: > > When I wrote the Analyzer for my documents, I produced the > > tokenstream to > > generate Token objects with the start end end positions of each > > term in them > > > > Now, from my Hits object I can find each document I need to output, > > but how do > > I get back to the Tokens I originally produced. > > Are you using Lucene 1.4.3? Or the latest Subversion version? 1.4.3
> > The Lucene index does not keep all of the information in the Token's > emitted by the analyzer (unless specified to do so, but 1.4.3 didn't > support the fancier features). > > So, the fail-safe way is to re-tokenize the original text (perhaps > stored in the Lucene index) and hand that TokenStream to the > Highlighter. I found a highlighter in the sandbox - I think that is doing something like this, so am going to experiment with that. Initial attempt failed to compile because I think it was assuming a later version of the lucene code, but it looks like I just cut out the offending class and its ok. -- Alan Chandler http://www.chandlerfamily.org.uk Open Source. It's the difference between trust and antitrust. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]