Re: Confused again ... Getting at results

Alan Chandler Fri, 09 Dec 2005 22:38:48 -0800

On Saturday 10 Dec 2005 00:17, Erik Hatcher wrote:

> > When I wrote the Analyzer for my documents, I produced the
> > TokenStream to generate Token objects with the start and end
> > positions of each term in them.
> >
> > Now, from my Hits object I can find each document I need to output,
> > but how do I get back to the Tokens I originally produced?
>
> Are you using Lucene 1.4.3?  Or the latest Subversion version?
1.4.3

> The Lucene index does not keep all of the information in the Tokens
> emitted by the analyzer (unless specified to do so, but 1.4.3 didn't
> support the fancier features).
>
> So, the fail-safe way is to re-tokenize the original text (perhaps
> stored in the Lucene index) and hand that TokenStream to the
> Highlighter.

I found a highlighter in the sandbox. I think it does something like
this, so I am going to experiment with it.

My initial attempt failed to compile, I think because the highlighter
assumes a later version of the Lucene code, but it looks like cutting out
the offending class is enough and it's OK.
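
For anyone following the thread, here is a rough sketch of the
"re-tokenize the stored text" idea in plain Java. It deliberately does
not use the Lucene or sandbox Highlighter classes: Tok, tokenize and
highlight are hypothetical names invented for illustration, and the toy
tokenizer (split on non-alphanumerics, lowercase) is only an assumption
standing in for whatever Analyzer was used at index time. The point is
just that the start and end offsets come from running the analysis again
over the original text, not from the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class RetokenizeHighlight {

    /** Hypothetical stand-in for a token: term text plus character offsets. */
    public static final class Tok {
        public final String term;
        public final int start; // inclusive offset into the original text
        public final int end;   // exclusive offset into the original text

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Toy analyzer (an assumption): split on non-alphanumerics, lowercase each term. */
    public static List<Tok> tokenize(String text) {
        List<Tok> out = new ArrayList<Tok>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip separators, then consume one run of letters/digits.
            while (i < n && !Character.isLetterOrDigit(text.charAt(i))) i++;
            int start = i;
            while (i < n && Character.isLetterOrDigit(text.charAt(i))) i++;
            if (i > start) {
                String term = text.substring(start, i).toLowerCase(Locale.ROOT);
                out.add(new Tok(term, start, i));
            }
        }
        return out;
    }

    /** Re-tokenize the stored text and wrap tokens matching a query term in <B> tags. */
    public static String highlight(String text, Set<String> queryTerms) {
        StringBuilder sb = new StringBuilder();
        int last = 0; // end offset of the text already copied
        for (Tok t : tokenize(text)) {
            if (queryTerms.contains(t.term)) {
                sb.append(text, last, t.start);
                sb.append("<B>").append(text, t.start, t.end).append("</B>");
                last = t.end;
            }
        }
        sb.append(text, last, text.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "Lucene stores terms, not the original Tokens.";
        Set<String> terms = new HashSet<String>(Arrays.asList("lucene", "tokens"));
        System.out.println(highlight(stored, terms));
    }
}
```

In a real setup the text would be read back from a stored field of each
hit and run through the same Analyzer used for indexing, so that the
offsets line up with the terms the query matched.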


-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.

