The Highlighter in the Lucene "contrib" section has a class called TokenSources, which tries to find the best way of getting a TokenStream. It can build a TokenStream from either:
a) an Analyzer, or
b) a TermPositionVector (if the field was indexed with one)
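To make the fallback idea concrete, here is a rough, self-contained sketch of the decision described above. It is only a simplified model and does not use Lucene's classes: TokenSourceSketch, analyze and getAnyTokens are hypothetical names, the map stands in for term vectors stored in the index, and the toy analyze method stands in for a real Analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified model of the "vector first, analyzer second" choice.
// None of these names are Lucene APIs.
public class TokenSourceSketch {

    // Stand-in for an Analyzer: re-tokenizes the raw field text (costs CPU).
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Prefer tokens already stored for the document (stand-in for a
    // TermPositionVector, which costs extra disk reads); otherwise fall
    // back to re-analyzing the stored text.
    static List<String> getAnyTokens(Map<Integer, List<String>> storedVectors,
                                     int docId, String text) {
        List<String> stored = storedVectors.get(docId);
        if (stored != null) {
            return stored;
        }
        return analyze(text);
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> vectors = new HashMap<>();
        vectors.put(1, Arrays.asList("quick", "fox"));

        // Doc 1 has a stored vector, so the text is never re-analyzed.
        System.out.println(getAnyTokens(vectors, 1, "The quick fox"));
        // Doc 2 has no stored vector, so the text is analyzed on the fly.
        System.out.println(getAnyTokens(vectors, 2, "The quick fox"));
    }
}
```

Note that in this toy version the analyzer path keeps "the" while the stored vector for doc 1 does not, which mirrors why the original text is still needed for readable highlights.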
You may find that using TermPositionVectors in your index gives you a speed-up, but it all depends on the cost of the processing done by your analyzer. Using a TermPositionVector incurs extra data reads to get the list of tokens from disk, whereas using an Analyzer means extra CPU load re-processing the document text you have already read from disk. Both approaches typically need to read the original document text when highlighting, in order to retain the stop words that make the highlighted text readable. I have noticed in the past that the StandardAnalyzer was quite slow while other Analyzers are much quicker, so it can really depend on your choice.

Cheers
Mark