Re: Question on the Sandbox Highlighter

Erik Hatcher Tue, 05 Jul 2005 14:23:51 -0700


On Jul 5, 2005, at 4:58 PM, Terence Lai wrote:

I am currently using Lucene 1.4.2 with the highlighter downloaded from Lucene In Action.
The Highlighter class provides the following method to highlight the terms specified in the Query:
/**
 * Highlights chosen terms in a text, extracting the most relevant section.
 * The document text is analysed in chunks to record hit statistics
 * across the document. After accumulating stats, the fragment with the highest score
 * is returned
 *
 * @param tokenStream a stream of tokens identified in the text parameter, including offset information.
 * This is typically produced by an analyzer re-parsing a document's
 * text. Some work may be done on retrieving TokenStreams more efficiently
 * by adding support for storing original text position data in the Lucene
 * index but this support is not currently available (as of Lucene 1.4 rc2).
 * @param text text to highlight terms in
 *
 * @return highlighted text fragment or null if no terms found
 */
public final String getBestFragment(TokenStream tokenStream, String text)
       throws IOException;
According to the javadoc, this method only returns the most relevant section of the text. Is there any way or method to return the ENTIRE text with the terms highlighted?

Yes - it relies on the Fragmenter. For lucenebook.com, for example, if a search result is for a blog entry, the entire contents are highlighted using a NullFragmenter:


package lia.web;

import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.analysis.Token;

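public class NullFragmenter implements Fragmenter {
  // Nothing to set up: this fragmenter keeps no per-document state.
  public void start(String s) {
  }

  // Never signal a fragment boundary, so the whole text stays in one fragment.
  public boolean isNewFragment(Token token) {
    return false;
  }
}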
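
To illustrate why a fragmenter that always answers "false" yields the entire text, here is a small self-contained sketch. It is a toy model only, not the Highlighter's actual code: FragmenterSketch, SimpleFragmenter and split are hypothetical names made up for this illustration, and whitespace-separated words stand in for Lucene Tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of fragment splitting; hypothetical, not part of Lucene.
final class FragmenterSketch {

  /** Hypothetical stand-in for Fragmenter: sees a word and its start offset. */
  interface SimpleFragmenter {
    boolean isNewFragment(String word, int startOffset);
  }

  /** Walks the words of text and cuts a new fragment whenever the fragmenter asks for one. */
  static List<String> split(String text, SimpleFragmenter fragmenter) {
    List<String> fragments = new ArrayList<>();
    int fragmentStart = 0;
    int pos = 0;
    while (pos < text.length()) {
      // Skip whitespace between words.
      while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
        pos++;
      }
      if (pos >= text.length()) {
        break;
      }
      // Scan to the end of the current word.
      int wordStart = pos;
      while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
        pos++;
      }
      String word = text.substring(wordStart, pos);
      // Close the current fragment only if the fragmenter requests a boundary here.
      if (wordStart > fragmentStart && fragmenter.isNewFragment(word, wordStart)) {
        fragments.add(text.substring(fragmentStart, wordStart));
        fragmentStart = wordStart;
      }
    }
    // Whatever remains is the last (or only) fragment.
    fragments.add(text.substring(fragmentStart));
    return fragments;
  }

  public static void main(String[] args) {
    String text = "the quick brown fox jumps";
    // Behaves like NullFragmenter: never asks for a boundary.
    System.out.println(split(text, (word, offset) -> false).size());
    // Opposite extreme: asks for a boundary at every word.
    System.out.println(split(text, (word, offset) -> true).size());
  }
}
```

With the real sandbox Highlighter, the equivalent step is to install the fragmenter with highlighter.setTextFragmenter(new NullFragmenter()) before calling getBestFragment, so that the single "best" fragment covers the whole text.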


    Erik

