You don't necessarily need to store the data in Lucene, but yes, it does need to be stored somewhere. Otherwise, where would the context come from? If you are not stripping stopwords, stemming, lowercasing, or otherwise changing the text during analysis, I suppose you could rebuild it from the index...

To avoid having to retokenize, you can check out the TokenSources class, which allows you to use TermVectors to rebuild the TokenStream. You still need the original text to fragment and highlight, though. Whether you pull that text from a database, the filesystem, or Lucene does not matter to the Highlighter.
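To make the point above concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: `TEXT_STORE`, `Offset`, `findOffsets`, and `highlight` are hypothetical names made up for illustration. The map stands in for whatever external store holds the original text, and the offsets stand in for what term vectors with offsets would give back, so the text is never retokenized at highlight time. It only builds one fragment around the first hit, which is far simpler than what the contrib Highlighter does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ExternalTextHighlightSketch {

    /** Start/end character offsets of one term occurrence in the original text. */
    static final class Offset {
        final int start;
        final int end;
        Offset(int start, int end) { this.start = start; this.end = end; }
    }

    /** Hypothetical stand-in for a database or filesystem, keyed by document id. */
    static final Map<String, String> TEXT_STORE = new HashMap<>();

    /** Case-insensitive scan; stands in for offsets that would be read back from the index. */
    static List<Offset> findOffsets(String text, String term) {
        List<Offset> hits = new ArrayList<>();
        if (term.isEmpty()) return hits;
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        int idx = haystack.indexOf(needle);
        while (idx >= 0) {
            hits.add(new Offset(idx, idx + needle.length()));
            idx = haystack.indexOf(needle, idx + needle.length());
        }
        return hits;
    }

    /**
     * Builds one fragment of contextChars characters on each side of the first hit
     * and wraps every hit that lies fully inside that fragment in <B> tags.
     * Assumes hits are sorted by start offset and do not overlap.
     */
    static String highlight(String text, List<Offset> hits, int contextChars) {
        if (hits.isEmpty()) return "";
        int fragStart = Math.max(0, hits.get(0).start - contextChars);
        int fragEnd = Math.min(text.length(), hits.get(0).end + contextChars);
        StringBuilder out = new StringBuilder();
        int pos = fragStart;
        for (Offset hit : hits) {
            if (hit.start < pos || hit.end > fragEnd) continue; // outside the fragment
            out.append(text, pos, hit.start);
            out.append("<B>").append(text, hit.start, hit.end).append("</B>");
            pos = hit.end;
        }
        out.append(text, pos, fragEnd);
        return out.toString();
    }

    public static void main(String[] args) {
        // The original text lives outside the index; only the id is needed to fetch it.
        TEXT_STORE.put("msg-1", "The crawler is a classic example of Thread in a poolExecutor.");
        String text = TEXT_STORE.get("msg-1");
        System.out.println(highlight(text, findOffsets(text, "thread"), 10));
    }
}
```

The highlighting step only ever sees a string and a list of offsets, so where the string was loaded from makes no difference to it.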

- Mark

DURGA DEEP wrote:
I have a follow-up question. It seems that if we want to use highlighting, we
should store the content of the entire document that has to be indexed.

         d.add( new Field( FIELD_NAME, "some text", Field.Store.YES, Field.Index.TOKENIZED) );

Are there better ways of achieving this? We have a huge amount of data that
needs to be indexed.

  Thanks Much
_ddt

On 1/29/08, Mark Miller <[EMAIL PROTECTED]> wrote:
Look at the Highlighter in contrib. It creates fragments (context) and
highlights search terms in them (keywords).

If you want to highlight phrases correctly, check out this issue, which
adds support for Spans and PhraseQuery:

https://issues.apache.org/jira/browse/LUCENE-794
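
To illustrate why phrases need special handling, here is a rough sketch in plain Java. It is not the code from that issue and does not use the Lucene API; `highlightTerms` and `highlightPhrase` are hypothetical names, and tokens are simply split on whitespace. The first method marks every word of the phrase wherever it occurs, while the second marks words only where the whole phrase appears at consecutive positions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PhraseHighlightSketch {

    /** Joins tokens with single spaces, wrapping the marked ones in <B> tags. */
    static String render(String[] tokens, boolean[] marked) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) out.append(' ');
            out.append(marked[i] ? "<B>" + tokens[i] + "</B>" : tokens[i]);
        }
        return out.toString();
    }

    /** Term-based: marks any token equal to any word of the phrase, ignoring position. */
    static String highlightTerms(String text, String phrase) {
        List<String> words = Arrays.asList(phrase.toLowerCase(Locale.ROOT).split("\\s+"));
        String[] tokens = text.split("\\s+");
        boolean[] marked = new boolean[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            marked[i] = words.contains(tokens[i].toLowerCase(Locale.ROOT));
        }
        return render(tokens, marked);
    }

    /** Position-aware: marks tokens only where the full phrase occurs consecutively. */
    static String highlightPhrase(String text, String phrase) {
        String[] words = phrase.toLowerCase(Locale.ROOT).split("\\s+");
        String[] tokens = text.split("\\s+");
        boolean[] marked = new boolean[tokens.length];
        for (int i = 0; i + words.length <= tokens.length; i++) {
            boolean match = true;
            for (int j = 0; j < words.length && match; j++) {
                match = tokens[i + j].toLowerCase(Locale.ROOT).equals(words[j]);
            }
            if (match) {
                for (int j = 0; j < words.length; j++) marked[i + j] = true;
            }
        }
        return render(tokens, marked);
    }

    public static void main(String[] args) {
        String text = "thread pool uses one thread per pool worker";
        System.out.println(highlightTerms(text, "thread pool"));
        System.out.println(highlightPhrase(text, "thread pool"));
    }
}
```

With a phrase query, the second behaviour is usually what the user expects, since the stray "thread" and "pool" later in the text did not contribute to the match.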

Mark


DURGA DEEP wrote:
Dear All,

         I've been scouring through the Lucene classes. Are there any
classes which can help me achieve the following?

         1)  We are an e-mail service provider. We want to provide a
search capability for e-mail messages via Lucene. So far we are able to
parse and index the e-mail, create the appropriate indexes, etc.

              Now the customer wants us to have a Google-like search
capability, i.e. when they search for a particular word, the word should
be highlighted, and the surrounding text, i.e. the context in which the
word occurs, should also be shown.

              Example: when searching for the word "thread".

              ...crawler is a classic example of Thread in an poolExecutor code
               an poolExecutor code...

Any help greatly appreciated
+ddt


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



