For search phrases there's no need to "detect the phrases" at indexing time
- the position of each "word" is saved in the index and then used at search
time to match phrase queries. (also see 'query syntax document'.)

Lucene takes plain text as document input - extraction of content text and
properties from (say) an HTML should be done external to Lucene. (also see
'Lucene FAQ'.)

Assigning Store.YES to a field added to a document being indexed would save
the text of that field in the index so that it is later (at search time)
fetchable. (also see javadocs for org.apache.lucene.document.Field.)

Regards,
Doron

Maryam <[EMAIL PROTECTED]> wrote on 14/03/2007 19:25:52:

> Hi,
>
> I am wondering if we can index a phrase (not term) in
> Lucene? Also, I am not usre if it can index HTML
> pages? I need to have access to the text of some of
> tags, I am not sure if this can be done in Lucene. I
> would be so glad if you help me in this case.
>
> Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to