Re: Indexing HTML pages and phrases

2007-03-16 Thread Doron Cohen
To search for phrases, there is no need to detect them at indexing time: the position of each word is saved in the index and then used at search time to match phrase queries (see also the 'query syntax' document). Lucene takes plain text as document input - extraction of content text and
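To illustrate the point about word positions, here is a minimal sketch in plain Python, not Lucene code. It assumes a toy whitespace tokenizer, and the names `build_positional_index` and `phrase_search` are made up for this example. Only single terms and their positions are stored; the phrase is resolved at search time by checking for consecutive positions.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]}; no phrases are stored."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # Toy analysis step (an assumption): lowercase and split on whitespace.
        for position, term in enumerate(text.lower().split()):
            index[term][doc_id].append(position)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of docs where the phrase terms occur consecutively."""
    terms = phrase.lower().split()
    if not terms or any(term not in index for term in terms):
        return []
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for term in terms[1:]:
        candidates &= set(index[term])
    hits = []
    for doc_id in candidates:
        later = [set(index[term][doc_id]) for term in terms[1:]]
        # A hit needs term i+1 exactly i+1 positions after the first term.
        for start in index[terms[0]][doc_id]:
            if all(start + i + 1 in positions for i, positions in enumerate(later)):
                hits.append(doc_id)
                break
    return sorted(hits)


docs = {
    1: "lucene stores the position of each word",
    2: "each position stores a word",
}
index = build_positional_index(docs)
print(phrase_search(index, "position of each"))
print(phrase_search(index, "each position"))
```

Both documents contain "each" and "position", but only document 2 has them at adjacent positions in that order, which is why the second query does not match document 1.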

Indexing HTML pages and phrases

2007-03-14 Thread Maryam
Hi, I am wondering if we can index a phrase (not a term) in Lucene? Also, I am not sure whether it can index HTML pages. I need access to the text of some of the tags, and I am not sure if this can be done in Lucene. I would be very glad if you could help me with this. Thanks

Re: Indexing HTML pages and phrases

2007-03-14 Thread Bhavin Pandya
- Original Message - From: Maryam [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Thursday, March 15, 2007 7:55 AM Subject: Indexing HTML pages and phrases Hi, I am wondering if we can index a phrase (not term) in Lucene? Also, I am not sure if it can index HTML pages? I

Re: Indexing HTML pages and phrases

2007-03-14 Thread Bhavin Pandya
/apache/lucene/demo/html/HTMLParser.html Thanks. Bhavin Pandya - Original Message - From: Maryam [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Thursday, March 15, 2007 7:55 AM Subject: Indexing HTML pages and phrases Hi, I am wondering if we can index a phrase (not term
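
On getting at the text of particular tags before indexing: the sketch below is one possible way to do the extraction step, using Python's standard-library `html.parser`. It is not the Lucene demo parser referenced above, and the names `TagTextExtractor` and `extract_fields` are hypothetical. The idea is to pull the text of each wanted tag into its own plain-text value, which could then be added to a document as a separate field.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect the text found inside a chosen set of tags."""

    def __init__(self, wanted_tags):
        super().__init__()
        self.wanted_tags = set(wanted_tags)
        self.open_tags = []  # stack of wanted tags that are currently open
        self.fields = {tag: [] for tag in self.wanted_tags}

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted_tags:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        # Text is credited to the innermost open wanted tag, if any.
        if text and self.open_tags:
            self.fields[self.open_tags[-1]].append(text)


def extract_fields(html, wanted_tags):
    """Return {tag: plain text} for each wanted tag in the page."""
    parser = TagTextExtractor(wanted_tags)
    parser.feed(html)
    parser.close()
    return {tag: " ".join(chunks) for tag, chunks in parser.fields.items()}


page = (
    "<html><head><title>Lucene FAQ</title></head>"
    "<body><h1>Indexing</h1><p>Plain text only.</p></body></html>"
)
fields = extract_fields(page, ["title", "h1"])
print(fields)
```

Text outside the wanted tags (the paragraph in this sample page) is skipped, so each extracted value can be indexed and searched separately from the rest of the page.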