Hi Maryam,
You can index the content of a specific field as UN_TOKENIZED, so the whole
value is indexed as a single term, and then search for the phrase on that field.
It will match only the whole phrase, not the individual tokens.
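
To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's API: the class UntokenizedSketch and its methods are hypothetical names made up for illustration. It only shows why a value indexed as one single term matches the whole phrase and nothing shorter.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (not Lucene code): a tiny inverted index with two indexing modes.
public class UntokenizedSketch {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Tokenized mode: split the value on whitespace and index every word as its own term.
    public void addTokenized(int docId, String value) {
        for (String token : value.trim().toLowerCase().split("\\s+")) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Untokenized mode: index the whole value, unchanged, as one single term.
    public void addUntokenized(int docId, String value) {
        postings.computeIfAbsent(value, k -> new TreeSet<>()).add(docId);
    }

    // Exact term lookup; returns an empty set when the term was never indexed.
    public Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        UntokenizedSketch index = new UntokenizedSketch();
        index.addUntokenized(1, "information retrieval");
        index.addTokenized(2, "information retrieval");

        System.out.println(index.lookup("information retrieval")); // whole phrase as a term
        System.out.println(index.lookup("information"));           // single word as a term
    }
}
```

Note that in this model the untokenized value must be matched exactly, including case and spacing, because nothing normalizes it at index time.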
To index HTML pages you can use any HTML parser to extract the text first.
This one may be useful to you:
http://lucene.apache.org/java/docs/api/org/apache/lucene/demo/html/HTMLParser.html
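
As a rough sketch of pulling out the text of particular tags, the example below uses the HTML parser that ships with the JDK (javax.swing.text.html) rather than the demo HTMLParser linked above, only because it needs no extra jar. The class name TagTextSketch and the method textOf are hypothetical, invented for this illustration. The strings it returns could then be added to a document as field values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Sketch only: collects the text inside every occurrence of one HTML tag.
public class TagTextSketch {

    public static List<String> textOf(String html, final HTML.Tag wanted) {
        final List<String> result = new ArrayList<String>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private int depth = 0;                          // > 0 while inside the wanted tag
            private final StringBuilder buffer = new StringBuilder();

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag.equals(wanted)) {
                    if (depth == 0) {
                        buffer.setLength(0);                // start a fresh capture
                    }
                    depth++;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (depth > 0) {
                    buffer.append(data);                    // keep text only inside the tag
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag.equals(wanted) && depth > 0) {
                    depth--;
                    if (depth == 0) {
                        result.add(buffer.toString().trim());
                    }
                }
            }
        };

        try {
            // The third argument tells the parser to ignore charset declarations in the page.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Lucene in Action</h1><p>Other text</p></body></html>";
        System.out.println(textOf(page, HTML.Tag.H1));
    }
}
```

Combining this with the first point, the text of a tag that should match only as a whole phrase would go into an UN_TOKENIZED field, and the rest into normally analyzed fields.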
Thanks.
Bhavin pandya
----- Original Message -----
From: "Maryam" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Thursday, March 15, 2007 7:55 AM
Subject: Indexing HTML pages and phrases
Hi,
I am wondering if we can index a phrase (not a term) in
Lucene? Also, I am not sure if it can index HTML
pages? I need to have access to the text of some of
the tags, and I am not sure if this can be done in Lucene. I
would be so glad if you could help me with this.
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]