Hi
I am looking for an HTMLParser which skips text enclosed in
<no-index> tags or something similar. This way I could exclude,
for instance, a "global navigation section" within the HTML:
<no-index>
International<br>
Business<br>
Science<br>
...
</no-index>
It seems that the current demo/HTMLParser
(http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11)
is not capable of doing something like that.
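
To make the idea concrete, here is a minimal sketch of the behaviour I have in mind. It is written in Python with the standard html.parser module purely for illustration, and is not based on the Lucene demo parser. The class NoIndexTextExtractor, the function indexable_text, and the <no-index> tag itself are made-up names, and I am assuming the tags are properly closed.

```python
from html.parser import HTMLParser


class NoIndexTextExtractor(HTMLParser):
    """Collects page text, skipping anything inside <no-index>...</no-index>."""

    SKIP_TAG = "no-index"  # hypothetical marker tag, not part of HTML

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside a skipped region
        self._chunks = []     # text fragments that should be indexed

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == self.SKIP_TAG:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == self.SKIP_TAG and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped region.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def indexable_text(html):
    """Return the text of `html` with all <no-index> sections left out."""
    parser = NoIndexTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```

The string returned by indexable_text would then be handed to the indexer in place of the full page text, so the navigation links never end up in the index.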
Any pointers are very welcome.
Thanks a lot
Michael
