My suggestion would be to modify the demo HTMLParser to do the job. I don't
think it would be very difficult. I'm not aware of any existing HTML parser
that supports this functionality.
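
If changing the parser itself is more than you want to take on, a rough
alternative is to strip the <no-index> sections from the HTML before it
reaches the parser. The following is only a sketch under that assumption: it
is a regex pre-filter, not a change to HTMLParser, and the class name
NoIndexFilter and the method strip() are made up for illustration. It does not
handle nested <no-index> sections and leaves an unclosed <no-index> untouched.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): removes <no-index>...</no-index>
// sections from raw HTML so that a later parsing step never sees them.
public class NoIndexFilter {

    // Matches an opening <no-index> tag (with optional attributes), the
    // shortest possible run of content, and the closing </no-index> tag.
    // DOTALL lets the content span line breaks; matching is case-insensitive.
    private static final Pattern NO_INDEX = Pattern.compile(
        "<no-index\\b[^>]*>.*?</no-index\\s*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces each excluded section with a single space so that the text
    // on either side of it does not run together.
    public static String strip(String html) {
        return NO_INDEX.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        String html = "<p>Intro</p>"
            + "<no-index> International<br> Business<br> Science<br>\n</no-index>"
            + "<p>Article text</p>";
        System.out.println(strip(html));
    }
}
```

The filtered string can then be handed to whatever parser you already use, for
instance wrapped in a java.io.StringReader.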


Regards,
Kelvin

--------
The book giving manifesto     - http://how.to/sharethisbook


On Thu, 30 Jan 2003 10:56:50 +0100, Michael Wechner said:
>Hi
>
>I am looking for an HTMLParser which skips text tagged by
>
><no-index>  or something similar. This way I could exclude for
>instance a "global navigation section" within the HTML
>
><no-index> International<br> Business<br> Science<br> ...
></no-index>
>
>It seems that the current demo/HTMLParser
>(http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11)
>is not capable of doing something like that.
>
>Any pointers are very welcome.
>
>Thanks a lot
>
>Michael
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]



