Kelvin Tan wrote:
My suggestion would be to modify HTMLParser to do the job. I don't think it would be very difficult. I'm unaware of any existing HTML parsers that support that functionality... Maybe Erik wants to include an "improved" version of my code snippet in CVS.
I guess I am not the only one wanting to exclude certain parts from an HTML page ;-)
All the best
Michael
Regards,
Kelvin
--------
The book giving manifesto - http://how.to/sharethisbook
On Thu, 30 Jan 2003 10:56:50 +0100, Michael Wechner said:
Hi
I am looking for an HTMLParser which skips text tagged by
<no-index> or something similar. This way I could exclude, for
instance, a "global navigation section" within the HTML:
<no-index> International<br> Business<br> Science<br> ...
</no-index>
It seems that the current demo/HTMLParser
(http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11)
is not capable of doing something like that.
Any pointers are very welcome.
Thanks a lot
Michael
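
The filtering asked about above could look roughly like the sketch below. It is not Lucene's demo HTMLParser (a Java class) and not Kelvin's code snippet: it uses Python's standard-library html.parser purely to illustrate the idea of tracking a "skip region" while extracting text. The <no-index> tag name comes from the question; the names NoIndexTextExtractor and extract_indexable_text are made up for this example.

```python
from html.parser import HTMLParser


class NoIndexTextExtractor(HTMLParser):
    """Collects page text, skipping anything inside <no-index>...</no-index>.

    Hypothetical helper for illustration; not part of Lucene.
    """

    SKIP_TAG = "no-index"  # tag name taken from the question; an assumption

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside a skipped region
        self._chunks = []     # text fragments found outside skipped regions

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag == self.SKIP_TAG:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == self.SKIP_TAG and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when no <no-index> element is currently open.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join fragments with spaces and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def extract_indexable_text(html):
    """Return the text of `html` with all <no-index> regions removed."""
    parser = NoIndexTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<html><body>"
        "<no-index>International<br>Business<br>Science</no-index>"
        "<p>Real content</p>"
        "</body></html>"
    )
    print(extract_indexable_text(page))  # prints "Real content"
```

The depth counter rather than a simple flag means nested <no-index> elements are handled, but an unclosed <no-index> would suppress everything after it. The text returned here is what would then be handed to the indexer in place of the full page text.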
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
