Re: or

Michael Wechner Thu, 30 Jan 2003 16:08:39 -0800

Kelvin Tan wrote:

My suggestion would be to modify HTMLParser to do the job. Don't think it's very difficult. I'm unaware of any existing HTML Parsers which support that functionality...

Maybe Erik wants to include an "improved" version of my code snippet into CVS.

I guess I am not the only one wanting to exclude certain parts from an HTML page ;-)

All the best

Michael

Regards,
Kelvin

--------
The book giving manifesto - http://how.to/sharethisbook

On Thu, 30 Jan 2003 10:56:50 +0100, Michael Wechner said:

Hi

I am looking for an HTMLParser which skips text tagged by

<no-index> or something similar. This way I could exclude for
instance a "global navigation section" within the HTML

<no-index> International<br> Business<br> Science<br> ...
</no-index>

It seems that the current demo/HTMLParser
(http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11)
is not capable of doing something like that.
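
To make the idea concrete, here is a rough sketch of the behaviour being asked for. It is not the Lucene demo HTMLParser and not the code snippet mentioned elsewhere in this thread; it uses Python's standard html.parser module purely for illustration. The class name NoIndexTextExtractor and the helper extract_indexable_text are made up for this example, and the tag name "no-index" is simply the one used in the example above.

```python
from html.parser import HTMLParser


class NoIndexTextExtractor(HTMLParser):
    """Collect page text, skipping anything inside <no-index>...</no-index>.

    Hypothetical helper for illustration only; not part of Lucene.
    """

    SKIP_TAG = "no-index"  # assumed tag name, taken from the example in this thread

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside a skipped region
        self._chunks = []     # text fragments that should be indexed

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag == self.SKIP_TAG:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == self.SKIP_TAG and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every <no-index> region.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace so the result is easy to hand to an indexer.
        return " ".join("".join(self._chunks).split())


def extract_indexable_text(html):
    """Return the text of an HTML string with <no-index> sections removed."""
    parser = NoIndexTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The depth counter is there so that nested <no-index> sections are handled sensibly; a plain boolean flag would switch indexing back on at the first closing tag. The same approach (a counter checked wherever text is emitted) should carry over to a modified Java parser, though the details there would depend on how that parser reports tags.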

Any pointers are very welcome.

Thanks a lot

Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
