Hello:

I have just joined this list, and I am also a beginning Java programmer. I
apologize if this is not a suitable question for this list. I need to write a
filter for HTML pages. My goal is to read an HTML page, throw away all the HTML
markup, and keep just a block of text that occurs near the bottom of the page.
The HTML tags are liable to be unbalanced: for example, there will be a <P> but
no </P>. I found a sample program that used the SAXParser, but the SAXParser
doesn't seem to handle unbalanced tags. Ideas and comments would be appreciated.

Thank you.

Regards,
Robert Jaquiss
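
For illustration only, here is a minimal sketch of one possible approach to the
problem described above. It assumes the lenient HTML parser that ships with the
JDK (javax.swing.text.html.parser.ParserDelegator) is acceptable in place of a
SAX parser; unlike an XML parser, it does not require closing tags. The class
name HtmlTextFilter and the method extractText are made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of an HTML page and drops the tags.
class HtmlTextFilter {

    /** Returns the text runs found in the HTML, one per line. */
    static String extractText(Reader html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The parser calls handleText for each run of character data;
        // tag callbacks are left at their default (do nothing).
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append('\n');
                }
                text.append(data);
            }
        };

        // Third argument true: ignore any charset declared inside the page
        // instead of throwing ChangedCharSetException.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Sample input with <P> tags that are never closed.
        String page = "<html><body><P>First paragraph<P>Second paragraph</body></html>";
        System.out.println(extractText(new StringReader(page)));
    }
}
```

Because the result has one text run per line, the block near the bottom of the
page could then be picked out by splitting the string on newlines and keeping
the last few entries.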
- Re: Looking for tools/ideas for filtering HTML Jaquiss, Robert
- Re: Looking for tools/ideas for filtering HTML Davanum Srinivas
- RE: Looking for tools/ideas for filtering HTML Neeme Praks