Use JTidy - http://sourceforge.net/projects/jtidy/
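
A rough sketch of how the filter might look, not a definitive implementation. JTidy repairs the unbalanced tags and its parseDOM call hands back an ordinary org.w3c.dom.Document, so dropping the markup becomes a plain DOM walk that keeps only the text nodes. The class name HtmlTextFilter, the helper names, and the file name page.html are made up for illustration. The JTidy step is shown in a comment because it needs the JTidy jar on the classpath; the stand-in below it uses already-balanced markup so the sketch runs on a plain JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class: collects the text of a DOM tree and drops the tags.
class HtmlTextFilter {

    // Recursively append every non-blank text node under `node` to `out`, one per line.
    static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue();
            if (text != null && !text.trim().isEmpty()) {
                out.append(text.trim()).append('\n');
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    // Return all text in the document, in document order.
    static String textOf(Document doc) {
        StringBuilder out = new StringBuilder();
        collectText(doc.getDocumentElement(), out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // With the JTidy jar on the classpath, the Document would come from JTidy:
        //   org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
        //   tidy.setQuiet(true);            // suppress the summary output
        //   tidy.setShowWarnings(false);    // suppress per-tag warnings
        //   Document doc = tidy.parseDOM(new java.io.FileInputStream("page.html"), null);
        // Stand-in so this runs without JTidy: balanced markup parsed as XML.
        String balanced = "<html><body><p>menu</p><p>the block to keep</p></body></html>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(balanced.getBytes(StandardCharsets.UTF_8)));
        System.out.print(textOf(doc));
    }
}
```

Since the block you want sits near the bottom of the page, the last line or lines of the returned string are probably the ones to keep; how many depends on your pages.
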

Thanks,
dims

--- "Jaquiss, Robert" <[EMAIL PROTECTED]> wrote:
> Hello:
>
> I have just joined this list, and am also a beginning Java
> programmer. I apologize if this is not a suitable question for this
> list. I need to write a filter for HTML pages. My goal is to read an
> HTML page, throwing away all the HTML code and just keeping a block of
> text that occurs near the bottom of the page. The HTML tags are liable
> to be unbalanced. There will be a <P> but no </P>. I found a sample
> program that used the SAXparser, but the SAXparser doesn't seem to handle
> unbalanced tags. Ideas/comments would be appreciated. Thank you.
>
> Regards
> Robert Jaquiss

=====
Davanum Srinivas - http://jguru.com/dims/