Mushfiqur Rahman wrote:
I want to parse an HTML document (which may not be an XHTML document) using
org.apache.xerces.parsers.DOMParser and get an org.w3c.dom.Document after
parsing. Can anyone tell me how I can do it?
If you just need a DOM document, there are a
few options. Check out JTidy[1] and NekoHT
Hello sir,
I want to parse an HTML document (which may not be an XHTML document) using org.apache.xerces.parsers.DOMParser and get an org.w3c.dom.Document after parsing. Can anyone tell me how I can do it?
Note: The HTML page may have tags like: "", with no ending tag (that means it may be nonconforming XML)
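
To illustrate why the missing end tags matter, here is a minimal sketch of what a strict XML parser does with such input. It is only an illustration under some assumptions: it uses the JDK's built-in JAXP DocumentBuilder as a stand-in for org.apache.xerces.parsers.DOMParser, the class name StrictParseDemo and the helper parseStrict are made up for this example, and the unclosed <br> is just a sample tag, not necessarily the one in the page above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: shows a strict XML parser rejecting an unclosed tag.
public class StrictParseDemo {

    /**
     * Returns the DOM if the markup is well-formed XML,
     * or null if the strict parser rejects it.
     */
    public static Document parseStrict(String markup) throws Exception {
        // Assumption: the JDK's default JAXP parser stands in for Xerces' DOMParser.
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Replace the default handler so errors are thrown, not printed to stderr.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (SAXException e) {
            // Not well-formed XML, e.g. a start tag without a matching end tag.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample inputs (assumed for illustration): one XHTML-style, one HTML-style.
        String closed = "<html><body><p>one<br/>two</p></body></html>";
        String unclosed = "<html><body><p>one<br>two</p></body></html>";

        System.out.println("closed tag parses:   " + (parseStrict(closed) != null));
        System.out.println("unclosed tag parses: " + (parseStrict(unclosed) != null));
    }
}
```

The second input fails because an XML parser requires every start tag to have a matching end tag, which is why an HTML-tolerant front end is usually needed before a DOM can be built from such a page.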