In my Java code, I am trying to read a web page from a URL. The URL points to a Linux directory containing some XML files. In short, I am trying to read the directory listing via the URL for further processing.
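I use the following code:

    import java.net.URL;
    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.io.SAXReader;

    public Document parse(URL url) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(url);
        return document;
    }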
The URL is initialized to http://storm.ifs.tuwien.ac.at:8080/msgstore/
I get the error "org.dom4j.DocumentException:Error on line 11 of document http://storm.ifs.tuwien.ac.at:8080/msgstore/: The entity "nbsp" was referenced, but not declared. The entity "nbsp" was referenced, but not declared."
Of course, "nbsp" is used in the page source, but the page is generated automatically by Apache Tomcat when the URL is accessed.
How can I handle this problem? Should I use some other parser?
- I can also read the same page successfully, line by line, with java.io.BufferedReader. But when I then try to parse it further with dom4j, the root element is inaccessible (it is null).
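A minimal sketch of one possible workaround, assuming the page is otherwise well-formed XML (a Tomcat directory listing is HTML, so this may not hold): once the page text has been read into a string, replace the undeclared "&nbsp;" entity with the numeric character reference "&#160;", which needs no DTD declaration, and only then hand the text to the parser. The class and method names (NbspWorkaround, replaceNbsp, parse) are made up for illustration, and the sketch uses the JDK's built-in parser so that it runs without extra libraries. With dom4j, the cleaned string could instead be passed to SAXReader.read(Reader) via a StringReader.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of dom4j or Tomcat.
class NbspWorkaround {

    // Replace the HTML-only entity with a numeric character reference,
    // which any XML parser accepts without a DTD.
    static String replaceNbsp(String markup) {
        return markup.replace("&nbsp;", "&#160;");
    }

    // Parse the cleaned markup with the JDK's DOM parser.
    // Assumes the rest of the markup is well-formed XML.
    static Document parse(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String cleaned = replaceNbsp(markup);
        return builder.parse(new InputSource(new StringReader(cleaned)));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the page text read via BufferedReader.
        String page = "<html><body>a&nbsp;b</body></html>";
        Document doc = parse(page);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

This only addresses the "nbsp" entity. Other HTML entities or unclosed tags in the generated page would still make an XML parser fail.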
Thanks for your help.
Regards,
Shuaib