Hi :-)
I currently have quite a big problem with minidom and special characters written as entities (for example &uuml;, i.e. ü) in HTML.


Let's say I have the following input file (test2.html):
--------------------------------------------------
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
<html>
<body>
&uuml;
</body>
</html>
--------------------------------------------------

And the following Python script:
--------------------------------------------------
from xml.dom import minidom

if __name__ == '__main__':
    # Parse the input file and write the serialized DOM back out unchanged.
    doc = minidom.parse('test2.html')
    with open('test3.html', 'w') as f:
        f.write(doc.toxml())
--------------------------------------------------

test3.html only has a blank line where the &uuml; should be. The entity is simply removed.
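
For anyone who wants to reproduce this without the two files, here is a minimal sketch (Python 3 syntax assumed; SOURCE and body_text are names I made up for this example). It parses the same document from a string and, for comparison, a copy in which &uuml; is replaced by the numeric reference &#252;:

```python
from xml.dom import minidom

# Same content as test2.html, kept in a string so no files are needed.
SOURCE = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
<html>
<body>
&uuml;
</body>
</html>
"""


def body_text(source):
    """Parse source with minidom and return the text found directly in <body>."""
    doc = minidom.parseString(source)
    body = doc.getElementsByTagName('body')[0]
    return ''.join(node.data for node in body.childNodes
                   if node.nodeType == node.TEXT_NODE)


if __name__ == '__main__':
    # Named entity, exactly as in my input file.
    print(repr(body_text(SOURCE)))
    # Numeric character reference, for comparison.
    print(repr(body_text(SOURCE.replace('&uuml;', '&#252;'))))
```

With the named entity I only get whitespace back from body_text, while the numeric reference comes through as ü.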

Any idea how I could solve this problem?

Kind regards, Horst