I'm trying to parse and modify an XML document using the xml.dom.minidom module with Python 2.4.2:
>>> from xml.dom import minidom
>>> dom = minidom.parse("c:/test.txt")
If the XML file contains a non-ASCII character, I get a parse error.
I have the following line in my XML file:
<target>Exception beim Löschen des Audit-Moduls aufgetreten. Exception Stack lautet: %1.</target>
ExpatError: not well-formed (invalid token): line 8, column 27
If I remove the ö character, it works fine. I'm guessing this has to do with the default encoding, which is ASCII. I guess I can change the encoding by modifying a file on my machine that the interpreter reads at startup, but then how do I get my program to work on different machines?
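A possible explanation, offered as a sketch under assumptions: expat treats a document with no encoding declaration as UTF-8, so a file saved as Latin-1 (where ö is the single byte 0xF6) would fail to parse whatever Python's default encoding is. The snippet below assumes c:/test.txt was saved that way, which is not confirmed here; `parses_ok` is a hypothetical helper used only for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Assumption: the file on disk is Latin-1 encoded, so the o-umlaut is the byte 0xF6.
body = u"<target>Exception beim L\xf6schen</target>".encode("latin-1")

def parses_ok(data):
    """Return True if minidom can parse the given byte string."""
    try:
        minidom.parseString(data).unlink()
        return True
    except ExpatError:
        return False

# No declaration: expat assumes UTF-8, and a lone 0xF6 byte is not valid UTF-8.
without_decl = parses_ok(body)

# With a declaration naming the actual encoding, the same bytes parse.
decl = u'<?xml version="1.0" encoding="iso-8859-1"?>'.encode("ascii")
with_decl = parses_ok(decl + body)
```

If this is the cause, adding the matching encoding declaration to the XML file (or saving the file as UTF-8) would travel with the file to other machines, with no change to the interpreter's configuration.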
Also, when I write such a character to a file, I get an error:
>>> document.writexml(file(myFile, "w"), encoding='utf-8')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 16: ordinal not in range(128)
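One workaround that might be worth trying, as a sketch rather than a confirmed fix: serialize with `toxml("utf-8")`, which returns an already-encoded byte string, and write that to a file opened in binary mode so the ASCII default codec is never involved. `save_utf8` and the temporary output path are hypothetical names chosen for this illustration.

```python
import os
import tempfile
from xml.dom import minidom

def save_utf8(document, path):
    """Write a minidom document to path as UTF-8 encoded bytes."""
    data = document.toxml("utf-8")  # encoded byte string, not unicode text
    f = open(path, "wb")            # binary mode: bytes are written unchanged
    try:
        f.write(data)
    finally:
        f.close()

# Build a small document containing the problematic character.
document = minidom.parseString(u"<target>L\xf6schen</target>".encode("utf-8"))

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.gettempdir(), "minidom_utf8_demo.xml")
save_utf8(document, path)

# Reading the file back should give the same text.
reparsed = minidom.parse(path)
text = reparsed.documentElement.firstChild.data
```

An alternative with the same intent would be to pass `writexml` a file object from `codecs.open(myFile, "w", "utf-8")`, which encodes unicode text as it is written.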
Any help would be appreciated.
--
Regards,
Abhimanyu