minidom and unicode errors

Abhimanyu Seth Mon, 06 Mar 2006 21:41:42 -0800

Hi all,

I'm trying to parse and modify an XML document using the xml.dom.minidom module and Python 2.4.2.

>> from xml.dom import minidom
>> dom = minidom.parse ("c:/test.txt")

If the XML file contains a non-ASCII character, I get a parse error.
I have the following line in my XML file:
<target>Exception beim Löschen des Audit-Moduls aufgetreten. Exception Stack lautet: %1.</target>
ExpatError: not well-formed (invalid token): line 8, column 27

If I remove the ö character, it works fine. I'm guessing this has to do with the default encoding, which is ASCII. I guess I can change the encoding by modifying a file on my machine that the interpreter reads while loading, but then how do I get my program to work on different machines?
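
The parse error can be reproduced without the file at all. The sketch below is only a guess at the cause: it assumes the file is saved as Latin-1 (ISO-8859-1) and carries no encoding declaration, which nothing above confirms. The sample string and the helper name try_parse are made up for illustration. Under that assumption expat reads the bytes as UTF-8, rejects the lone 0xF6 byte, and accepts the same bytes once the encoding is declared in the XML prolog.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical stand-in for the file contents: Latin-1 bytes with no
# encoding declaration (an assumption about how the file was saved).
text = u"<target>Exception beim L\xf6schen</target>"
undeclared = text.encode("iso-8859-1")

# The same bytes, preceded by a prolog that names the encoding.
prolog = u'<?xml version="1.0" encoding="iso-8859-1"?>'.encode("iso-8859-1")
declared = prolog + undeclared


def try_parse(data):
    """Return the text of the root element, or None if expat rejects the input."""
    try:
        dom = minidom.parseString(data)
    except ExpatError:
        return None
    return dom.documentElement.firstChild.data


result_undeclared = try_parse(undeclared)  # rejected by expat
result_declared = try_parse(declared)      # parsed, text decoded to unicode
```

If that is what is happening, the fix would live in the file (declare its encoding, or save it as UTF-8) rather than in the interpreter's default encoding, so it would travel with the document to other machines.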

Also, when writing such a special character to the file, I get an error.
>> document.writexml (file (myFile, "w"), encoding='utf-8')

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 16: ordinal not in range(128)
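
One possible way around the write error, as a sketch only: ask minidom for already-encoded bytes with toxml(encoding=...) and write them to a file opened in binary mode, so no implicit ASCII conversion takes place on the way out. The sample document and the output path are hypothetical, and this assumes toxml accepts the encoding argument on the interpreter in use.

```python
import os
import tempfile
from xml.dom import minidom

# Build a small document containing the problematic character.
document = minidom.parseString(u"<target>L\xf6schen</target>".encode("utf-8"))

# toxml(encoding=...) returns the serialized document as encoded bytes.
data = document.toxml(encoding="utf-8")

# Hypothetical output path; binary mode avoids any implicit ASCII step.
out_path = os.path.join(tempfile.mkdtemp(), "out.xml")
out_file = open(out_path, "wb")
try:
    out_file.write(data)
finally:
    out_file.close()

# Read the file back to check that the character survived the round trip.
round_trip = minidom.parse(out_path).documentElement.firstChild.data
```

The written file starts with a prolog naming utf-8, so it can be parsed again without guessing at the encoding.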

Any help would be appreciated.

--
Regards,
Abhimanyu

-- 
http://mail.python.org/mailman/listinfo/python-list