I forgot to add that I'm using ElementTree to process the XML files and don't
(usually) have any problems with that. Also, the workaround that works is to
encode each ElementTree output, i.e.:
thisxmlline = thisxmlline.encode('utf8')
But this seems odd to me: isn't it already being processed as UTF-8?
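For what it's worth, here is a minimal sketch of the workaround. It assumes
ElementTree hands back decoded text rather than raw UTF-8 bytes, which would
explain why the explicit encode is needed. The sample document and the `line`
tag are made up for illustration; only `thisxmlline` comes from my real code.

```python
import xml.etree.ElementTree as ET

# Hypothetical document standing in for one of the real XML files.
xml_bytes = (b'<?xml version="1.0" encoding="UTF-8"?>'
             b'<doc><line>\xe2\x80\x9cquoted\xe2\x80\x9d</line></doc>')

# The parser reads the UTF-8 bytes and returns decoded text.
root = ET.fromstring(xml_bytes)
thisxmlline = root.find('line').text

# Encoding explicitly turns the text back into UTF-8 bytes for output.
encoded = thisxmlline.encode('utf8')
print(repr(encoded))
```

If that assumption holds, the encoding declared in the XML header only
describes the input bytes and says nothing about how the output is written.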
Dinesh
From: Dinesh B Vadhia
Sent: Thursday, June 04, 2009 6:47 AM
To: [email protected]
Subject: unicode, utf-8 problem again
Hi! I'm processing a large number of XML files that are all declared as UTF-8
encoded in the header, i.e.:
<?xml version="1.0" encoding="UTF-8"?>
My Python environment has been set for 'utf-8' through site.py. Additionally,
the top of each program/module has the declaration:
# -*- coding: utf-8 -*-
But, I still get this error:
Traceback (most recent call last):
...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position
76: ordinal not in range(128)
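The error can be reproduced in a few lines. This is only a sketch, assuming
the failure comes from text containing u'\u201c' being encoded with the ASCII
codec somewhere along the way; the sample string and the `try_encode` helper
are hypothetical and not part of my programs.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; the real data comes from the XML files.
text = u'He said \u201chello\u201d'


def try_encode(value, codec):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return value.encode(codec)
    except UnicodeEncodeError as exc:
        return str(exc)


# U+201C (left double quotation mark) has no ASCII representation.
ascii_result = try_encode(text, 'ascii')
# UTF-8 can represent it, so this call returns bytes.
utf8_result = try_encode(text, 'utf-8')

print(ascii_result)
print(repr(utf8_result))
```

The first call reports the same kind of 'ascii' codec failure as the traceback
above, while the second succeeds.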
What am I missing?
Dinesh