New submission from Sharmila Sivakumar:

I am trying to load the data in the attached testdata.txt file into a DOM.

I tried:
import xml.dom.minidom as dom
data = open('testdata.txt','r').read()
mydom = dom.parseString(data)
When I inspect the result, I get the following error:

>>> mydom.firstChild.childNodes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2022' in position 18: ordinal not in range(128)


So I tried decoding the data before parsing it, but that failed as well:

>>> mydom2 = dom.parseString(data.decode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py", line 1925, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib/python2.5/site-packages/_xmlplus/dom/expatbuilder.py", line 942, in parseString
    return builder.parseString(string)
  File "/usr/lib/python2.5/site-packages/_xmlplus/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u014d' in position 173: ordinal not in range(128)
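
For comparison, here is a minimal sketch of a pattern that should avoid both errors. It rests on two assumptions: that the first traceback comes from the interactive prompt trying to display the node list (the repr of a text node containing U+2022) rather than from parsing, and that the second comes from the parser implicitly encoding the decoded unicode object as ASCII. The inline string is a hypothetical stand-in for testdata.txt, and the names text and xml_bytes are illustrative only.

```python
import xml.dom.minidom as dom

# Hypothetical stand-in for testdata.txt: a small UTF-8 encoded byte string
# containing the same two non-ASCII characters (U+014D and U+2022).
data = u'<xml><i>drak\u014dn</i> \u2022 <b>dragonish</b></xml>'.encode('utf-8')

# Hand the parser the encoded bytes, not a decoded unicode object.
mydom = dom.parseString(data)
root = mydom.firstChild

# Read text through .data instead of echoing node objects at the prompt.
text = root.firstChild.firstChild.data

# Encode explicitly before writing the markup to a terminal or a file.
xml_bytes = root.toxml().encode('utf-8')
```

If this holds, the document from the first attempt was already parsed correctly and only its display failed, so reading node data and encoding it explicitly may be enough.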


I am willing to fix this myself if I am given permission.

----------
components: Interpreter Core, Unicode, XML
files: testdata.txt
messages: 56511
nosy: sharmila
severity: normal
status: open
title: xml.dom.minidom not able to handle utf-8 data
type: compile error
versions: Python 2.5
Added file: http://bugs.python.org/file8558/testdata.txt

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1290>
__________________________________
<xml><i><c><co>noun</co></c></i>
 <co><b>Etymology:</b> Middle English, from Anglo-French <i>dragun,</i> from 
Latin <i>dracon-, draco</i> serpent, dragon, from Greek <i>drakōn</i> serpent; 
akin to Old English <i>torht</i> bright, Greek <i>derkesthai</i> to see, look 
at</co>
 <co><b>Date:</b> 13th century</co>
 <c><b>1.</b></c> <i><c>archaic</c></i> <dtrn> a huge serpent</dtrn>
 <b><c>2.</c></b> <dtrn> a mythical animal usually represented as a monstrous 
winged and scaly serpent or saurian with a crested head and enormous 
claws</dtrn>
 <b><c>3.</c></b> <dtrn> a violent, combative, or very strict person</dtrn>
 <b><c>4.</c></b> <i><c>capitalized</c></i> <dtrn> <kref>Draco</kref></dtrn>
 <b><c>5.</c></b> <dtrn> something or someone formidable or baneful</dtrn>
 • <b>dragonish</b> <i><c><co>adjective</co></c></i></xml>
_______________________________________________
Python-bugs-list mailing list 