Yo,

We are using TAL for things other than ZPT, but are having problems with files that include a BOM preamble.

The problem is that although the underlying XML parser is capable of parsing these kinds of files, TALParser initialises its parent without an encoding (XMLParser.__init__(self) in TALParser.py, line 27).
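
To back up the claim that the parser itself copes with a BOM, here is a minimal sketch using the expat module from the standard library. It assumes that XMLParser sits on top of expat, and it feeds the parser raw bytes rather than a decoded unicode string; the sample document is the one from the attachment, rebuilt inline for illustration.

```python
import codecs
from xml.parsers import expat

# Raw UTF-8 bytes with a BOM preamble, as they would sit on disk.
body = (u'<?xml version="1.0" encoding="UTF-8" ?>\n'
        u'<fails>\nArch\u00e4ologe Pal\u00e4o-Anthropologie\n</fails>\n')
raw = codecs.BOM_UTF8 + body.encode('utf-8')

chunks = []
parser = expat.ParserCreate()
# Collect character data; expat may deliver it in several pieces.
parser.CharacterDataHandler = chunks.append
parser.Parse(raw, True)

text = u''.join(chunks)
```

Fed this way, the BOM never has to be encoded as ASCII, because no unicode-to-bytes conversion takes place before parsing.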

Anyway,
I have attached a small example (test.py + test.xml) that illustrates the problem with Zope 2.7.1.


Running the test gives:

UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)
which is perfectly logical: u'\ufeff' (the decoded BOM preamble) is not ASCII.
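
For reference, a small sketch of how the BOM behaves around decoding. The helper name strip_bom is made up for illustration, and the 'utf-8-sig' codec only exists in newer Pythons (2.5 and later), so it may not be available under the Python that ships with Zope 2.7.1.

```python
import codecs

def strip_bom(text):
    """Drop one leading U+FEFF from already-decoded text (hypothetical helper)."""
    if text.startswith(u'\ufeff'):
        return text[1:]
    return text

raw = codecs.BOM_UTF8 + b'<fails/>'

# On disk the UTF-8 BOM is three bytes; after decoding it is one character.
bom_bytes = len(codecs.BOM_UTF8)
decoded = raw.decode('utf-8')
cleaned = strip_bom(decoded)

# Newer Pythons can drop the BOM while decoding.
cleaned_sig = raw.decode('utf-8-sig')
```

So once the data has been decoded, the preamble is a single character, not three or four.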


Chipping away the preamble (data = data[4:]) gives problems further on in the file, as the test example has some German characters (ä):

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 50: ordinal not in range(128)
which is also perfectly logical: ä has code 228 (0xE4), which is outside the ASCII range.
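
The same failure can be shown in isolation, without TAL. This is only a sketch of the encoding step; it assumes that somewhere on the way to the parser the unicode string gets encoded with the default 'ascii' codec.

```python
text = u'Arch\u00e4ologe'
code_point = ord(u'\xe4')

# Encoding to ASCII fails on the first non-ASCII character.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 works and gives two bytes for the umlaut.
utf8_bytes = text.encode('utf-8')
```

Encoding explicitly to UTF-8 before handing the data over would avoid the implicit ASCII step.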


My question is simply: why is TALParser not taking the encoding into account? Is this deliberate, or is it an oversight?



Romain Slootmaekers.



<?xml version="1.0" encoding="UTF-8" ?> 
<fails>
Archäologe Paläo-Anthropologie
</fails>
#
#
#
from xml.dom.minidom import parseString

import sys
from TAL.TALParser import TALParser
from TAL.TALInterpreter import TALInterpreter
from TAL.DummyEngine import DummyEngine
import StringIO

import codecs



print sys.getdefaultencoding()

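def readData():
    # Read test.xml through a UTF-8 codec reader.
    f = open('test.xml','r')
    
    readerClass = codecs.getreader('utf8')
    print readerClass
    reader = readerClass(f)
    # data is a unicode string; a leading BOM shows up as u'\ufeff'.
    data = reader.read()
    f.close()
    print "size = %s" % len(data)
    return data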


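def expand(xml):
    
    parser = TALParser()
    # Attempted workaround from the message: chip away the BOM preamble.
    xml = xml[4:]
    # Compile the TAL source into a program and its macros.
    parser.parseString(xml)
    program, macros = parser.getCode()
    # Run the program with the dummy engine, collecting output in memory.
    engine = DummyEngine(0)
    out = StringIO.StringIO()
    interpreter = TALInterpreter(program,macros,engine,stream=out)
    interpreter()
    result = out.getvalue()
    
    return result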

data = readData()
expanded = expand(data)
document = parseString(expanded)

print "ok"
_______________________________________________
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
