Fredrik Lundh wrote:
> Steven Bethard wrote:
>
>> Hmm... I downloaded the newest cElementTree (and I already had the
>> newest ElementTree), and here's what I get:
>
>> >>> tree = myparser(filename, 'gbk')
>> Traceback (most recent call last):
>>   File "", line 1, in ?
>>   File "", line 8,
Steven Bethard wrote:
> Hmm... I downloaded the newest cElementTree (and I already had the
> newest ElementTree), and here's what I get:
> >>> tree = myparser(filename, 'gbk')
> Traceback (most recent call last):
>   File "", line 1, in ?
>   File "", line 8, in myparser
> SyntaxError: not we
Fredrik Lundh wrote:
> Steven Bethard wrote:
>
>> I'm having trouble using elementtree with an XML file that has some
>> gbk-encoded text. (I can't read Chinese, so I'm taking their word for
>> it that it's gbk-encoded.) I always have trouble with encodings, so I'm
>> sure I'm just screwing some
Diez B. Roggisch wrote:
> Interestingly enough, that need not be the case. A document can very well
> be well-formed without a header. The constraints for well-formedness are
> scattered throughout the spec, so I'm not sure what they say about the
> encoding used in the absence of a header.
if ther
> no, the parser must not choke on a file for which the encoding has been
> overridden.
>
> for example, the HTTP standard allows the transport layer to recode text/*
> resources as long as it updates the charset properly, so if you e.g. send
> an XML document as text/xml and charset=iso-8859-
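A minimal sketch of the envelope-override idea described above, assuming Python 3 and the standard library only. The helper names `charset_from_content_type` and `parse_with_envelope` are made up for illustration and are not part of ElementTree; the point is just that the charset from the transport header, not anything inside the file, decides how the bytes are decoded before parsing.

```python
from email.message import Message
import xml.etree.ElementTree as ET


def charset_from_content_type(content_type, default="utf-8"):
    """Pull the charset parameter out of a Content-Type header value.

    Hypothetical helper: falls back to `default` when no charset is given.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def parse_with_envelope(body, content_type):
    """Decode the body using the envelope's charset, then parse the text."""
    charset = charset_from_content_type(content_type)
    text = body.decode(charset)  # the envelope wins over the file contents
    return ET.fromstring(text)


# A GBK-encoded body with no XML declaration at all.
body = "<doc>中文</doc>".encode("gbk")
root = parse_with_envelope(body, "text/xml; charset=gbk")
```

Decoding to text first keeps the parser out of the encoding question entirely, which may be the simplest route when the transport layer is trusted to report the charset correctly.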
Diez B. Roggisch wrote:
>> good advice, but note that an envelope (e.g. an HTTP request or response
>> body) may override the encoding in the XML file itself. if this arrives
>> in a MIME message with the proper charset information, it's perfectly okay
>> to leave out the encoding from the file.
>
> pyexpat has only limited support for non-standard encodings; the core
> expat library only supports UTF-8, UTF-16, US-ASCII, and ISO-8859-1,
> and the Python glue layer then adds support for all byte-to-byte
> encodings supported by Python on top of that.
Interesting.
Maybe 4suite is more compl
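Given the limitation quoted above (the core expat library handles UTF-8, UTF-16, US-ASCII and ISO-8859-1, plus byte-to-byte encodings via the Python layer), one workaround for a multi-byte encoding such as gbk is to transcode to UTF-8 before the parser sees the data. This is a rough sketch assuming Python 3 and the standard-library `xml.etree.ElementTree`; `parse_gbk_xml` is a hypothetical name, and the sample document deliberately carries no XML declaration.

```python
import xml.etree.ElementTree as ET


def parse_gbk_xml(raw, encoding="gbk"):
    """Transcode raw bytes to UTF-8 and parse them (hypothetical helper)."""
    text = raw.decode(encoding)      # bytes in the source encoding -> text
    utf8 = text.encode("utf-8")      # text -> an encoding expat supports
    # Tell the parser explicitly that the input is UTF-8.
    parser = ET.XMLParser(encoding="utf-8")
    return ET.fromstring(utf8, parser=parser)


raw = "<item><name>北京</name></item>".encode("gbk")
tree = parse_gbk_xml(raw)
name = tree.find("name").text
```

This only helps if the stated encoding is actually right; if the bytes are not valid gbk, the `decode` call raises `UnicodeDecodeError` before any XML parsing happens.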
Hi,
> good advice, but note that an envelope (e.g. an HTTP request or response
> body) may override the encoding in the XML file itself. if this arrives
> in a MIME message with the proper charset information, it's perfectly okay
> to leave out the encoding from the file.
It might be practical - s
Diez B. Roggisch wrote:
> 2) your xml is _not_ well-formed, as it doesn't contain an xml-header!
> You need to ask these guys to deliver the xml with a header. Of course for
> now it is ok to just prepend the text with something like <?xml version="1.0" encoding="gbk"?>. But I'd still request them to deliv
Steven Bethard wrote:
> I'm having trouble using elementtree with an XML file that has some
> gbk-encoded text. (I can't read Chinese, so I'm taking their word for
> it that it's gbk-encoded.) I always have trouble with encodings, so I'm
> sure I'm just screwing something simple up. Can anyone
Diez B. Roggisch wrote:
>> Here's what I get with the prepending hack:
>>
>> >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' +
>> open(filename).read())
>> Traceback (most recent call last):
>> File "", line 1, in ?
>> File "C:\Program
>> Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960,
>> in X
> Here's what I get with the prepending hack:
>
> >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' +
> open(filename).read())
> Traceback (most recent call last):
> File "", line 1, in ?
> File "C:\Program
> Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960, in
> XML
> parser.feed(text)
> F
Diez B. Roggisch wrote:
> Steven Bethard schrieb:
>> I'm having trouble using elementtree with an XML file that has some
>> gbk-encoded text. (I can't read Chinese, so I'm taking their word for
>> it that it's gbk-encoded.) I always have trouble with encodings, so
>> I'm sure I'm just screwing
Steven Bethard schrieb:
> I'm having trouble using elementtree with an XML file that has some
> gbk-encoded text. (I can't read Chinese, so I'm taking their word for
> it that it's gbk-encoded.) I always have trouble with encodings, so I'm
> sure I'm just screwing something simple up. Can any