Re: [Python-Dev] XML codec?

Martin v. Löwis Sun, 11 Nov 2007 05:40:52 -0800

> I don't know. Is an XML document ill-formed if it doesn't contain an
> XML declaration, is not in UTF-8 or UTF-16, but there's external
> encoding info?


If there is external encoding info, matching the actual encoding,
it would be well-formed. Of course, preserving that information would
be up to the application.

> This looks good. Now we would have to extend the code to detect and
> replace the encoding in the XML declaration too.

I'm still opposed to making this a codec. Right - for a pure Python
solution, the processing of the XML declaration would still need to
be implemented.
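
To illustrate what processing the XML declaration might involve, here is a
minimal sketch in pure Python. The name fix_encoding() comes from this
thread, but its signature and behaviour are assumptions made for this
example, and the regular expression is a simplification of the
declaration grammar, not a full parser.

```python
import re

# Simplified pattern for the encoding pseudo-attribute of an XML declaration.
# Group 1 is everything up to the opening quote, group 2 is the quote character.
_DECL = re.compile(
    r"""^(<\?xml(?:\s[^>]*?)?\sencoding\s*=\s*)(["'])[A-Za-z][A-Za-z0-9._-]*\2"""
)


def fix_encoding(text, encoding):
    """Replace the encoding name in the XML declaration of a decoded document.

    Hypothetical signature: `text` is a str, `encoding` is the new name.
    Documents without an encoding pseudo-attribute are returned unchanged.
    """
    def repl(match):
        quote = match.group(2)
        return match.group(1) + quote + encoding + quote

    return _DECL.sub(repl, text, count=1)
```

With this sketch, fix_encoding('<?xml version="1.0" encoding="iso-8859-1"?><f/>', "utf-8")
rewrites only the encoding name and keeps the original quoting style.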

>> I think there could be a much simpler routine to have the same
>> effect.
>> - if it's less than 4 bytes, answer "need more data".
> 
> Can there be an XML document that is less than 4 bytes? I guess not.

No, the smallest document has exactly 4 characters (e.g. "<f/>").
However, external entities may be smaller, such as "x".
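
A rough sketch of such a routine, loosely following the byte patterns in
Appendix F of the XML 1.0 specification. The name detect_encoding() comes
from this thread; the `final` flag, the use of None for "need more data",
and the returned codec names are assumptions made for this example. The
EBCDIC case is left out.

```python
import codecs
import re

# Simplified pattern for the encoding pseudo-attribute in an ASCII-compatible
# XML declaration (applied to the bytes before the closing "?>").
_DECL_ENCODING = re.compile(
    rb"""^<\?xml(?:\s[^>]*?)?\sencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)

# First four bytes of "<?" or "<" in encodings without a byte order mark.
_NO_BOM = {
    b"\x00\x00\x00<": "utf-32-be",
    b"<\x00\x00\x00": "utf-32-le",
    b"\x00<\x00?": "utf-16-be",
    b"<\x00?\x00": "utf-16-le",
}


def detect_encoding(data, final=False):
    """Guess the encoding of an XML byte string; None means "need more data".

    `final=True` (hypothetical flag) says no more input will arrive, which
    covers external entities shorter than four bytes, such as b"x".
    """
    if len(data) < 4 and not final:
        return None
    # Test the 4-byte BOMs first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    if data.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data[:4] in _NO_BOM:
        return _NO_BOM[data[:4]]
    if data.startswith(b"<?xm"):
        end = data.find(b"?>")
        if end < 0:
            # Declaration is still incomplete; wait unless input has ended.
            return "utf-8" if final else None
        match = _DECL_ENCODING.search(data[:end])
        if match:
            return match.group(1).decode("ascii").lower()
    # No BOM, no recognizable pattern, no declared encoding: default to UTF-8.
    return "utf-8"
```

Returning None instead of raising keeps the routine usable for incremental
input: the caller can simply append more bytes and call it again.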

> But anyway: would a Python implementation of these two functions
> (detect_encoding()/fix_encoding()) be accepted?

I could agree to a Python implementation of this algorithm as long
as it's not packaged as a codec.

Regards,
Martin

_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev