dieter wrote on 23.05.2018 at 08:25:
> If the encoding is not specified, "lxml" will try to determine it
> and finally defaults to "utf-8" (which seems to be the correct encoding
> for your case).

Being an XML parser, it does not try to guess the encoding. XML parsers are designed to reject non-well-formed content, and that includes anything that cannot be decoded. In short, if no encoding is specified, the parser assumes UTF-8; if there is an XML declaration that specifies an encoding, it uses that encoding. Here, the encoding is specified as UTF-8, so that's what the parser uses.

Note, however, that the library that the OP uses is not lxml but xml.etree, i.e. the ElementTree XML support in the standard library.

Stefan

-- 
https://mail.python.org/mailman/listinfo/python-list
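
A minimal sketch of the encoding rules described above, using xml.etree.ElementTree from the standard library. The sample documents and the helper name root_text are made up for illustration; they are not taken from the OP's data.

```python
import xml.etree.ElementTree as ET


def root_text(data):
    """Return the text of the root element, or None if the parser
    rejects the input as not well-formed (hypothetical helper)."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# No XML declaration: the bytes are decoded as UTF-8.
no_decl = "<root>\u00e9</root>".encode("utf-8")

# The declaration names ISO-8859-1, so the byte 0xE9 is decoded with it.
latin1 = b'<?xml version="1.0" encoding="ISO-8859-1"?><root>\xe9</root>'

# The declaration says UTF-8, but a lone 0xE9 byte is not valid UTF-8.
# The parser rejects the document instead of guessing another encoding.
bad_utf8 = b'<?xml version="1.0" encoding="UTF-8"?><root>\xe9</root>'

print(root_text(no_decl))   # é
print(root_text(latin1))    # é
print(root_text(bad_utf8))  # None
```

The third case is the relevant one here: content that cannot be decoded in the declared encoding makes the document non-well-formed, so parsing fails with ParseError rather than falling back to some other encoding.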