Mark Dickinson <[email protected]> added the comment:
Specifically, the behaviour comes from an early check for empty strings in the
PyUnicode_FromEncodedObject function:

    /* Convert to Unicode */
    if (len == 0) {
        Py_INCREF(unicode_empty);
        v = (PyObject *)unicode_empty;
    }
    else
        v = PyUnicode_Decode(s, len, encoding, errors);

It's not until PyUnicode_Decode that there's any attempt to make sense of
'encoding'. This doesn't seem like a serious bug to me, but I agree that it
would be cleaner to fail on unknown encodings even for an empty string.
Ori: are you interested in working on a patch?
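
For illustration only, here is a rough Python-level sketch of the two
orderings discussed above. It is not the C code and not a proposed patch;
the function names decode_unchecked and decode_checked are hypothetical,
and codecs.lookup is used here merely as a stand-in for the encoding
lookup that PyUnicode_Decode ends up doing.

```python
import codecs


def decode_unchecked(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical model of the current ordering: the empty-input
    shortcut runs before the encoding name is ever examined."""
    if len(data) == 0:
        return ""  # 'encoding' is never looked at on this path
    return data.decode(encoding, errors)


def decode_checked(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical model of the cleaner ordering: validate the
    encoding name first, then take the empty-input shortcut."""
    codecs.lookup(encoding)  # raises LookupError for an unknown encoding
    if len(data) == 0:
        return ""
    return data.decode(encoding, errors)
```

With these definitions, decode_unchecked(b"", "no-such-encoding") returns
an empty string silently, while decode_checked raises LookupError for the
same call. The cost of the second ordering is one codec lookup even for
empty input.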
----------
components: +Interpreter Core -None
nosy: +mark.dickinson
stage: -> needs patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue7961>
_______________________________________