[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

Mark Dickinson Thu, 18 Feb 2010 13:53:24 -0800

Mark Dickinson <[email protected]> added the comment:

Specifically, the behaviour comes from an early check for empty strings in the 
PyUnicode_FromEncodedObject function:


    /* Convert to Unicode */
    if (len == 0) {
        Py_INCREF(unicode_empty);
        v = (PyObject *)unicode_empty;
    }
    else
        v = PyUnicode_Decode(s, len, encoding, errors);

The 'encoding' argument isn't examined until PyUnicode_Decode, so an empty
input never reaches the codec lookup and an unknown encoding goes unnoticed.
This doesn't seem like a serious bug to me, but I agree that it would be
cleaner to fail on unknown encodings even for an empty string.
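
As a rough illustration only (a Python model of the control flow quoted above,
not the C implementation and not a proposed patch), the two behaviours can be
sketched as follows. The function names decode_with_early_exit and
decode_checked are hypothetical, and using codecs.lookup as the validation
step is an assumption made for the sketch:

```python
import codecs


def decode_with_early_exit(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical model of the quoted check: empty input returns early."""
    if len(data) == 0:
        # The encoding name is never looked at on this path.
        return ""
    # Only non-empty input reaches the codec machinery.
    return codecs.decode(data, encoding, errors)


def decode_checked(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical variant that validates the encoding name first."""
    # codecs.lookup raises LookupError for an unknown encoding name.
    codecs.lookup(encoding)
    if len(data) == 0:
        return ""
    return codecs.decode(data, encoding, errors)


if __name__ == "__main__":
    # Early-exit model: the bogus encoding goes unnoticed for empty input.
    print(repr(decode_with_early_exit(b"", "no-such-encoding")))

    # Checked model: the same call fails, even though the input is empty.
    try:
        decode_checked(b"", "no-such-encoding")
    except LookupError as exc:
        print("LookupError:", exc)
```

In this sketch the only difference between the two functions is where the
encoding name is first examined; how an actual fix would be structured in
unicodeobject.c is left open here.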

Ori:  are you interested in working on a patch?

----------
components: +Interpreter Core -None
nosy: +mark.dickinson
stage:  -> needs patch

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue7961>
_______________________________________