[issue14579] Vulnerability in the utf-16 decoder after error handling
Kurt Seifried kseifr...@redhat.com added the comment: Please use CVE-2012-2135 for this issue as per http://www.openwall.com/lists/oss-security/2012/04/25/3 -- nosy: +kseifr...@redhat.com ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14579 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue14579] Vulnerability in the utf-16 decoder after error handling
Huzaifa Sidhpurwala sidhpurwala.huza...@gmail.com added the comment: I have not tried the patch yet, but modifying the reproducer yields a different crash. This one seems to be a heap-based buffer overflow, which is slightly more serious. In the reproducer, you just need to replace ascii() with str(). Again, this works on Python 3 only. -- nosy: +Huzaifa.Sidhpurwala
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment: I am now writing tests and I have a question. Should b'\xd8\x00\x41'.decode('utf-16be', 'replace') give '\ufffd' or '\ufffd\ufffd'?
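One way to see how a given interpreter answers this question is to run the decode and count the replacement characters. The following is only an illustrative sketch: the helper name count_replacements is hypothetical and not part of any patch in this thread, and the result depends on the Python version (an unpatched 3.x build may misbehave on this input, so try it only on a fixed interpreter).

```python
def count_replacements(data, encoding='utf-16be'):
    """Decode with the 'replace' handler and count U+FFFD characters.

    Hypothetical helper for illustration only.
    """
    text = data.decode(encoding, 'replace')
    return text, text.count('\ufffd')


# High surrogate 0xD800 followed by a single stray byte.
text, n = count_replacements(b'\xd8\x00\x41')
print(ascii(text), n)
```

The count printed is the number of errors the decoder reported for the three input bytes, which is exactly what the question is about.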
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Kurt Seifried kseifr...@redhat.com: -- nosy: -kseifr...@redhat.com
[issue14579] Vulnerability in the utf-16 decoder after error handling
Henri Salo he...@nerv.fi added the comment: Debian bug-report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=670389 Found in versions python3-defaults/3.2.3~rc1-2, python3-defaults/3.1.3-12+squeeze1 -- nosy: +Henri.Salo
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +benjamin.peterson
[issue14579] Vulnerability in the utf-16 decoder after error handling
Henri Salo he...@nerv.fi added the comment:

I tested versions 3.1.1, 3.1.2, 3.1.3, 3.1.4 and 3.1.5 and only 3.1.3 crashed with Segmentation fault:

Program received signal SIGSEGV, Segmentation fault.
0x004c483a in PyObject_Call (func=0x77e4d3b0, arg=0x770fd410, kw=0x0) at Objects/abstract.c:2156
2156        if ((call = func->ob_type->tp_call) != NULL) {
(gdb) bt
#0  0x004c483a in PyObject_Call (func=0x77e4d3b0, arg=0x770fd410, kw=0x0) at Objects/abstract.c:2156
#1  0x0045c437 in do_call (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:3982
#2  call_function (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:3785
#3  PyEval_EvalFrameEx (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:2548
#4  0x0045e675 in PyEval_EvalCodeEx (co=0x77159e30, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=1, kws=<value optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3198
#5  0x0045e77b in PyEval_EvalCode (co=0x77e4d3b0, globals=0x770fd410, locals=0x0) at Python/ceval.c:668
#6  0x004800b2 in run_mod (fp=<value optimized out>, filename=<value optimized out>, flags=0x7fffe390) at Python/pythonrun.c:1711
#7  PyRun_InteractiveOneFlags (fp=<value optimized out>, filename=<value optimized out>, flags=0x7fffe390) at Python/pythonrun.c:1104
#8  0x004803ce in PyRun_InteractiveLoopFlags (fp=0x775346a0, filename=0x5312a1 "<stdin>", flags=0x7fffe390) at Python/pythonrun.c:1006
#9  0x00480bab in PyRun_AnyFileExFlags (fp=0x775346a0, filename=0x5312a1 "<stdin>", closeit=0, flags=0x7fffe390) at Python/pythonrun.c:975
#10 0x00496422 in Py_Main (argc=<value optimized out>, argv=<value optimized out>) at Modules/main.c:607
#11 0x00416e6e in main (argc=<value optimized out>, argv=<value optimized out>) at ./Modules/python.c:152

-- versions: +Python 3.1
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment:

I thought it was one error, and not two. The updated patch adds tests and fixes a minor mistake. 2.7 is not affected by the main security issue, but it contains one of the mentioned bugs (reading 1 byte outside of the input array). A patch for 2.7 fixes this bug and also includes tests.

--
Added file: http://bugs.python.org/file25366/utf16_error_handling-3.2_4.patch
Added file: http://bugs.python.org/file25367/utf16_error_handling-2.7.patch

diff -r b07488490001 Lib/test/test_codecs.py
--- a/Lib/test/test_codecs.py Fri Apr 20 14:36:47 2012 +0200
+++ b/Lib/test/test_codecs.py Wed Apr 25 20:08:37 2012 +0300
@@ -540,8 +540,19 @@
         )
 
     def test_errors(self):
-        self.assertRaises(UnicodeDecodeError, codecs.utf_16_le_decode,
-                          b"\xff", "strict", True)
+        tests = [
+            (b'\xff', '\ufffd'),
+            (b'A\x00Z', 'A\ufffd'),
+            (b'A\x00B\x00C\x00D\x00Z', 'ABCD\ufffd'),
+            (b'\x00\xd8', '\ufffd'),
+            (b'\x00\xd8A', '\ufffd'),
+            (b'\x00\xd8A\x00', '\ufffdA'),
+            (b'\x00\xdcA\x00', '\ufffdA'),
+        ]
+        for raw, expected in tests:
+            self.assertRaises(UnicodeDecodeError, codecs.utf_16_le_decode,
+                              raw, 'strict', True)
+            self.assertEqual(raw.decode('utf-16le', 'replace'), expected)
 
     def test_nonbmp(self):
         self.assertEqual("\U00010203".encode(self.encoding),
@@ -568,8 +579,19 @@
         )
 
     def test_errors(self):
-        self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
-                          b"\xff", "strict", True)
+        tests = [
+            (b'\xff', '\ufffd'),
+            (b'\x00A\xff', 'A\ufffd'),
+            (b'\x00A\x00B\x00C\x00DZ', 'ABCD\ufffd'),
+            (b'\xd8\x00', '\ufffd'),
+            (b'\xd8\x00\xdc', '\ufffd'),
+            (b'\xd8\x00\x00A', '\ufffdA'),
+            (b'\xdc\x00\x00A', '\ufffdA'),
+        ]
+        for raw, expected in tests:
+            self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
+                              raw, 'strict', True)
+            self.assertEqual(raw.decode('utf-16be', 'replace'), expected)
 
     def test_nonbmp(self):
         self.assertEqual("\U00010203".encode(self.encoding),
diff -r b07488490001 Objects/unicodeobject.c
--- a/Objects/unicodeobject.c Fri Apr 20 14:36:47 2012 +0200
+++ b/Objects/unicodeobject.c Wed Apr 25 20:08:37 2012 +0300
@@ -3425,7 +3425,7 @@
     /* Unpack UTF-16 encoded data */
     p = unicode->str;
     q = (unsigned char *)s;
-    e = q + size - 1;
+    e = q + size;
 
     if (byteorder)
         bo = *byteorder;
@@ -3476,8 +3476,20 @@
 #endif
 
     aligned_end = (const unsigned char *) ((size_t) e & ~LONG_PTR_MASK);
-    while (q < e) {
+    while (1) {
         Py_UNICODE ch;
+        if (e - q < 2) {
+            /* remaining byte at the end? (size should be even) */
+            if (q == e || consumed)
+                break;
+            errmsg = "truncated data";
+            startinpos = ((const char *)q) - starts;
+            endinpos = ((const char *)e) - starts;
+            outpos = p - PyUnicode_AS_UNICODE(unicode);
+            goto utf16Error;
+            /* The remaining input chars are ignored if the callback
+               chooses to skip the input */
+        }
         /* First check for possible aligned read of a C 'long'. Unaligned
            reads are more expensive, better to defer to another iteration. */
         if (!((size_t) q & LONG_PTR_MASK)) {
@@ -3546,8 +3558,8 @@
             }
             p = _p;
             q = _q;
-            if (q >= e)
-                break;
+            if (e - q < 2)
+                continue;
         }
         ch = (q[ihi] << 8) | q[ilo];
 
@@ -3559,10 +3571,10 @@
         }
 
         /* UTF-16 code pair: */
-        if (q > e) {
+        if (e - q < 2) {
             errmsg = "unexpected end of data";
             startinpos = (((const char *)q) - 2) - starts;
-            endinpos = ((const char *)e) + 1 - starts;
+            endinpos = ((const char *)e) - starts;
             goto utf16Error;
         }
         if (0xD800 <= ch && ch <= 0xDBFF) {
@@ -3606,31 +3618,9 @@
                 &outpos,
                 &p))
             goto onError;
-    }
-    /* remaining byte at the end? (size should be even) */
-    if (e == q) {
-        if (!consumed) {
-            errmsg = "truncated data";
-            startinpos = ((const char *)q) - starts;
-            endinpos = ((const char *)e) + 1 - starts;
-            outpos = p - PyUnicode_AS_UNICODE(unicode);
-            if (unicode_decode_call_errorhandler(
-                    errors,
-                    &errorHandler,
-                    "utf16", errmsg,
-                    &starts,
-
[issue14579] Vulnerability in the utf-16 decoder after error handling
Martin v. Löwis mar...@v.loewis.de added the comment: Now I see the problem: make_decode_exception creates a new bytes object in any case, regardless of whether the error handler will update it or not. Therefore, decoding will continue in this new bytes object. I think the same issue also applies to the ASCII decoder in 3.3.
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment:

> I think the same issue also applies to the ASCII decoder in 3.3.

No, the ASCII decoder is not affected by this vulnerability. The loop in which unicode_decode_call_errorhandler is called does not use any cached data that is not updated after the call.
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Serhiy Storchaka storch...@gmail.com: -- title: Possible vulnerability in the utf-16 decoder after error handling -> Vulnerability in the utf-16 decoder after error handling
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com: -- nosy: +Arfrever
[issue14579] Vulnerability in the utf-16 decoder after error handling
Martin v. Löwis mar...@v.loewis.de added the comment:

[moving from Rietveld back to Roundup]

On 2012/04/20 11:15:48, storchaka wrote:
> The `aligned_end` may point outside unicode object, if the unicode object was reallocated.

How so? The aligned_end *never* points into the unicode object:

    q = (unsigned char *)s;
    e = q + size - 1;
    aligned_end = (const unsigned char *) ((size_t) e & ~LONG_PTR_MASK);

So aligned_end points into s, not into the unicode object. So this adjustment is necessary because the *input* may change in the callback, not because the output may change. So the comment in decode_utf8_errors seems just as wrong.

Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn't modify the input object.

-- nosy: +loewis
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment:

> So this adjustment is necessary because the *input* may change in the callback, not because the output may change. So the comment in decode_utf8_errors seems just as wrong.

You're right, my eyes had glazed over. Now I see it. What comment would you suggest? It would be good if someone also corrected the comment in decode_utf8_errors; I just copied it.

> Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn't modify the input object.

I first got the crash using a custom handler, and then I saw that the ignore handler is enough. Even if the ignore handler does not change the input object, other handlers can, so this cause of the crash remains.
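The point that a handler may replace the input can be illustrated from Python. The sketch below is not the reproducer from this issue; the handler name example.mutating is made up for illustration, and the expected behaviour assumes an interpreter in which the decoder picks up the new input after the callback. The handler swaps the bytes being decoded for an empty object through the exception's object attribute and asks the decoder to resume at position 0.

```python
import codecs


def mutating(exc):
    """Illustrative handler that replaces the decoder's input."""
    if isinstance(exc, UnicodeDecodeError):
        exc.object = b''        # new input the decoder must switch to
        return ('\u4242', 0)    # (replacement text, position to resume at)
    raise TypeError("don't know how to handle %r" % exc)


# The name 'example.mutating' is an assumption made for this sketch.
codecs.register_error('example.mutating', mutating)

result = b'\xff'.decode('utf-16', 'example.mutating')
print(ascii(result))
```

If the decoder kept using pointers computed from the old input (such as a cached end or aligned_end), it would keep reading the original buffer instead of the empty replacement, which is why every such pointer has to be refreshed after the callback.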
[issue14579] Vulnerability in the utf-16 decoder after error handling
Martin v. Löwis mar...@v.loewis.de added the comment:

> You're right, my eyes had glazed over. Now I see it. What comment would you suggest? It would be good if someone also corrected the comment in decode_utf8_errors; I just copied it.

"might have changed the input object"

>> Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn't modify the input object.
> I first got the crash using a custom handler, and then I saw that the ignore handler is enough. Even if the ignore handler does not change the input object, other handlers can, so this cause of the crash remains.

I agree that the change is necessary. It just does not explain why it fixes this issue.
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25293/utf16_error_handling-3.2.patch
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment: Here is a minimal patch that fixes all the bugs for 3.2. As a side effect, decoding is accelerated by 4-8%. -- Added file: http://bugs.python.org/file25294/utf16_error_handling-3.2.patch
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file25293/utf16_error_handling-3.2.patch
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25295/utf16_update_after_error-3.2.patch
[issue14579] Vulnerability in the utf-16 decoder after error handling
Changes by Andrew Svetlov andrew.svet...@gmail.com: -- nosy: +asvetlov
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment:

Here are the bugs in the utf-16 decoder:

1. `aligned_end` is not updated after calling the error handler.
2. Possible silent reading of one byte past the end of the bytes array when decoding a surrogate pair: b'\xD8\x00\xDC'.decode('utf-16be')
3. Error handlers receive the data without its last byte.
4. After handling a "truncated data" error it is impossible to continue decoding (unlike all the other decoders).

-- title: Possible vulnerability in the utf-16 decoder after error handling -> Vulnerability in the utf-16 decoder after error handling
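The observable side of bugs 2 to 4 can be probed from Python on an interpreter that already contains the fix. This is a hedged sketch, not part of any patch here: the helper name probe is hypothetical, the inputs are taken from the examples and tests in this thread, and it should not be run on an unpatched 3.x build.

```python
def probe(data, encoding):
    """Return the strict-mode error and the 'replace' / 'ignore' results.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode(encoding, 'strict')
        error = None
    except UnicodeDecodeError as exc:
        error = exc
    replaced = data.decode(encoding, 'replace')
    ignored = data.decode(encoding, 'ignore')
    return error, replaced, ignored


# Surrogate pair cut off after three bytes (bug 2).
err, replaced, ignored = probe(b'\xd8\x00\xdc', 'utf-16be')
print(err.reason, err.start, err.end, ascii(replaced), ascii(ignored))

# Valid character followed by one stray byte (bugs 3 and 4).
err2, replaced2, ignored2 = probe(b'A\x00Z', 'utf-16le')
print(err2.reason, err2.start, err2.end, ascii(replaced2), ascii(ignored2))
```

On a fixed interpreter the exception's object attribute holds the complete input, including the last byte, and the 'replace' and 'ignore' handlers both return a result instead of aborting the decode.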
[issue14579] Vulnerability in the utf-16 decoder after error handling
Serhiy Storchaka storch...@gmail.com added the comment: The proposed patch fixes only the first of these bugs. The patch in issue #14624 fixes all of the bugs for Python 3.3. I will make a patch for Python 3.2 soon.