[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Kurt Seifried

Kurt Seifried kseifr...@redhat.com added the comment:

Please use CVE-2012-2135 for this issue as per 
http://www.openwall.com/lists/oss-security/2012/04/25/3

--
nosy: +kseifr...@redhat.com

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14579
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Huzaifa Sidhpurwala

Huzaifa Sidhpurwala sidhpurwala.huza...@gmail.com added the comment:

I have not tried the patch yet, but modifying the reproducer yields a different 
crash. This one seems to be a heap-based buffer overflow, which is slightly more 
serious.

In the reproducer, you just need to replace ascii() with str().

Again, this works on Python 3 only.

--
nosy: +Huzaifa.Sidhpurwala




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

I am now writing tests and I have a question. Should 
b'\xd8\x00\x41'.decode('utf-16be', 'replace') give '\ufffd' or 
'\ufffd\ufffd'?
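
One way to check the question above on a given interpreter is the short
sketch below. It only prints what the running Python does; it assumes nothing
about which answer is correct, and the variable names are illustrative.

```python
# Sketch: compare strict and 'replace' decoding of a truncated surrogate pair.
data = b'\xd8\x00\x41'  # high surrogate 0xD800 followed by one stray byte

try:
    data.decode('utf-16be')  # strict mode should reject the input
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print('strict:', exc.reason, exc.start, exc.end)

replaced = data.decode('utf-16be', 'replace')
# Print the code points so one U+FFFD can be told apart from two.
print('replace:', [hex(ord(c)) for c in replaced])
```

Printing code points rather than the string itself avoids any ambiguity about
how a terminal renders the replacement character.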




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Kurt Seifried

Changes by Kurt Seifried kseifr...@redhat.com:


--
nosy:  -kseifr...@redhat.com




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Henri Salo

Henri Salo he...@nerv.fi added the comment:

Debian bug-report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=670389
Found in versions python3-defaults/3.2.3~rc1-2, 
python3-defaults/3.1.3-12+squeeze1

--
nosy: +Henri.Salo




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +benjamin.peterson




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Henri Salo

Henri Salo he...@nerv.fi added the comment:

I tested versions 3.1.1, 3.1.2, 3.1.3, 3.1.4 and 3.1.5, and only 3.1.3 crashed 
with a segmentation fault:

Program received signal SIGSEGV, Segmentation fault.
0x004c483a in PyObject_Call (func=0x77e4d3b0, arg=0x770fd410, 
kw=0x0) at Objects/abstract.c:2156
2156        if ((call = func->ob_type->tp_call) != NULL) {

(gdb) bt
#0  0x004c483a in PyObject_Call (func=0x77e4d3b0, 
arg=0x770fd410, kw=0x0) at Objects/abstract.c:2156
#1  0x0045c437 in do_call (f=0x8929b0, throwflag=<value optimized out>) 
at Python/ceval.c:3982
#2  call_function (f=0x8929b0, throwflag=<value optimized out>) at 
Python/ceval.c:3785
#3  PyEval_EvalFrameEx (f=0x8929b0, throwflag=<value optimized out>) at 
Python/ceval.c:2548
#4  0x0045e675 in PyEval_EvalCodeEx (co=0x77159e30, globals=<value 
optimized out>, locals=<value optimized out>, args=0x0, argcount=1, kws=<value 
optimized out>, 
kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at 
Python/ceval.c:3198
#5  0x0045e77b in PyEval_EvalCode (co=0x77e4d3b0, 
globals=0x770fd410, locals=0x0) at Python/ceval.c:668
#6  0x004800b2 in run_mod (fp=<value optimized out>, filename=<value 
optimized out>, flags=0x7fffe390) at Python/pythonrun.c:1711
#7  PyRun_InteractiveOneFlags (fp=<value optimized out>, filename=<value 
optimized out>, flags=0x7fffe390) at Python/pythonrun.c:1104
#8  0x004803ce in PyRun_InteractiveLoopFlags (fp=0x775346a0, 
filename=0x5312a1 "<stdin>", flags=0x7fffe390) at Python/pythonrun.c:1006
#9  0x00480bab in PyRun_AnyFileExFlags (fp=0x775346a0, 
filename=0x5312a1 "<stdin>", closeit=0, flags=0x7fffe390) at 
Python/pythonrun.c:975
#10 0x00496422 in Py_Main (argc=<value optimized out>, argv=<value 
optimized out>) at Modules/main.c:607
#11 0x00416e6e in main (argc=<value optimized out>, argv=<value 
optimized out>) at ./Modules/python.c:152

--
versions: +Python 3.1




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-25 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

I thought it was one error, not two.

The updated patch adds tests and fixes a minor mistake. 2.7 is not
affected by the main security issue, but it contains one of the mentioned bugs
(reading 1 byte outside of the input array). The patch for 2.7 fixes this bug
and also includes tests.
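
The little-endian cases from the updated patch's test_errors can also be run
outside the test suite. The following is only a sketch: the table is copied
from the patch, while check_le_cases and the failures list are hypothetical
names introduced here. An empty list is what a fixed interpreter would be
expected to report.

```python
import codecs

# Little-endian cases copied from the updated patch's test_errors.
LE_CASES = [
    (b'\xff', '\ufffd'),
    (b'A\x00Z', 'A\ufffd'),
    (b'A\x00B\x00C\x00D\x00Z', 'ABCD\ufffd'),
    (b'\x00\xd8', '\ufffd'),
    (b'\x00\xd8A', '\ufffd'),
    (b'\x00\xd8A\x00', '\ufffdA'),
    (b'\x00\xdcA\x00', '\ufffdA'),
]


def check_le_cases(cases):
    """Return the cases whose behaviour differs from the patch's expectation."""
    failures = []
    for raw, expected in cases:
        try:
            codecs.utf_16_le_decode(raw, 'strict', True)
            failures.append((raw, 'strict did not raise'))
        except UnicodeDecodeError:
            pass  # strict mode is expected to reject every case
        actual = raw.decode('utf-16le', 'replace')
        if actual != expected:
            failures.append((raw, actual))
    return failures


failures = check_le_cases(LE_CASES)
print(failures)
```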

--
Added file: http://bugs.python.org/file25366/utf16_error_handling-3.2_4.patch
Added file: http://bugs.python.org/file25367/utf16_error_handling-2.7.patch

diff -r b07488490001 Lib/test/test_codecs.py
--- a/Lib/test/test_codecs.py   Fri Apr 20 14:36:47 2012 +0200
+++ b/Lib/test/test_codecs.py   Wed Apr 25 20:08:37 2012 +0300
@@ -540,8 +540,19 @@
         )
 
     def test_errors(self):
-        self.assertRaises(UnicodeDecodeError, codecs.utf_16_le_decode,
-                          b"\xff", "strict", True)
+        tests = [
+            (b'\xff', '\ufffd'),
+            (b'A\x00Z', 'A\ufffd'),
+            (b'A\x00B\x00C\x00D\x00Z', 'ABCD\ufffd'),
+            (b'\x00\xd8', '\ufffd'),
+            (b'\x00\xd8A', '\ufffd'),
+            (b'\x00\xd8A\x00', '\ufffdA'),
+            (b'\x00\xdcA\x00', '\ufffdA'),
+        ]
+        for raw, expected in tests:
+            self.assertRaises(UnicodeDecodeError, codecs.utf_16_le_decode,
+                              raw, 'strict', True)
+            self.assertEqual(raw.decode('utf-16le', 'replace'), expected)
 
     def test_nonbmp(self):
         self.assertEqual("\U00010203".encode(self.encoding),
@@ -568,8 +579,19 @@
         )
 
     def test_errors(self):
-        self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
-                          b"\xff", "strict", True)
+        tests = [
+            (b'\xff', '\ufffd'),
+            (b'\x00A\xff', 'A\ufffd'),
+            (b'\x00A\x00B\x00C\x00DZ', 'ABCD\ufffd'),
+            (b'\xd8\x00', '\ufffd'),
+            (b'\xd8\x00\xdc', '\ufffd'),
+            (b'\xd8\x00\x00A', '\ufffdA'),
+            (b'\xdc\x00\x00A', '\ufffdA'),
+        ]
+        for raw, expected in tests:
+            self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
+                              raw, 'strict', True)
+            self.assertEqual(raw.decode('utf-16be', 'replace'), expected)
 
     def test_nonbmp(self):
         self.assertEqual("\U00010203".encode(self.encoding),
diff -r b07488490001 Objects/unicodeobject.c
--- a/Objects/unicodeobject.c   Fri Apr 20 14:36:47 2012 +0200
+++ b/Objects/unicodeobject.c   Wed Apr 25 20:08:37 2012 +0300
@@ -3425,7 +3425,7 @@
     /* Unpack UTF-16 encoded data */
     p = unicode->str;
     q = (unsigned char *)s;
-    e = q + size - 1;
+    e = q + size;
 
     if (byteorder)
         bo = *byteorder;
@@ -3476,8 +3476,20 @@
 #endif
 
     aligned_end = (const unsigned char *) ((size_t) e & ~LONG_PTR_MASK);
-    while (q < e) {
+    while (1) {
         Py_UNICODE ch;
+        if (e - q < 2) {
+            /* remaining byte at the end? (size should be even) */
+            if (q == e || consumed)
+                break;
+            errmsg = "truncated data";
+            startinpos = ((const char *)q) - starts;
+            endinpos = ((const char *)e) - starts;
+            outpos = p - PyUnicode_AS_UNICODE(unicode);
+            goto utf16Error;
+            /* The remaining input chars are ignored if the callback
+               chooses to skip the input */
+        }
         /* First check for possible aligned read of a C 'long'. Unaligned
            reads are more expensive, better to defer to another iteration. */
         if (!((size_t) q & LONG_PTR_MASK)) {
@@ -3546,8 +3558,8 @@
             }
             p = _p;
             q = _q;
-            if (q >= e)
-                break;
+            if (e - q < 2)
+                continue;
         }
         ch = (q[ihi] << 8) | q[ilo];
 
@@ -3559,10 +3571,10 @@
         }
 
         /* UTF-16 code pair: */
-        if (q > e) {
+        if (e - q < 2) {
             errmsg = "unexpected end of data";
             startinpos = (((const char *)q) - 2) - starts;
-            endinpos = ((const char *)e) + 1 - starts;
+            endinpos = ((const char *)e) - starts;
             goto utf16Error;
         }
         if (0xD800 <= ch && ch <= 0xDBFF) {
@@ -3606,31 +3618,9 @@
                 &outpos,
                 &p))
             goto onError;
-    }
-    /* remaining byte at the end? (size should be even) */
-    if (e == q) {
-        if (!consumed) {
-            errmsg = "truncated data";
-            startinpos = ((const char *)q) - starts;
-            endinpos = ((const char *)e) + 1 - starts;
-            outpos = p - PyUnicode_AS_UNICODE(unicode);
-            if (unicode_decode_call_errorhandler(
-                    errors,
-                    &errorHandler,
-                    "utf16", errmsg,
-                    &starts,
-   

[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-24 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

Now I see the problem: make_decode_exception creates a new bytes object in any 
case, regardless of whether the error handler will update it or not. Therefore, 
decoding will continue in this new bytes object.

I think the same issue also applies to the ASCII decoder in 3.3.
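
The copy described above can be observed from Python, at least on current
CPython: the object attached to the exception is a bytes object even when the
decoder was given a bytearray. This is only a sketch; the handler name
'issue14579.record' and the helper recording_handler are hypothetical.

```python
import codecs

seen = []


def recording_handler(exc):
    """Record what the decoder hands to the handler, then skip the bad bytes."""
    seen.append((type(exc.object), bytes(exc.object)))
    return ('', exc.end)


# 'issue14579.record' is a made-up handler name used only for this sketch.
codecs.register_error('issue14579.record', recording_handler)

source = bytearray(b'A\x00Z')  # mutable input with one trailing odd byte
result = source.decode('utf-16-le', 'issue14579.record')
print(result, seen)
```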




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-24 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

> I think the same issue also applies to the ASCII decoder in 3.3.

No, the ASCII decoder is not affected by this vulnerability. The loop
in which unicode_decode_call_errorhandler is called does not use any
cached data that cannot be updated.




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-24 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
title: Possible vulnerability in the utf-16 decoder after error handling -> 
Vulnerability in the utf-16 decoder after error handling




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

[moving from Rietveld back to Roundup]

On 2012/04/20 11:15:48, storchaka wrote:
> The `aligned_end` may point outside unicode object, 
> if the unicode object was reallocated.

How so? The aligned_end *never* points into the unicode object:

q = (unsigned char *)s;
e = q + size - 1;
aligned_end = (const unsigned char *) ((size_t) e & ~LONG_PTR_MASK);

So aligned_end points into s, not into the unicode object. 
So this adjustment is necessary because the *input* may change in the callback,
not because the output may change. So the comment in decode_utf8_errors seems
just as wrong.

Why this is relevant to this issue is unclear to me, though: the ignore handler
doesn't modify the input object.
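
The aligned_end computation quoted above is plain address arithmetic and can
be sketched with integers. The values below are assumptions made for
illustration: an 8-byte C long, LONG_PTR_MASK taken as SIZEOF_LONG - 1, and a
made-up buffer address.

```python
SIZEOF_LONG = 8                  # assumption: 64-bit C long
LONG_PTR_MASK = SIZEOF_LONG - 1  # low bits that must be zero for alignment


def align_down(address):
    """Round an address down to a multiple of SIZEOF_LONG."""
    return address & ~LONG_PTR_MASK


s = 0x1000        # hypothetical start address of the input buffer
size = 13         # hypothetical input length in bytes
e = s + size - 1  # last byte, as in the quoted (pre-patch) code
aligned_end = align_down(e)

print(hex(e), hex(aligned_end))
```

Because aligned_end is derived from e, it is tied to the input buffer: if the
input is replaced, a value computed from the old buffer no longer describes
the new one.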

--
nosy: +loewis




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

> So this adjustment is necessary because the *input* may change in the 
> callback,
> not because the output may change. So the comment in decode_utf8_errors seems
> just as wrong.

You're right; my eyes were clouded. Now I see it.

What wording would you suggest for the comment? Someone should also correct
the comment in decode_utf8_errors; I just copied it.

> Why this is relevant to this issue, is unclear to me, though: the ignore 
> handler
> doesn't modify the input object.

I first got the crash using a custom handler, and then I saw that
the ignore handler is enough. Even though the ignore handler does not
change the input object, other handlers can do so, and this
reason for the crash remains.
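
A custom handler that changes the input object can be sketched as follows.
This is an illustration only, not the original reproducer: the handler name
'issue14579.mutate' and the replacement buffer are made up, and the behaviour
shown is that of a fixed interpreter.

```python
import codecs


def mutating_handler(exc):
    """Swap the decoder's input for a different buffer and restart at 0."""
    if not isinstance(exc, UnicodeDecodeError):
        raise TypeError("unexpected exception: %r" % exc)
    exc.object = b'B\x00'  # new, valid UTF-16-LE input
    return ('\ufffd', 0)   # replacement text, resume position in new input


# 'issue14579.mutate' is a made-up handler name used only for this sketch.
codecs.register_error('issue14579.mutate', mutating_handler)

result = b'A\x00Z'.decode('utf-16-le', 'issue14579.mutate')
print([hex(ord(c)) for c in result])
```

After the handler returns, the decoder has to re-read the input pointer and
every value derived from it, which is why a stale cached end pointer is
dangerous.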




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

> You're right; my eyes were clouded. Now I see it.
> 
> What wording would you suggest for the comment? Someone should also correct
> the comment in decode_utf8_errors; I just copied it.

might have changed the input object

>> Why this is relevant to this issue, is unclear to me, though: the ignore 
>> handler
>> doesn't modify the input object.
> 
> I first got the crash using a custom handler, and then I saw that
> the ignore handler is enough. Even though the ignore handler does not
> change the input object, other handlers can do so, and this
> reason for the crash remains.

I agree that the change is necessary. It just does not explain why it
fixes this issue.




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Added file: http://bugs.python.org/file25293/utf16_error_handling-3.2.patch




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here is a minimal patch that corrects all bugs for 3.2. As a side effect, 
decoding is accelerated by 4-8%.

--
Added file: http://bugs.python.org/file25294/utf16_error_handling-3.2.patch




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Removed file: http://bugs.python.org/file25293/utf16_error_handling-3.2.patch




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Added file: http://bugs.python.org/file25295/utf16_update_after_error-3.2.patch




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-20 Thread Andrew Svetlov

Changes by Andrew Svetlov andrew.svet...@gmail.com:


--
nosy: +asvetlov




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-19 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here are the bugs in the utf-16 decoder:

1. `aligned_end` is not updated after calling the error handler.

2. Possible silent read of one byte past the end of the bytes array when 
decoding a surrogate pair, e.g. b'\xD8\x00\xDC'.decode('utf-16be').

3. Error handlers receive the data without its last byte.

4. After handling a truncated data error, it is impossible to continue decoding 
(unlike all the other decoders).
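
Items 2 to 4 can be probed from Python with a logging error handler. The
sketch below is not the original reproducer; the handler name
'issue14579.capture' and the sample inputs are assumptions chosen for
illustration. On a fixed interpreter the handler should see the complete
input, including its last byte.

```python
import codecs

calls = []


def capture(exc):
    """Hypothetical handler: log what the decoder reports, then resume."""
    calls.append((exc.object, exc.start, exc.end, exc.reason))
    return ('?', exc.end)  # substitute '?' and continue after the bad span


# 'issue14579.capture' is a made-up handler name used only for this sketch.
codecs.register_error('issue14579.capture', capture)

# Item 2: a high surrogate followed by a single stray byte.
pair = b'\xd8\x00\xdc'
pair_result = pair.decode('utf-16be', 'issue14579.capture')

# Items 3 and 4: an odd trailing byte after valid data.
odd = b'\x00A\x00B\xff'
odd_result = odd.decode('utf-16be', 'issue14579.capture')

for obj, start, end, reason in calls:
    print(reason, (start, end), obj)
```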

--
title: Possible vulnerability in the utf-16 decoder after error handling -> 
Vulnerability in the utf-16 decoder after error handling




[issue14579] Vulnerability in the utf-16 decoder after error handling

2012-04-19 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

The proposed patch fixes only the first of these bugs. The patch in issue 
#14624 fixes all bugs for Python 3.3. I will make a patch for Python 3.2 soon.
