[issue37461] email.parser.Parser hang

2019-07-17 Thread Guido Vranken


Guido Vranken  added the comment:

I used fuzzing to find this bug. After applying your patch, the infinite loop 
is gone and it cannot find any other bugs of this nature.

--

___
Python tracker 
<https://bugs.python.org/issue37461>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2019-07-08 Thread Guido Vranken


Guido Vranken  added the comment:

Hi,

I've built a generic Python fuzzer and submitted it to OSS-Fuzz.

It works by implementing a "def FuzzerRunOne(FuzzerInput):" function in Python 
in which some arbitrary code is run based on FuzzerInput, which is a bytes 
object.

This is a more versatile solution than the current re, json, csv fuzzers as it 
requires no custom C code and adding more fuzzing targets is as easy as writing 
a new harness in Python and adding a build rule.
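
For illustration, a harness in this style might look like the sketch below. Only the FuzzerRunOne(FuzzerInput) entry point comes from the description above; the choice of json as the target and the decision to swallow the listed exceptions are assumptions, not taken from the actual fuzzers.

```python
import json


def FuzzerRunOne(FuzzerInput):
    """Entry point called with a bytes object chosen by the fuzzer."""
    try:
        # Assumption: treat the input as UTF-8 text and feed it to the target.
        text = FuzzerInput.decode("utf-8")
        json.loads(text)
    except (UnicodeDecodeError, ValueError, RecursionError):
        # Expected rejections of malformed input; any other exception
        # propagates so that it can be reported as a finding.
        return
```

Calling FuzzerRunOne(b'{"a": [1, 2]}') or FuzzerRunOne(b'\xff') simply returns; a hang or an unexpected exception inside json.loads would not.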

Code coverage is measured at both the CPython level (*.c) and the Python level 
(*.py). CPython is compiled with AddressSanitizer. What this means is that both 
CPython memory bugs and Python library bugs (excessive memory consumption, 
hangs, slowdowns, unexpected exceptions) are expected to surface.

You can see my current set of fuzzers here: 
https://github.com/guidovranken/python-library-fuzzers

The PR to OSS-Fuzz is https://github.com/google/oss-fuzz/pull/2567

Currently, the only Python maintainer who will be receiving automated bug 
reports is gpshead. Are there any other developers who normally process Python 
security bug reports and would like to receive notifications?

Feel free to respond directly in the OSS-Fuzz PR thread.

--
nosy: +Guido

___
Python tracker 
<https://bugs.python.org/issue29505>
___



[issue37461] email.parser.Parser hang

2019-06-30 Thread Guido Vranken


New submission from Guido Vranken :

The following will hang, and consume a large amount of memory:

from email.parser import BytesParser, Parser
from email.policy import default
payload = "".join(chr(c) for c in [0x43, 0x6f, 0x6e, 0x74, 0x65, 0x6e, 0x74, 
0x2d, 0x54, 0x79, 0x70, 0x65, 0x3a, 0x78, 0x3b, 0x61, 0x72, 0x1b, 0x2a, 0x3d, 
0x22, 0x73, 0x4f, 0x27, 0x23, 0x61, 0xff, 0xff, 0x27, 0x5c, 0x22])
Parser(policy=default).parsestr(payload)
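
When trying a reproducer like this, it can help to run it in a separate interpreter with a time limit, so that a hang does not block the calling process. The helper below is a hypothetical sketch (finishes_within is not part of any library), and it only limits time, not memory.

```python
import subprocess
import sys


def finishes_within(code, seconds):
    """Run `code` in a fresh interpreter; return False if it exceeds the limit."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run([sys.executable, "-c", code], timeout=seconds, check=False)
    except subprocess.TimeoutExpired:
        return False
    return True
```

The reproducer source would be passed as the code string; a return value of False indicates that the interpreter did not finish in time.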

--
components: email
messages: 346953
nosy: Guido, barry, r.david.murray
priority: normal
severity: normal
status: open
title: email.parser.Parser hang
type: crash
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue37461>
___



[issue23165] Heap overwrite in Python/fileutils.c:_Py_char2wchar() on 32 bit systems due to malloc parameter overflow

2015-01-04 Thread Guido Vranken

New submission from Guido Vranken:

The vulnerability described here is exceedingly difficult to exploit, since 
there is no straightforward way for an attacker (someone who controls the 
contents of a Python script but not other values such as system environment 
variables) to control a relevant parameter to the vulnerable function 
(_Py_char2wchar in Python/fileutils.c). It is, however, important that it is 
remediated, since unawareness of this vulnerability may cause an unsuspecting 
author to establish a link between user input and the function parameter in 
future versions of Python.

As mentioned, the vulnerability is caused by code in the _Py_char2wchar 
function. This function is reached indirectly through 
Objects/unicodeobject.c:PyUnicode_DecodeLocaleAndSize(), 
PyUnicode_DecodeFSDefaultAndSize(), PyUnicode_DecodeLocale(), and some other 
functions.

As far as I know this can only be exploited on 32-bit architectures (whose 
registers overflow at 2**32). The following description is based on the latest 
Python 3.4 code retrieved from 
https://hg.python.org/cpython .

The problem lies in the computation of size of the buffer that will hold the 
wide char version of the input string:

--
Python/fileutils.c
--
 296 #ifdef HAVE_BROKEN_MBSTOWCS
 297     /* Some platforms have a broken implementation of
 298      * mbstowcs which does not count the characters that
 299      * would result from conversion.  Use an upper bound.
 300      */
 301     argsize = strlen(arg);
 302 #else
 303     argsize = mbstowcs(NULL, arg, 0);
 304 #endif
 ...
 306         res = (wchar_t *)PyMem_RawMalloc((argsize+1)*sizeof(wchar_t));

 and:

 331     argsize = strlen(arg) + 1;
 332     res = (wchar_t*)PyMem_RawMalloc(argsize*sizeof(wchar_t));

Neither invocation of PyMem_RawMalloc is preceded by code that asserts that no 
overflow will occur as a result of multiplying the length of 'arg' by 
sizeof(wchar_t), which is typically 4 bytes. It follows that on a 32-bit 
architecture, it is possible to cause an internal overflow by supplying a 
string whose size is >= ((2**32)-1) / 4, which is 1 gigabyte. Supplying a 
1 GB (minus one byte) string will therefore result in a value of 0 being 
passed to PyMem_RawMalloc, because:

argsize = 1024*1024*1024-1
malloc_argument = (argsize+1) * 4
print(malloc_argument & 0xffffffff)
# prints '0'
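
The missing guard can be modeled in Python as well. This is only a sketch of the arithmetic, not a proposed patch: the helper names are hypothetical, SIZE_MAX assumes a 32-bit size_t, and WCHAR_SIZE assumes the typical 4-byte wchar_t mentioned above.

```python
SIZE_MAX = 2**32 - 1   # assumption: size_t is 32 bits wide
WCHAR_SIZE = 4         # assumption: sizeof(wchar_t) == 4


def unchecked_alloc_size(argsize):
    """Model (argsize+1)*sizeof(wchar_t) as computed in 32-bit size_t arithmetic."""
    return ((argsize + 1) * WCHAR_SIZE) & SIZE_MAX


def checked_alloc_size(argsize):
    """Same computation, but refuse element counts whose byte size would wrap."""
    if argsize + 1 > SIZE_MAX // WCHAR_SIZE:
        raise OverflowError("allocation size would exceed SIZE_MAX")
    return (argsize + 1) * WCHAR_SIZE
```

With argsize = 1024*1024*1024-1, unchecked_alloc_size() yields 0, whereas checked_alloc_size() raises OverflowError instead of requesting a zero-sized buffer.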

Effectively this will result in an allocation of exactly 1 byte, since a 
parameter of 0 is automatically adjusted to 1 by the underlying 
_PyMem_RawMalloc():

--
Objects/obmalloc.c
--
  51 static void *
  52 _PyMem_RawMalloc(void *ctx, size_t size)
  53 {
  54     /* PyMem_Malloc(0) means malloc(1). Some systems would return NULL
  55        for malloc(0), which would be treated as an error. Some platforms would
  56        return a pointer with no memory behind it, which would break pymalloc.
  57        To solve these problems, allocate an extra byte. */
  58     if (size == 0)
  59         size = 1;
  60     return malloc(size);
  61 }


Once the memory has been allocated, mbstowcs() is invoked:

--
Python/fileutils.c
--

 306         res = (wchar_t *)PyMem_RawMalloc((argsize+1)*sizeof(wchar_t));
 307         if (!res)
 308             goto oom;
 309         count = mbstowcs(res, arg, argsize+1);

In my test setup (latest 32 bit Debian), mbstowcs returns '0', meaning no bytes 
were written to 'res'.

Then, 'res' is iterated over, and the iteration halts as soon as a null wchar 
or a wchar that is a surrogate is encountered:

--
Python/fileutils.c
--

 310         if (count != (size_t)-1) {
 311             wchar_t *tmp;
 312             /* Only use the result if it contains no
 313                surrogate characters. */
 314             for (tmp = res; *tmp != 0 &&
 315                          !Py_UNICODE_IS_SURROGATE(*tmp); tmp++)
 316                 ;
 317             if (*tmp == 0) {
 318                 if (size != NULL)
 319                     *size = count;
 320                 return res;
 321             }
 322         }
 323         PyMem_RawFree(res);


Py_UNICODE_IS_SURROGATE is defined as follows:

--
Include/unicodeobject.h
--
 183 #define Py_UNICODE_IS_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDFFF)

In the iteration over 'res', control is transferred back to the invoker of 
_Py_char2wchar() if a null wchar is encountered first. If, however, a wchar 
that satisfies the expression in Py_UNICODE_IS_SURROGATE() is encountered 
first, *tmp is not null and thus the conditional code on lines 318-320 is 
skipped.
The space that 'res' points to is uninitialized. Uninitialized, however, does 
not entail randomness in this case. If an attacker has sufficient freedom 
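
The two outcomes of this scan can be modeled with a short Python sketch. The function names are hypothetical, and a Python list stands in for the wchar_t buffer; unlike the C loop, the model stops at the end of the list instead of reading past the allocation.

```python
def is_surrogate(ch):
    """Python equivalent of the Py_UNICODE_IS_SURROGATE() range test."""
    return 0xD800 <= ch <= 0xDFFF


def scan(wchars):
    """Report which terminating condition the scan over `wchars` hits first."""
    for ch in wchars:
        if ch == 0:
            return "returned"   # null wchar first: 'res' goes back to the caller
        if is_surrogate(ch):
            return "freed"      # surrogate first: lines 318-320 are skipped
    return "out of bounds"      # the C loop would keep reading beyond the buffer
```

For example, scan([0x41, 0]) reports "returned" and scan([0x41, 0xD800, 0]) reports "freed".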
to manipulate the contents of the process memory prior to calling 
_Py_char2wchar() in order to scatter it with values that satisfy 
Py_UNICODE_IS_SURROGATE(), this could increase their odds of having 
_Py_char2wchar() encounter such a value

[issue23130] Tools/scripts/ftpmirror.py allows overwriting arbitrary files on filesystem

2014-12-29 Thread Guido Vranken

New submission from Guido Vranken:

Tools/scripts/ftpmirror.py does not guard against arbitrary path constructions, 
and, given a connection to a malicious FTP server (or a man-in-the-middle 
attack), it is possible that any file on the client's filesystem gets 
overwritten. I.e., if we suppose that ftpmirror.py is run from a base 
directory /home/xxx/yyy, file creations can occur outside this base directory, 
such as in /tmp, /etc, /var, just to give some examples.
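
One way to guard against such paths is to resolve each server-supplied name against the base directory and reject anything that lands outside it. The sketch below is an assumption about what such a check could look like; is_within_base is a hypothetical helper and is not taken from ftpmirror.py or from any patch.

```python
import os


def is_within_base(base_dir, remote_name):
    """Return True if `remote_name`, resolved under `base_dir`, stays inside it."""
    base = os.path.realpath(base_dir)
    # os.path.join discards `base` when remote_name is absolute, and realpath
    # collapses ".." components, so both escape routes show up in `target`.
    target = os.path.realpath(os.path.join(base, remote_name))
    return os.path.commonpath([base, target]) == base
```

With a base of /home/xxx/yyy, a name such as "sub/file.txt" is accepted, while "../../../etc/passwd" and "/tmp/evil" are rejected.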

I've constructed a partial proof of concept FTP server that demonstrates 
directory and file creation outside the base directory (the directory the 
client script was launched from). I understand that most of the files in 
Tools/scripts/ are legacy applications that have long been deprecated. However, 
if the maintainers think these applications should be safe nonetheless, I'll be 
happy to construct and submit a patch that will remediate this issue.

Guido Vranken
Intelworks

--
components: Demos and Tools
messages: 233189
nosy: Guido
priority: normal
severity: normal
status: open
title: Tools/scripts/ftpmirror.py allows overwriting arbitrary files on 
filesystem
type: security
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23130
___



[issue23055] PyUnicode_FromFormatV crasher

2014-12-15 Thread Guido Vranken

Guido Vranken added the comment:

Serhiy Storchaka: good call on changing my 'n += (width + precision) < 20 ? 20 
: (width + precision);' into 'if (width < precision) width = precision;'. I 
didn't realize that sprintf's space requirement entails using the larger of 
the two instead of adding the two together.
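
The max-versus-sum point can be checked quickly with Python's own % formatting, which follows the same width and precision conventions for integers. This is only an illustration of the rule, not of the C code in question, and formatted_length is a hypothetical helper.

```python
def formatted_length(value, width, precision):
    """Length of `value` formatted as %d with the given width and precision."""
    return len("%*.*d" % (width, precision, value))


# The output is as long as the larger of width and precision, never their sum
# (or as long as the digits themselves if those are longer still).
lengths = [
    formatted_length(42, 5, 10),
    formatted_length(42, 10, 5),
    formatted_length(123456, 3, 2),
]
```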

I noticed the apparently pointless width calculation in 'step 1' but decided 
not to touch it -- good that it's removed now though.

I will start doing more debugging based on this new patch now to ensure that 
the bug is gone now.

On a more design-related note, for the sake of readability and stability, I'd 
personally opt for implementing a toned-down custom sprintf-like function that 
does exactly what it needs to do and nothing more, since a function like this 
one requires perfect alignment with the underlying sprintf() in terms of 
functionality, at the possible expense of stability and integrity issues like 
we see here. For instance, width and precision are currently overflowable, 
resulting in either a minus sign appearing in the resultant format string given 
to sprintf() (width and precision are signed integers), or completely 
overflowing it (i.e. (uint64_t)18446744073709551617 == 1 ). Considering the 
latter example, how do we know sprintf uses the same logic?
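
Both effects can be reproduced with a small Python model of fixed-width integers. The helper names are hypothetical, the 32-bit width for the signed case is an assumption, and signed overflow in C is formally undefined, so the wraparound shown is only what typically happens in practice.

```python
def wrap_unsigned64(n):
    """Reduce n modulo 2**64, as a conversion to uint64_t would."""
    return n & 0xFFFFFFFFFFFFFFFF


def wrap_signed32(n):
    """Two's-complement wraparound into a 32-bit signed range (assumed width)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n & 0x80000000 else n
```

Here wrap_unsigned64(18446744073709551617) gives 1, and wrap_signed32(2147483648) gives -2147483648, which is where a minus sign in the generated format string would come from.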

Guido

--
nosy: +Guido

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23055
___



[issue23055] PyUnicode_FromFormatV crasher

2014-12-15 Thread Guido Vranken

Guido Vranken added the comment:

I'd also like to add that, although I agree with Guido van Rossum that the 
likelihood of even triggering this bug in a general programming context is low, 
there are two buffer overflows at play here (one stack-based and one 
heap-based), and given an adversary's control over the format and vargs 
parameters, I'd say there is a reasonable likelihood of exploiting it to 
execute arbitrary code, since the one controlling the parameters has some 
control as to which bytes end up where outside buffer boundaries.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23055
___



[issue22928] HTTP header injection in urrlib2/urllib/httplib/http.client

2014-11-23 Thread Guido Vranken

New submission from Guido Vranken:

Proof of concept:

# Script for Python 2
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0' + chr(0x0A) + "Location: 
header injection")]
response = opener.open("http://localhost:")

# Data sent is:

GET / HTTP/1.1
Accept-Encoding: identity
Host: localhost:
Connection: close
User-Agent: Mozilla/5.0
Location: header injection



# End of script

# Python 3
from urllib.request import urlopen, build_opener
opener = build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0' + chr(0x0A) + "Location: 
header injection")]
opener.open("http://localhost:")

# Data sent is:

GET / HTTP/1.1
Accept-Encoding: identity
Host: localhost:
Connection: close
User-Agent: Mozilla/5.0
Location: header injection



# End of script

It is the responsibility of the developer leveraging Python and its HTTP client 
libraries to ensure that their (web) application acts in accordance with 
official HTTP specifications and that no threats to security will arise from 
their code.
However, newlines inside headers are arguably a special case of breaking 
conformity with the RFCs in regard to the allowed character set. No other 
illegal character used inside an HTTP header is likely to have a compromising 
side effect on back-end clients and servers and the integrity of their 
communication, as a result of the leniency of most web servers. However, a 
newline character (0x0A) embedded in an HTTP header invariably has the semantic 
consequence of denoting the start of an additional header line. To put it 
differently, not sanitizing headers in complete accordance with the RFCs could 
be seen as a virtue in that it gives the programmer a maximum amount of 
freedom, without having to trade it for any likely or severe security 
ramifications, so that they may use illegal characters in testing environments 
and environments that are outlined by an explicitly less strict interpretation 
of the HTTP protocol. Newlines are special in that they enable anyone who is 
able to influence the header content to, in effect, perform additional 
invocations of add_header().
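
The effect of an embedded newline can be seen without any network traffic. The sketch below serializes headers naively and then splits them the way a receiver would; render_header_lines is a hypothetical helper, not a function of the HTTP libraries.

```python
def render_header_lines(headers):
    """Serialize (name, value) pairs without validation, then split into lines."""
    raw = "".join("%s: %s\r\n" % (name, value) for name, value in headers)
    # A receiver separates header fields at line endings.
    return raw.splitlines()


# One header goes in, two header lines come out.
lines = render_header_lines(
    [("User-Agent", "Mozilla/5.0\nLocation: header injection")]
)
```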

In issue 17322 ( http://bugs.python.org/issue17322 ) there is some discussion 
as to the general compliance with the RFCs by the HTTP client libraries. I'd 
like to begin by prohibiting newline characters in HTTP headers. 
Although this issue is not a hard vulnerability such as a buffer overflow, it 
does translate to a potentially equal level of severity when considered from 
the perspective of a web-enabled application, the purpose for which the HTTP 
libraries are typically used. Lack of input validation on the application 
developer's end will facilitate header injections, for example if user-supplied 
data ends up as cookie content verbatim.
Adding this proposed additional layer of validation inside Python minimizes the 
likelihood of a successful header injection while functionality is not notably 
affected.

I'm inclined to add this validation to putheader() in the 'http' module rather 
than in urllib, as this will secure all invocations to 'http' regardless of 
intermediate libraries such as urllib.

Included is a patch for the latest checkout of the default branch that will 
cause CannotSendHeader() to be raised if a newline character is detected in 
either a header name or its value. Aside from detecting \n, it also breaks on 
\r as their respective implications can be similar. Feel free to adjust, 
rewrite and transpose this to other branches where you feel this is appropriate.
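
The kind of check described above could look roughly like the following. This is a stand-alone sketch and not the attached patch: check_header is a hypothetical function, and only the choice of exception (CannotSendHeader from http.client) and the rejection of both \n and \r follow the description.

```python
from http.client import CannotSendHeader


def check_header(name, value):
    """Raise CannotSendHeader if the header name or value contains CR or LF."""
    for part in (name, value):
        if "\n" in part or "\r" in part:
            raise CannotSendHeader("newline in header: %r" % (part,))
```

A call such as check_header("User-Agent", "Mozilla/5.0") passes silently, while a value containing chr(0x0A) is rejected before anything is sent.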


Guido Vranken
Intelworks

--
components: Library (Lib)
files: disable_http_header_injection.patch
keywords: patch
messages: 231590
nosy: Guido
priority: normal
severity: normal
status: open
title: HTTP header injection in urrlib2/urllib/httplib/http.client
type: security
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file37264/disable_http_header_injection.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22928
___